Natural language access to relational databases: an ontology concept mapping (OCM) approach

Githiari, Lawrence M

dc.contributor.author	Githiari, Lawrence M
dc.date.accessioned	2014-09-03T06:06:49Z
dc.date.available	2014-09-03T06:06:49Z
dc.date.issued	2014
dc.identifier.citation	School of Computing and Informatics,	en_US
dc.identifier.uri	http://hdl.handle.net/11295/74001
dc.description.abstract	Many applications that are accessed by non-technical or casual users, who prefer to use natural language, rely on relational databases. Examples include government data repositories, such as government tender information portals, and application-specific databases, such as agricultural support systems. Natural language (NL) processing for database access remains an unresolved issue and is the main problem addressed in this work. The specific challenges include the lack of a language- and domain-independent methodology for understanding unrestrained NL text that accesses monolingual or cross-lingual databases, as well as the extraction of concepts from database schemas. It is demonstrated that an ontology-based approach is technically feasible for handling some of the challenges facing NL query processing for database access. The Ontology Concept Modelling (OCM) approach relies on the ability to convert databases to ontologies, from which the underlying concepts are obtained. The database concepts are matched against the concepts obtained from natural language queries using a semantically-augmented Levenshtein distance algorithm. This thesis presents the architecture and the associated algorithms for an OCM-based model for NL access to databases. To evaluate and benchmark the OCM model, data was generated from a prototype based on the developed model. Quantitative parameters (accuracy, precision, recall and F-score) and qualitative measures (domain independence, language independence, support for cross-lingual querying and the effect of query complexity on the model) were evaluated across five data sets. Studies were conducted for English, Kiswahili and the English-Kiswahili language pair in a cross-lingual manner, from which the attainment of language and domain independence for database access is demonstrated.
For this language pair, it is also shown empirically that it is adequate to incorporate a bilingual dictionary at the gazetteer level for cross-lingual data retrieval. To evaluate the performance of the developed OCM model, test beds covering monolingual, cross-lingual and cross-domain performance measurement were designed to test various aspects of the model. Tests were then conducted, and the results indicated that OCM has a marginally better precision, at 0.861, than the other models selected for benchmarking. Further, OCM has an average F-score of 0.78, which compares well with the other benchmarking models. The main contributions of this work, especially the OCM architecture, processing algorithms such as OWoRA (Ontology Words Recovery Algorithm), frameworks such as QuSeT (Query Semantics Transfer framework) and the evaluation models, are significant to the research and developer communities, as they provide novel approaches to NL database access and model evaluation techniques. Keywords: Natural Language Query, Database Access, Ontology Concept Modeling	en_US
dc.language.iso	en	en_US
dc.publisher	University of Nairobi	en_US
dc.title	Natural language access to relational databases: an ontology concept mapping (OCM) approach	en_US
dc.type	Thesis	en_US
dc.type.material	en_US	en_US

Files in this item

Name: Githiari_Natural Language Access ...
Size: 7.359Mb
Format: PDF
Description: Full-text

This item appears in the following Collection(s)

Faculty of Science & Technology (FST) [4089]