dc.contributor.author: Nganga, W.
dc.date.accessioned: 2013-06-17T15:29:31Z
dc.date.available: 2013-06-17T15:29:31Z
dc.date.issued: 2005
dc.identifier.citation: Nganga, W. 2005. Word Sense Disambiguation of Swahili: Extending Swahili Language Technology with Machine Learning, 2005. Helsinki University Press (en)
dc.identifier.uri: http://erepository.uonbi.ac.ke:8080/xmlui/handle/123456789/35166
dc.description.abstract: This thesis addresses the problem of word sense disambiguation within the context of Swahili-English machine translation. In this setup, the goal of disambiguation is to choose the correct translation of an ambiguous Swahili noun in context. A corpus-based approach to disambiguation is taken, where machine learning techniques are applied to a corpus of Swahili to acquire disambiguation information automatically. In particular, the Self-Organizing Map algorithm is used to obtain a semantic categorization of Swahili nouns from data. The resulting classes form the basis of a class-based solution, where disambiguation is recast as a classification problem. The thesis exploits these semantic classes to automatically obtain annotated training data, addressing a key problem facing supervised word sense disambiguation. The semantic and linguistic characteristics of these classes are modelled as Bayesian belief networks, using the Bayesian Modelling Toolbox. Disambiguation is achieved via probabilistic inference. The thesis develops a disambiguation solution which does not have extensive resource requirements, but rather capitalizes on freely available lexical and computational resources for English as a source of additional disambiguation information. A semantic tagger for Swahili is created by altering the configuration of the Bayesian classifiers. The disambiguation solution is tested on a subset of unambiguous nouns and a manually created gold standard of sixteen ambiguous nouns, using standard performance evaluation metrics. (en)
dc.language.iso: en (en)
dc.publisher: University of Nairobi (en)
dc.title: Word Sense Disambiguation of Swahili: Extending Swahili Language Technology with Machine Learning (en)
dc.type: Article (en)
local.publisher: college of biological and physical science (en)