Developing an Open-source Spell Checker for Dholuo using Hunspell language tool
Abstract
Computational morphology is an important first step in most natural language processing
tasks such as spellchecking and machine translation.
Computational morphology involves the study of the internal structure of words, i.e. how verbs
and nouns are generated from root words, called stems. In this work, we carry out a
computational analysis of Dholuo morphology. Major morphological components of the
prevalent Dholuo words are explored. These include tense, verb derivation and noun derivation.
In this work, we use the Hunspell tool to develop an open-source spellchecker for Dholuo.
Two files are created for the Hunspell tool, namely the affix file and the dictionary file. The
affix file holds all the rules involved in deriving the nouns and the verbs
from the root words. All the root words (stems), together with the rules appended to them, are
stored in the dictionary file.
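The split between affix rules and a stem dictionary can be illustrated with a minimal Python sketch. It is not the thesis implementation and it does not use Hunspell itself. It only mimics the general idea that a dictionary entry of the form stem/FLAGS opts into prefix (PFX) and suffix (SFX) rules. The stem "stema", the affixes "pre" and "o", and the flags A and B are hypothetical placeholders, not Dholuo data. Conditions are simplified to literal strings, whereas real Hunspell conditions are pattern-like, and combined prefix-plus-suffix forms are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AffixRule:
    flag: str       # flag character a dictionary entry uses to opt into the rule
    kind: str       # "PFX" for a prefix rule, "SFX" for a suffix rule
    strip: str      # characters removed from the stem ("" removes nothing)
    add: str        # characters attached in place of the stripped ones
    condition: str  # literal stem ending (SFX) or beginning (PFX); "" matches any stem


def parse_dic_entry(line):
    """Split a 'stem/FLAGS' dictionary entry into the stem and its set of flags."""
    stem, _, flags = line.strip().partition("/")
    return stem, set(flags)


def expand(entry, rules):
    """Return the stem plus every form produced by one rule the entry opts into."""
    stem, flags = parse_dic_entry(entry)
    forms = {stem}
    for rule in rules:
        if rule.flag not in flags:
            continue
        if rule.kind == "SFX":
            # The stem must satisfy the condition and end with the stripped text.
            if stem.endswith(rule.condition) and stem.endswith(rule.strip):
                base = stem[: len(stem) - len(rule.strip)]
                forms.add(base + rule.add)
        elif rule.kind == "PFX":
            # The stem must satisfy the condition and start with the stripped text.
            if stem.startswith(rule.condition) and stem.startswith(rule.strip):
                base = stem[len(rule.strip):]
                forms.add(rule.add + base)
    return forms


# Hypothetical placeholder rules and entry (not taken from the thesis).
RULES = [
    AffixRule(flag="A", kind="PFX", strip="", add="pre", condition=""),
    AffixRule(flag="B", kind="SFX", strip="a", add="o", condition="a"),
]

if __name__ == "__main__":
    for form in sorted(expand("stema/AB", RULES)):
        print(form)
```

Storing only stems and flags keeps the dictionary small, because each derived form is generated from a rule rather than listed explicitly.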
The developed spellchecker is the first spellchecker for Dholuo.
The system developed achieves an acceptable representation of Dholuo morphology. It can
correctly classify Dholuo words with an accuracy rate of 0.814, precision rate of 0.917, recall
rate of 0.800 and coverage rate of 0.820.
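For readers unfamiliar with these measures, the sketch below shows how accuracy, precision and recall are conventionally computed from confusion-matrix counts. It assumes the standard definitions, since the abstract does not state which formulas were used. Coverage is omitted because the abstract does not define it. The counts in the example are illustrative only and are not the thesis data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp: valid words accepted       fp: invalid words accepted
    tn: invalid words rejected     fn: valid words rejected
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one count must be non-zero")
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is absent from the test set.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


if __name__ == "__main__":
    # Illustrative counts only (an assumption for the example, not thesis results).
    metrics = classification_metrics(tp=8, fp=2, tn=6, fn=4)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```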
Citation
M.Sc (Information Systems)
Sponsorship
University of Nairobi
Publisher
School of Computing and Informatics, University of Nairobi
Description
Master of Science Thesis