
dc.contributor.author: De Pauw, G
dc.contributor.author: de Schryver, Gilles-Maurice
dc.contributor.author: Wagacha, PW
dc.date.accessioned: 2013-06-21T09:20:25Z
dc.date.available: 2013-06-21T09:20:25Z
dc.date.issued: 2006
dc.identifier.citation: Lecture Notes in Computer Science, Volume 4188, 2006, pp. 197-204 [en]
dc.identifier.uri: http://link.springer.com/chapter/10.1007/11846406_25
dc.identifier.uri: http://erepository.uonbi.ac.ke:8080/xmlui/handle/123456789/37322
dc.description.abstract: In this paper we present experiments with data-driven part-of-speech taggers trained and evaluated on the annotated Helsinki Corpus of Swahili. Using four of the current state-of-the-art data-driven taggers, TnT, MBT, SVMTool and MXPOST, we observe that the last of these, MXPOST, is the most accurate tagger for the Kiswahili dataset. We further improve on the performance of the individual taggers by combining them into a committee of taggers. We observe that the more naive combination methods, like the novel plural voting approach, outperform more elaborate schemes like cascaded classifiers and weighted voting. This paper is the first publication to present experiments on data-driven part-of-speech tagging for Kiswahili and Bantu languages in general. [en]
dc.language.iso: en [en]
dc.title: Data-Driven Part-of-Speech Tagging of Kiswahili [en]
dc.type: Article [en]
local.publisher: School of Computing and Informatics, University of Nairobi [en]

