dc.contributor.author: Pauw, Guy De
dc.contributor.author: de Schryver, Gilles-Maurice
dc.contributor.author: Loo, Janneke van de
dc.date.accessioned: 2013-06-21T13:47:48Z
dc.date.available: 2013-06-21T13:47:48Z
dc.date.issued: 2012
dc.identifier.citation: Workshop on Language Technology for Normalisation of Less-Resourced Languages (SALTMIL8/AfLaT2012) [en]
dc.identifier.uri: http://tshwanedje.com/publications/BantuPOS.pdf
dc.identifier.uri: http://hdl.handle.net/11295/37634
dc.description.abstract: Recent scientific publications on data-driven part-of-speech tagging of Sub-Saharan African languages have reported encouraging accuracy scores, using off-the-shelf tools and often fairly limited amounts of training data. Unfortunately, no research efforts exist that explore which type of linguistic features contribute to accurate part-of-speech tagging for the languages under investigation. This paper describes feature selection experiments with a memory-based tagger, as well as a resource-light alternative approach. Experimental results show that contextual information is often not strictly necessary to achieve good accuracy for tagging Bantu languages and that decent results can be achieved using a very straightforward unigram approach, based on orthographic features. [en]
dc.language.iso: en [en]
dc.title: Resource-Light Bantu Part-of-Speech Tagging [en]
dc.type: Presentation [en]
local.publisher: CLiPS - Computational Linguistics Group, University of Antwerp, Belgium [en]
local.publisher: School of Computing and Informatics, University of Nairobi [en]