A statistical linguistics approach to a Swahili literacy syllabus design
Kigamwa, James C
The study developed ten literacy lessons in Swahili based on a statistical linguistic analysis of lexicographic data and running texts of varied genre. It considered 25,271 words of running text drawn from media sources, narratives, technical writings, and religious texts, and 9,314 words derived from lemma entries in the Standard Swahili-English Dictionary. The two data sets were analysed separately. The first analysis ranked the words in the varied-genre texts by their frequency of occurrence to generate a Word Frequency List. The second analysis, which focused on the lexicographic data, generated a Phoneme Frequency Rank List. The two lists were used to identify the frequently occurring words and highly ranked phonemes on which the initial lessons of a literacy curriculum were built. The study was confined to the top 22 words in the Word Frequency List, which were found to account for 25% of all the word tokens in the running texts. Ten lessons were developed to teach these words. The Phoneme Frequency Rank List generated from the lexicographic data served as the basis for determining the order in which phonemes are introduced in the lessons. The aim of the study was to produce a literacy curriculum grounded in the statistical facts of Swahili texts. The study applied the concepts of word tokens and word types to the analysis of the running texts and found 6,619 distinct word types among the 25,271 word tokens. The study shows that, by a careful choice of the words to be taught, those developing basic literacy material can know what percentage of existing texts their learners are able to read as they proceed through the lessons.
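The token, type, and coverage figures described above can be illustrated with a minimal Python sketch. It is not the study's own procedure: the tokenization rule, the function names, and the toy sample string are all assumptions made for illustration, and the sample is not taken from the study's corpus.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: a word token is a lowercase run of letters or apostrophes.
    # The study's actual tokenization rules are not stated in the abstract.
    return re.findall(r"[a-z']+", text.lower())


def word_frequency_list(tokens):
    # Rank word types by descending token count; ties are broken alphabetically.
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def coverage(ranked, top_k):
    # Share of all word tokens accounted for by the top_k ranked word types.
    total = sum(count for _, count in ranked)
    covered = sum(count for _, count in ranked[:top_k])
    return covered / total if total else 0.0


# Toy sample for illustration only (hypothetical, not study data).
sample = "na ya wa na kwa ya na"
tokens = tokenize(sample)
ranked = word_frequency_list(tokens)

print("word tokens:", len(tokens))
print("word types:", len(ranked))
print("coverage of top 2 types:", round(coverage(ranked, 2), 3))
```

Applied to a full corpus, the same `coverage` calculation with `top_k=22` would give the kind of figure the abstract reports for its top 22 words.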