A Statistical Linguistics Approach to a Swahili Literacy Syllabus Design
Abstract
The study developed ten literacy lessons in Swahili that are based on a
statistical linguistic analysis of lexicographic data and running texts of varied
genre. The study considered 25,271 words of varied genre texts from media
sources, narratives, technical writings, and religious texts, and 9,314 words
derived from lemma entries from the Standard Swahili-English Dictionary. The two
data sets were analysed separately. The first analysis ranked words from the varied
genre text on the basis of their frequency of occurrence in the running texts to
generate a Word Frequency List. The second analysis, which focused on the
lexicographic data, generated a Phoneme Frequency Rank List. The two
frequency lists were used to identify frequently occurring words and highly
ranked phonemes, which were used in developing the initial lessons of a literacy
curriculum. The study was confined to the top 22 words in the Word Frequency
List, which were found to account for 25% of all the word tokens in the running
texts. Ten lessons were developed to teach these words. The
Phoneme Frequency Rank List generated from the lexicographic data served as the
basis for determining the order in which phonemes were introduced in the lessons.
The study aimed to produce a literacy learning curriculum that is
based on the statistical properties of Swahili texts. The study utilized the
concepts of word tokens and word types in the analysis of the running texts and
found that there were 6,619 distinct word types in the 25,271 word tokens. The
study shows that, by a careful choice of words to be taught in a learning
curriculum, those developing basic literacy material can be sure of what
percentage of existing texts their learners are enabled to read as they proceed
through their lessons.
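
The token, type, and coverage figures described above can be illustrated with a
minimal sketch. It is not the study's own procedure: the tokenisation rule, the
function names word_frequency_list and coverage, and the toy sample string are
all assumptions made for illustration only.

```python
import re
from collections import Counter


def word_frequency_list(text):
    """Rank word types by how often they occur as tokens in the text.

    Assumed tokenisation: lower-cased runs of letters, with an internal
    apostrophe allowed so that a word such as "ng'ombe" stays whole.
    """
    tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    return Counter(tokens).most_common()


def coverage(freq_list, top_n):
    """Share of all word tokens accounted for by the top_n ranked types."""
    total = sum(count for _, count in freq_list)
    covered = sum(count for _, count in freq_list[:top_n])
    return covered / total if total else 0.0


# Toy sample, not data from the study.
sample = "mtoto na mama wa mtoto na baba wa mtoto"
freq = word_frequency_list(sample)

print("word tokens:", sum(count for _, count in freq))
print("word types:", len(freq))
print("coverage of top 3 types:", round(coverage(freq, 3), 3))
```

Applied to a full corpus, coverage(freq, 22) would give the proportion of
running text covered by the 22 highest-ranked words, which the study reports
as 25% for its own data.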
Publisher
University of Nairobi, Department of Linguistics and African Languages
Description
Master of Arts in Linguistics