
dc.contributor.author: Leroy, Mwanzia
dc.date.accessioned: 2023-11-20T06:46:04Z
dc.date.available: 2023-11-20T06:46:04Z
dc.date.issued: 2023
dc.identifier.uri: http://erepository.uonbi.ac.ke/handle/11295/164058
dc.description.abstract: Named Entity Recognition (NER) is important in fields where researchers have to review large amounts of scientific text, such as plant pathology. However, NER is especially difficult in low-resource domains, that is, domains with little annotated textual data. Roots, Tubers and Bananas (RT&B) crop disease monitoring is one such domain. This paper investigates the promise of transfer learning to enhance the effectiveness of NER in the identification of RT&B crop disease entities. A growing number of Pretrained Large Language Models (PLLMs) have demonstrated improved performance in Natural Language Processing (NLP) tasks. This study uses transfer learning to train new models for RT&B crop disease NER. It proposes a method for transferring knowledge from large language models in resource-rich domains to smaller, low-resource domains. By creating scientific workflows to quickly train the growing number of PLLMs and evaluate them using key metrics, including non-O accuracy and the F1 score, this research demonstrates the effectiveness of transfer learning in creating effective models for RT&B crop diseases. The final model, based on SciDeBERTa, outperforms the baseline model on all metrics, especially on non-O accuracy. The results underscore the huge potential of this approach in the surveillance of crop diseases. This research makes a contribution towards more effective Named Entity Recognition in low-resource domains. It explores current advancements in NER and the use of transfer learning in these domains. The author acknowledges the limitations of the study, such as the lack of extensive hyperparameter tuning and the unknown generalisability of the models. Finally, the study proposes the following as further research opportunities: continuous benchmarking of new PLLMs, comprehensive hyperparameter tuning, and exploration of data augmentation techniques to improve data availability and the impact of this innovative approach. [en_US]
dc.language.iso: en [en_US]
dc.publisher: University of Nairobi [en_US]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/ [*]
dc.title: Enhancing Named Entity Recognition in Low Resource Domains Using Deep Transfer Learning: a Case of RT&B Crop Diseases in Scientific and Online Text [en_US]
dc.type: Thesis [en_US]
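
The abstract in this record names non-O accuracy and the F1 score as key evaluation metrics but does not define them. The following is a minimal sketch of one common reading, not the thesis's own implementation: non-O accuracy is taken to be accuracy over tokens whose gold tag is not "O", and F1 is computed at token level over non-"O" tags. The function names and the tag labels (`B-DISEASE`, `I-DISEASE`, `B-CROP`) are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def non_o_accuracy(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Accuracy restricted to tokens whose gold tag is not "O" (assumed definition)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = [(g, p) for g, p in zip(gold, pred) if g != "O"]
    if not pairs:
        return 0.0  # no entity tokens in the gold sequence
    return sum(g == p for g, p in pairs) / len(pairs)


def token_f1(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Token-level micro F1 over non-"O" tags (assumed; entity-level F1 is also common)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    true_pos = sum(g == p and g != "O" for g, p in zip(gold, pred))
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(p != "O" for p in pred)
    recall = true_pos / sum(g != "O" for g in gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical five-token example: one entity token is missed by the model.
gold = ["O", "B-DISEASE", "I-DISEASE", "O", "B-CROP"]
pred = ["O", "B-DISEASE", "O", "O", "B-CROP"]

print(f"non-O accuracy: {non_o_accuracy(gold, pred):.3f}")
print(f"token-level F1: {token_f1(gold, pred):.3f}")
```

In a corpus where most tokens are tagged "O", plain accuracy can look high even when few entities are found, which is one plausible reason for reporting non-O accuracy alongside F1.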


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States