Automatic characterization of named entity relational facts in unstructured incident reports
Abstract
Natural language offers many ways of expressing facts, which may be explicit or implicit. An
explicit fact can take the form of an entity relation expressed in a single sentence. Many
organizations own document corpora in the form of unstructured incident reports, which contain
explicit facts. A key challenge these organizations face is determining how two named entities
in an unstructured incident report corpus are related to each other, which is essentially a
reading problem.
In this research we framed the problem as a composition of two sub-problems: relation
extraction and relation representation. We used Open Information Extraction tools and
techniques to extract entity relational facts, a dictionary of named entities and a greedy
algorithm to tag and characterize the extracted facts, and graph algorithms to search through
the extracted facts to determine the interrelationship between two named entities in a test
corpus of ten documents covering politics, accidents and poaching.
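The tagging and search steps described above can be illustrated with a minimal sketch. It is not the prototype's implementation: it assumes the facts have already been extracted as (subject, relation, object) triples, substitutes a simple longest-match lookup for the greedy tagging algorithm, and uses breadth-first search as one possible graph algorithm. The dictionary entries, the example facts, and the names tag_entities, build_graph and find_relation_path are all hypothetical.

```python
from collections import deque

# Hypothetical named entity dictionary: lowercased token tuple -> entity type.
ENTITY_DICTIONARY = {
    ("kenya", "wildlife", "service"): "ORGANIZATION",
    ("tsavo",): "LOCATION",
    ("john", "doe"): "PERSON",
}
MAX_ENTITY_LENGTH = max(len(key) for key in ENTITY_DICTIONARY)


def tag_entities(text):
    """Greedy longest-match tagging: at each position, try the longest entry first."""
    tokens = text.lower().split()
    found, i = [], 0
    while i < len(tokens):
        for length in range(min(MAX_ENTITY_LENGTH, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + length])
            if candidate in ENTITY_DICTIONARY:
                found.append((" ".join(candidate), ENTITY_DICTIONARY[candidate]))
                i += length  # consume the matched tokens
                break
        else:
            i += 1  # no dictionary entry starts at this token
    return found


def build_graph(triples):
    """Store each fact as an undirected, relation-labelled edge between two entities."""
    graph = {}
    for subject, relation, obj in triples:
        graph.setdefault(subject, []).append((relation, obj))
        graph.setdefault(obj, []).append((relation, subject))
    return graph


def find_relation_path(graph, start, goal):
    """Breadth-first search for the shortest chain of facts linking two entities.

    Returns a list of (entity, relation, entity) steps in traversal order,
    an empty list if start equals goal, or None if no chain exists.
    """
    if start == goal:
        return []
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for relation, neighbour in graph.get(node, []):
            if neighbour in previous:
                continue
            previous[neighbour] = (node, relation)
            if neighbour == goal:
                path, current = [], goal
                while previous[current] is not None:
                    parent, rel = previous[current]
                    path.append((parent, rel, current))
                    current = parent
                return path[::-1]
            queue.append(neighbour)
    return None


# Hypothetical facts, as an Open Information Extraction step might emit them.
triples = [
    ("john doe", "was arrested by", "kenya wildlife service"),
    ("kenya wildlife service", "patrols", "tsavo"),
]
graph = build_graph(triples)
entities = tag_entities("John Doe was arrested by Kenya Wildlife Service in Tsavo")
path = find_relation_path(graph, "john doe", "tsavo")
```

Because the edges are stored undirected, a step in the returned path may run against the direction of the original fact; a fuller version would record the direction alongside the relation label.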
The resulting model harmonizes relation extraction and representation, and it addressed the
key challenge of determining how two named entities are interrelated in an unstructured
incident report corpus.
Experiments with a prototype application based on this model showed that the following are key
issues to address when building an Entity Relation Characterizer: the quality of the text
corpus, the choice of the underlying POS tagger and English dictionary, the character and size
of the named entity dictionary, and a mechanism for document-level named entity resolution.
The model is a useful tool that can guide the development of systems that collate information
containing named entity relational facts from different sources, addressing the issue of
information incoherence within organizations.
Citation
Masters of science in computer science
Publisher
University of Nairobi School of Computing and Informatics