dc.description.abstract | All documents are made up of a set of components. Consider a document written
in a natural language such as English. The document consists of sentences, which are
delimited by full stops: each time we come across a full stop when reading the document,
we know we have reached the end of a sentence. A number of sentences are grouped to
form a paragraph, which is identified by either a blank line or
indentation. Full stops, indentation and many other such indicators mark the
beginning and the end of a document's components.
This concept of document structuring has been extended by mark-up languages, which
use tags to mark specified components of a document. Authors are free to choose
descriptive, user-friendly tag names. For example, all headings can be marked with the tag
<heading> and all paragraphs with the tag <paragraph>. Standard
Generalized Markup Language (SGML) is an ISO standard [1] for the description of
marked-up electronic text. Once a document is tagged, well-defined processes can be
applied to it, such as extracting all components tagged with the <paragraph> tag, or
gathering all components tagged with <heading> and listing them on the first page of
the document to form a table of contents. Document Style Semantics and Specification
Language (DSSSL) is an ISO Standard [2] for associating processing with SGML
documents. It is a functional programming language with syntax very similar to that of
Lisp or Scheme programming languages.
In this project, a typical set of computer science class notes is marked up with SGML and
then processed using DSSSL. The main processes applied are: transformation
of the class notes SGML into an HTML document, querying of the document for
specified components, and gathering of some components of the document (specifically,
gathering the heading and term components into the table of contents section and
appendix section respectively).
The resulting system can be used by Institute of Computer Science lecturers in
compiling handouts. It would make it easier for them to circulate specific
components of the notes to students via the Internet and to edit notes compiled in SGML;
it would give them the flexibility to choose the form of presentation (HTML documents,
printed documents or plain text); and it would produce notes that are independent of any
proprietary editor and can be opened on any machine configuration. | en |