SOURCE OF UNALIGNED ChIP-Seq READS

Ouma, Wilberforce Zachary

Date

2009

Author

Ouma, Wilberforce Zachary

Type

Thesis

Abstract

Chromatin immunoprecipitation with sequencing (ChIP-Seq) is an indispensable tool for understanding the dynamics and evolution of the regulatory circuitry of prokaryotes and eukaryotes by mapping genome-wide transcription factor binding sites (TFBSs). Aligning short sequence reads to the reference genome is the first step in the ChIP-Seq data analysis pipeline. Significantly low alignment proportions would therefore have a negative impact on the identification of TFBSs and thereby undermine the process of deciphering true gene regulatory networks (GRNs). The source of unaligned reads in ChIP-Seq studies has never been explored. This study employed a computational approach to determine the source of unaligned reads from major model organisms: Arabidopsis thaliana, Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, and Zea mays. The analysis of raw sequence reads obtained from the National Center for Biotechnology Information (NCBI) Short Read Archive (SRA) revealed a significant level of contamination in ChIP-Seq unaligned reads with sequences of bacterial and metazoan origin, irrespective of the source of chromatin used for the ChIP-Seq studies. In agreement with other sequencing studies, the results reported herein indicate that human sequences are the main source of contamination. Unexpected, however, was the observation that selected unaligned-read data sets contained significant numbers of legitimate reads that have mappable properties but were missed in the alignment process. This highlights a need to improve the currently utilized

URI

http://erepository.uonbi.ac.ke:8080/xmlui/handle/123456789/43320

Citation

Master of Science in Bioinformatics

Publisher

University of Nairobi

Center for Biotechnology and Bioinformatics, University of Nairobi
