SOURCE OF UNALIGNED ChIP-Seq READS
Date
2009
Author
Ouma, Wilberforce Zachary
Type
Thesis
Language
en
Abstract
Chromatin Immunoprecipitation with sequencing (ChIP-Seq) is an indispensable
tool in understanding the dynamics and evolution of regulatory
circuitry of prokaryotes and eukaryotes by mapping genomewide
transcription factor binding sites (TFBSs). Aligning short sequence
reads to the reference genome is the first step in the ChIP-Seq
data analysis pipeline. Significantly low alignment proportions would
therefore have a negative impact on the identification of TFBSs and
thereby undermine the process of deciphering true gene regulatory
networks (GRNs).
The source of unaligned reads in ChIP-Seq studies has never been explored.
This study employed a computational approach to determine the
source of unaligned reads from major model organisms: Arabidopsis
thaliana, Homo sapiens, Drosophila melanogaster, Caenorhabditis
elegans, and Zea mays. The analysis of raw sequence reads obtained
from the National Center for Biotechnology Information (NCBI) short
read archive (SRA) revealed a significant level of contamination in
ChIP-Seq unaligned reads with sequences of bacterial and metazoan
origin, irrespective of the source of chromatin used for the ChIP-Seq
studies. In agreement with other sequencing studies, results reported
herein indicate that human sequences are the main source of contamination.
Unexpected, however, was the observation that selected
unaligned reads data sets contained significant numbers of legitimate
reads that have mappable properties, but were missed out in the alignment
process. This highlights a need to improve the currently utilized
Citation
Master of Science in Bioinformatics
Publisher
University of Nairobi Center for Biotechnology and Bioinformatics, University of Nairobi