Author: | Chen, Bo |
Title: | A self-adaptive spectral rotation approach to detection of DNA sequence periodicities and their relationship with molecular mechanisms |
Degree: | Ph.D. |
Year: | 2011 |
Subject: | Nucleotide sequence. DNA -- Analysis. Hong Kong Polytechnic University -- Dissertations |
Department: | Department of Industrial and Systems Engineering |
Pages: | xix, 189 leaves : ill. (some col.) ; 30 cm. |
Language: | English |
Abstract: | Computational investigations into the relationship and interaction between DNA sequences and cell components help biologists and medical scientists to address many important issues, such as the diagnosis of gene-related diseases, medicine development, and protein design. This study introduces a new approach, Self-Adaptive Spectral Rotation (SASR), to investigate the relationship between periodicities in DNA sequences and various molecular mechanisms in cells, including genetic coding and nucleosome formation. This newly developed approach could be very useful in areas of bioinformatics such as protein-coding region prediction and nucleosome positioning prediction. Protein-coding region prediction, especially by computational methods that locate protein-coding regions in uncharacterized DNA sequences, is an important problem in computational molecular biology. In this study, the SASR approach is first developed to visualize a coding-related feature, i.e., the Triplet Periodicity (TP) or 3bp (base pairs) periodicity, in DNA sequences. Applications to real genomic datasets show that, in SASR's output, the graphic patterns for coding and non-coding regions differ so significantly that the former can be visually distinguished from the latter. Such visualization by the SASR approach requires no training process, and takes advantage of the "auto-scale analysis ability" of human vision. However, as a visualization method, the SASR approach does not provide exact numerical predictions. Therefore, a T-Z-T approach is developed to extract numerical information from SASR's graphic result. The combination of the SASR and the T-Z-T provides computational predictions of coding regions without any training process. Moreover, the predictions from this SASR-based approach are more robust than those from commonly used methods based on Hidden Markov Models (HMMs), since this new approach is not sensitive to input errors contained in DNA sequences. 
Experimental studies on nucleosome positioning have revealed the preference of nucleosome binding for certain regions of a DNA sequence. However, it is still not clear whether or not such a binding preference is sequence-specific. Therefore, the study of the relationship between sequence features and nucleosome formation is of great significance. A major concern in this issue is the ~10bp periodicity property in DNA sequences, which appears to be associated with the structure of DNA helices and the formation of nucleosomes. In this study, the original SASR approach is extended to investigate the relationship between nucleosome formation and the ~10bp periodicity of dinucleotides in DNA sequences. A Genetic Algorithm (GA)-based method is developed to identify which dinucleotide combination has the ~10bp periodicity most strongly connected with nucleosome formation. The results from the GA support the "sequence-specific" argument of nucleosome formation. Meanwhile, they also suggest that some dinucleotides connect their ~10bp periodicity with nucleosome formation only in some local regions. Moreover, the ~10bp periodicity of dinucleotides is associated not only with the occurrence of nucleosome formation, but also with the binding preference for the phase in the ~10bp period. Besides the TP and the ~10bp periodicity, other unknown periodicity properties may also be contained in DNA sequences, and may be connected with important molecular mechanisms. Investigations of new periodicity properties might help with the computational studies of sequence-specific molecular mechanisms in organisms. In this study, another extension of the SASR approach, i.e., the mature SASR, shows its ability to detect a hypothetical anti-TP property in DNA sequences. Some real DNA fragments are found with such an anti-TP property by using the mature SASR. However, the universality of this property in genomes and its biological interpretation need further investigation. |
Rights: | All rights reserved |
Access: | open access |
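The triplet periodicity described in the abstract can be illustrated with a conventional Fourier-based measure. The sketch below is not the SASR or T-Z-T method, whose details are not given in this record. It only shows the general idea of measuring spectral power at frequency 1/3 in base indicator sequences, computed over sliding windows. The function names `period_power` and `sliding_power`, the window and step sizes, and the synthetic test sequence are all assumptions made for illustration.

```python
import cmath
import random


def period_power(seq, period, symbols="ACGT"):
    """Normalised spectral power of `seq` at frequency 1/period.

    For each symbol, a 0/1 indicator sequence is formed and its Fourier
    component at frequency 1/period is computed; the squared magnitudes
    are summed over symbols and divided by the sequence length.
    (Illustrative measure only; not the thesis's SASR procedure.)
    """
    n = len(seq)
    if n == 0:
        return 0.0
    total = 0.0
    for sym in symbols:
        component = sum(
            cmath.exp(-2j * cmath.pi * i / period)
            for i, ch in enumerate(seq)
            if ch == sym
        )
        total += abs(component) ** 2
    return total / n


def sliding_power(seq, period=3, window=90, step=30):
    """Return (start, power) pairs for windows along the sequence."""
    return [
        (start, period_power(seq[start:start + window], period))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data: random background around an artificial,
    # perfectly 3-periodic stretch (not real genomic sequence).
    background = "".join(rng.choice("ACGT") for _ in range(300))
    periodic = "ATG" * 100
    for start, power in sliding_power(background + periodic + background):
        print(start, round(power, 2))
```

Windows that fall inside the artificial periodic stretch should show much higher power than those in the random background. The `period` argument is a plain number, so a value near 10 could in principle be tried for the ~10bp periodicity, although that would need a dinucleotide encoding in place of the single-base indicators assumed here.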
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
b24625413.pdf | For All Users | 3.93 MB | Adobe PDF |
Please use this identifier to cite or link to this item:
https://theses.lib.polyu.edu.hk/handle/200/6295