Genomic mid-range inhomogeneity correlates with an abundance of RNA secondary structures.

Title: Genomic mid-range inhomogeneity correlates with an abundance of RNA secondary structures.
Publication Type: Journal Article
Year of Publication: 2008
Authors: Bechtel JM, Wittenschlaeger T, Dwyer T, Song J, Arunachalam S, Ramakrishnan SK, Shepard S, Fedorov A
Journal: BMC Genomics
Volume: 9
Pagination: 284
Date Published: 2008
ISSN: 1471-2164
Keywords: 3' Untranslated Regions; 5' Untranslated Regions; Algorithms; Animals; Base Composition; Base Sequence; Chromosomes, Human, Pair 17; Computational Biology; DNA, Intergenic; Exons; Genome, Human; Humans; Introns; Molecular Sequence Data; Nucleic Acid Conformation; RNA Precursors; Thermodynamics
Abstract

BACKGROUND: Genomes possess different levels of non-randomness, in particular, an inhomogeneity in their nucleotide composition. Inhomogeneity is manifest from the short range, where neighboring nucleotides influence the choice of base at a site, to the long range, commonly known as isochores, where a particular base composition can span millions of nucleotides. A separate genomic issue that has yet to be thoroughly elucidated is the role that RNA secondary structure (SS) plays in gene expression.

RESULTS: We present novel data and approaches that show that a mid-range inhomogeneity (~30 to 1000 nt) not only exists in mammalian genomes but is also significantly associated with strong RNA SS. A whole-genome bioinformatics investigation of local SS in a set of 11,315 non-redundant human pre-mRNA sequences has been carried out. Four distinct components of these molecules (5'-UTRs, exons, introns and 3'-UTRs) were considered separately, since they differ in overall nucleotide composition, sequence motifs and periodicities. For each pre-mRNA component, the abundance of strong local SS (< -25 kcal/mol) was a factor of two to ten greater than a random expectation model. The randomization process preserves the short-range inhomogeneity of the corresponding natural sequences, thus eliminating short-range signals as possible contributors to any observed phenomena.

CONCLUSION: We demonstrate that the excess of strong local SS in pre-mRNAs is linked to the little-explored phenomenon of genomic mid-range inhomogeneity (MRI). MRI is an interdependence between nucleotide choice and base composition over a distance of 20-1000 nt. Additionally, we have created a public computational resource to support further study of genomic MRI.
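
Illustrative note (not part of the published abstract): the idea of comparing natural sequences against a randomization that keeps short-range signals can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' method or their public resource. The first-order Markov chain used as the "short-range preserving" randomization, the 100 nt window, the toy block-structured sequence, and all function names (`markov_randomize`, `toy_sequence`, `window_gc`, `variance`) are hypothetical choices made for illustration. No RNA folding energies are computed here; the sketch only shows how mid-range compositional variation can exceed that of a randomized control.

```python
import random
from collections import defaultdict


def markov_randomize(seq, rng):
    """Return a random sequence of the same length, generated by a first-order
    Markov chain whose transitions are sampled from those observed in seq.
    Assumption: this stands in for a randomization that keeps short-range
    (nearest-neighbor) signals while discarding longer-range structure."""
    transitions = defaultdict(list)
    for current, following in zip(seq, seq[1:]):
        transitions[current].append(following)
    out = [seq[0]]
    while len(out) < len(seq):
        # Fall back to overall composition if a symbol has no observed successor.
        choices = transitions.get(out[-1]) or list(seq)
        out.append(rng.choice(choices))
    return "".join(out)


def toy_sequence(rng, blocks=10, block_len=200):
    """Build a toy sequence of alternating GC-rich and AT-rich blocks
    (hypothetical data with deliberate mid-range inhomogeneity)."""
    gc_rich = "GGGGCCCCAT"  # 80% G+C
    at_rich = "AAAATTTTGC"  # 20% G+C
    parts = []
    for i in range(blocks):
        alphabet = gc_rich if i % 2 == 0 else at_rich
        parts.append("".join(rng.choice(alphabet) for _ in range(block_len)))
    return "".join(parts)


def window_gc(seq, window):
    """G+C fraction of each non-overlapping window of the given size."""
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, window)
    ]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


rng = random.Random(0)
natural = toy_sequence(rng)
randomized = markov_randomize(natural, rng)
# Window-to-window G+C variance: toy "natural" sequence vs. randomized control.
print(variance(window_gc(natural, 100)), variance(window_gc(randomized, 100)))
```

In this toy setting the block-structured sequence should show a larger window-to-window G+C variance than its randomized counterpart, which is the kind of contrast a mid-range inhomogeneity analysis looks for. A real analysis would use actual pre-mRNA sequences and a folding program to count strong local structures.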

DOI: 10.1186/1471-2164-9-284
PubMed URL: http://www.ncbi.nlm.nih.gov/pubmed/18549495?dopt=Abstract
PMC: PMC2442090
Alternate Journal: BMC Genomics
PubMed ID: 18549495