Prediction of MicroRNA Precursors Using Parsimonious Feature Sets.

Title: Prediction of MicroRNA Precursors Using Parsimonious Feature Sets.
Publication Type: Journal Article
Year of Publication: 2014
Authors: Stepanowsky P, Levy E, Kim J, Jiang X, Ohno-Machado L
Journal: Cancer Inform
Issue: Suppl 1
Date Published: 2014

MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate gene expression through base pairing with messenger RNAs. Due to the interest in studying miRNA dysregulation in disease and limits of validated miRNA references, identification of novel miRNAs is a critical task. The performance of different models to predict novel miRNAs varies with the features chosen as predictors. However, no study has systematically compared published feature sets. We constructed a comprehensive feature set using the minimum free energy of the secondary structure of precursor miRNAs, a set of nucleotide-structure triplets, and additional extracted sequence and structure characteristics. We then compared the predictive value of our comprehensive feature set to those from three previously published studies, using logistic regression and random forest classifiers. We found that classifiers containing as few as seven highly predictive features are able to predict novel precursor miRNAs as well as classifiers that use larger feature sets. In a real data set, our method correctly identified the holdout miRNAs relevant to renal cancer.
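As an illustration of the "nucleotide-structure triplets" mentioned in the abstract, the sketch below computes one commonly used encoding: for each interior position, the middle nucleotide (A, C, G, U) is combined with the paired/unpaired state of that position and its two neighbors, giving 32 possible features. The abstract does not specify the exact encoding the authors used, so this scheme, the function name `triplet_features`, and the example sequence are assumptions for illustration only. The dot-bracket structure is hardcoded here; in practice it would come from an RNA folding tool.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
STATES = "(."  # "(" = paired, "." = unpaired


def triplet_features(sequence, structure):
    """Return normalized frequencies of the 32 nucleotide-structure triplets.

    Hypothetical helper: `structure` is a dot-bracket string of the same
    length as `sequence`. Both bracket directions are treated as "paired".
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")

    seq = sequence.upper().replace("T", "U")
    states = structure.replace(")", "(")

    # One counter per (middle nucleotide, three-position structure) pair.
    counts = {
        n + "".join(s): 0
        for n in NUCLEOTIDES
        for s in product(STATES, repeat=3)
    }

    # Slide a window of three positions; skip windows with unknown symbols.
    for i in range(1, len(seq) - 1):
        key = seq[i] + states[i - 1:i + 2]
        if key in counts:
            counts[key] += 1

    total = sum(counts.values())
    return {k: (v / total if total else 0.0) for k, v in counts.items()}


# Toy hairpin (assumed example, not data from the study).
features = triplet_features("GGGAAACCC", "(((...)))")
```

The resulting dictionary can be flattened into a fixed-length vector and combined with other predictors, such as the minimum free energy, before being passed to a classifier.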

Alternate Title: Cancer Inform
PubMed ID: 25392687
PubMed Central ID: PMC4216048
Grant List: T15 LM011271 / LM / NLM NIH HHS / United States