ISGm1A: Integration of Sequence Features and Genomic Features to Improve the Prediction of Human m1A RNA Methylation Sites

Lian Liu, Xiujuan Lei*, Jia Meng, Zhen Wei

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

As a new epitranscriptomic modification, N1-methyladenosine (m1A) plays an important role in the regulation of gene expression. Although several computational methods have been proposed to predict m1A modification sites, all of them base their machine learning predictions on nucleotide sequence features alone and miss the layer of information carried by transcript topology and RNA secondary structure. To improve the prediction of m1A RNA methylation, we propose a computational framework, ISGm1A, which stands for integration of sequence features and genomic features to improve the prediction of human m1A RNA methylation sites. Built on the random forest algorithm, ISGm1A combines conventional sequence features with 75 genomic features to improve the prediction of m1A sites in humans. The results of five-fold cross-validation and an independent test show that ISGm1A outperforms other prediction algorithms (AUC = 0.903 and 0.909, respectively). In addition, by analyzing feature importance, we found that the genomic features play a more important role in site prediction than the sequence features. Furthermore, with ISGm1A, we generated a high-accuracy map of m1A by predicting all adenine sites in the transcriptome. The data and the results of the study are freely accessible at: https://github.com/lianliu09/m1a_prediction.git.
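The feature integration and AUC evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot sequence encoding, the function names (`one_hot_encode`, `integrate_features`, `auc`) and the toy inputs are assumptions made for illustration, and the random forest classifier itself is not reproduced here. The sketch only shows how a sequence-derived vector might be concatenated with a genomic feature vector for one candidate site, and how an AUC value can be computed from predicted scores.

```python
from __future__ import annotations

from typing import Sequence

NUCLEOTIDES = "ACGU"


def one_hot_encode(window: str) -> list[float]:
    """One-hot encode an RNA window (assumed encoding, not taken from the paper).

    DNA-style 'T' is treated as 'U'; any other unknown base (e.g. 'N')
    becomes an all-zero block of four values.
    """
    features: list[float] = []
    for base in window.upper().replace("T", "U"):
        features.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)
    return features


def integrate_features(window: str, genomic: Sequence[float]) -> list[float]:
    """Concatenate sequence features and genomic features into one vector."""
    return one_hot_encode(window) + [float(g) for g in genomic]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical site: a 5-nt window plus three made-up genomic features.
    vector = integrate_features("ACGUA", [0.5, 1, 0])
    print(len(vector))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

In a fuller pipeline, vectors built this way for every candidate adenine would be passed to a classifier such as a random forest, and the AUC would be averaged over the cross-validation folds.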

Original language: English
Article number: 9079809
Pages (from-to): 81971-81977
Number of pages: 7
Journal: IEEE Access
Volume: 8
DOIs
Publication status: Published - 2020

Keywords

  • Epitranscriptome
  • genomic features
  • m¹A
  • sequence features
  • site prediction
