TY - JOUR
T1 - Self-attention enabled deep learning of dihydrouridine (D) modification on mRNAs unveiled a distinct sequence signature from tRNAs
AU - Wang, Yue
AU - Wang, Xuan
AU - Cui, Xiaodong
AU - Meng, Jia
AU - Rong, Rong
N1 - Funding Information:
This project was supported by the National Science Foundation for Young Scientists of China (grant no. 62003273) and Natural Science Basic Research Program of Shaanxi (program no. 2020JQ-217).
Publisher Copyright:
© 2023 The Author(s)
PY - 2023/3/14
Y1 - 2023/3/14
N2 - Dihydrouridine (D) is a modified pyrimidine nucleotide universally found in viral, prokaryotic, and eukaryotic species. It serves as a metabolic modulator for various pathological conditions, and its elevated levels in tumors are associated with a series of cancers. Precise identification of D sites on RNA is vital for understanding its biological function. A number of computational approaches have been developed for predicting D sites on tRNAs; however, none have considered mRNAs. We present here DPred, the first computational tool for predicting D on mRNAs in yeast from the primary RNA sequences. Built on a local self-attention layer and a convolutional neural network (CNN) layer, the proposed deep learning model outperformed classic machine learning approaches (random forest, support vector machines, etc.) and achieved reasonable accuracy and reliability with areas under the curve of 0.9166 and 0.9027 in jackknife cross-validation and on an independent testing dataset, respectively. Importantly, we showed that distinct sequence signatures are associated with the D sites on mRNAs and tRNAs, implying potentially different formation mechanisms and putative divergent functionality of this modification on the two types of RNA. DPred is available as a user-friendly Web server.
KW - CNN
KW - dihydrouridine
KW - epitranscriptomic mark
KW - local self-attention
KW - MT: bioinformatics
KW - RNA modification
KW - sequence-derived features
UR - http://www.scopus.com/inward/record.url?scp=85147686750&partnerID=8YFLogxK
U2 - 10.1016/j.omtn.2023.01.014
DO - 10.1016/j.omtn.2023.01.014
M3 - Article
AN - SCOPUS:85147686750
SN - 2162-2531
VL - 31
SP - 411
EP - 420
JO - Molecular Therapy - Nucleic Acids
JF - Molecular Therapy - Nucleic Acids
ER -