TY - JOUR
T1 - Enhancing Epitranscriptome Module Detection from m6A-Seq Data Using Threshold-Based Measurement Weighting Strategy
AU - Chen, Kunqi
AU - Wei, Zhen
AU - Liu, Hui
AU - Magalhães, João Pedro de
AU - Rong, Rong
AU - Lu, Zhiliang
AU - Meng, Jia
N1 - Funding Information:
The authors acknowledge computational support from the UTSA Computational Systems Biology Core, funded by the National Institute on Minority Health and Health Disparities (G12MD007591) of the National Institutes of Health.
Jia Meng, Rong Rong, Zhiliang Lu, and João Pedro de Magalhães conceived the idea and designed the research; Kunqi Chen, Zhen Wei, and Hui Liu implemented the analysis. Kunqi Chen drafted the manuscript. All authors read, critically revised, and approved the final manuscript. This work was also supported by the National Natural Science Foundation of China [31671373 and 61401370], the Jiangsu University Natural Science Program [16KJB180027], and the Jiangsu Science and Technology Program [BK20140403].
Publisher Copyright:
© 2018 Kunqi Chen et al.
PY - 2018
Y1 - 2018
N2 - To date, well over 100 different types of RNA modifications, associated with various molecular functions, have been identified on diverse types of RNA molecules, and the epitranscriptome has emerged as an important layer of gene expression regulation. It is of crucial importance and increasing interest to understand, from a global perspective, how the epitranscriptome is regulated to facilitate different biological functions. This understanding may be advanced by finding biologically meaningful epitranscriptome modules that respond to upstream epitranscriptome regulators and lead to downstream biological functions. However, due to the intrinsic properties of RNA molecules, RNA modifications, and the relevant sequencing techniques, the epitranscriptome profiled by high-throughput sequencing often suffers from various artifacts, jeopardizing the effectiveness of epitranscriptome module identification with conventional approaches. To solve this problem, we developed a convenient measurement weighting strategy that can largely tolerate the artifacts of high-throughput sequencing data. We demonstrated on real data that the proposed measurement weighting strategy improves performance in epitranscriptome module discovery in terms of both module accuracy and biological significance. Although the new approach is integrated with the Euclidean distance measure in a hierarchical clustering scenario, it has great potential to be extended to other distance measures and algorithms for addressing various tasks in epitranscriptome analysis. Additionally, we show for the first time, with rigorous statistical analysis, that the epitranscriptome modules are biologically meaningful, with different GO functions enriched. This establishes the functional basis of epitranscriptome modules and fulfills a key prerequisite for functionally characterizing and deciphering the epitranscriptome and its regulation.
UR - http://www.scopus.com/inward/record.url?scp=85049320200&partnerID=8YFLogxK
U2 - 10.1155/2018/2075173
DO - 10.1155/2018/2075173
M3 - Article
C2 - 30013979
AN - SCOPUS:85049320200
SN - 2314-6133
VL - 2018
JO - BioMed Research International
JF - BioMed Research International
M1 - 2075173
ER -