Graphical Representation and Similarity Analysis of DNA Sequences Based on Trigonometric Functions

Guo Sen Xie*, Xiao Bo Jin, Chunlei Yang, Jiexin Pu, Zhongxi Mo

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

13 Citations (Scopus)

Abstract

In this paper, we propose two four-base related 2D curves of DNA primary sequences (termed F-B curves) and their corresponding single-base related 2D curves (termed A-related, G-related, T-related and C-related curves). These graphical curves are constructed by assigning each base to one of four different sinusoidal (or tangent) functions. Connecting all the points on the four sinusoidal (tangent) functions yields the F-B curves; similarly, connecting the points on each of the four functions separately yields the single-base related 2D curves. The proposed 2D curves are all strictly non-degenerate. An 8-component characteristic vector, based on the normalized geometric centers of the proposed curves, is then constructed to compare the similarity of DNA sequences from different species. As examples, we examine the similarity of the coding sequences of the first exon of the beta-globin gene from eleven species, of the cDNA sequences of the beta-globin gene from eight species, and of the whole mitochondrial genomes of 18 eutherian mammals. The experimental results demonstrate the effectiveness of the proposed method.
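The abstract does not state the exact functions, the normalization, or the distance measure, so the following is only a minimal sketch of the general idea under assumed choices. The vertical offsets in `OFFSETS`, the division of the x-coordinate by sequence length, the reading of the 8 components as the (x, y) centers of the four single-base curves, and the Euclidean distance are all hypothetical and are not taken from the paper.

```python
import math

# ASSUMPTION: each base is assigned its own sinusoid y = sin(i) + offset.
# These offsets are illustrative only; the paper's actual functions are not
# given in the abstract.
OFFSETS = {"A": 0.0, "G": 3.0, "T": 6.0, "C": 9.0}


def fb_curve(seq):
    """Points of a four-base curve: position i (1-based) -> (i, sin(i) + offset)."""
    return [
        (i, math.sin(i) + OFFSETS[b])
        for i, b in enumerate(seq.upper(), start=1)
        if b in OFFSETS
    ]


def single_base_curve(seq, base):
    """Points of the curve for one base only (e.g. the A-related curve)."""
    return [
        (i, math.sin(i) + OFFSETS[base])
        for i, b in enumerate(seq.upper(), start=1)
        if b == base
    ]


def normalized_center(points, n):
    """Geometric center of the points.

    ASSUMPTION: only the x-coordinate is normalized, by the sequence length n.
    """
    if not points:
        return (0.0, 0.0)
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    return (mean_x / n, mean_y)


def characteristic_vector(seq):
    """8 components: (x, y) centers of the A-, G-, T- and C-related curves.

    ASSUMPTION: this is one plausible reading of the 8-component vector.
    """
    vec = []
    for base in "AGTC":
        vec.extend(normalized_center(single_base_curve(seq, base), len(seq)))
    return vec


def distance(seq1, seq2):
    """Euclidean distance between characteristic vectors (assumed measure)."""
    return math.dist(characteristic_vector(seq1), characteristic_vector(seq2))
```

Under these assumptions, `distance("ATGC", "ATGC")` is zero and a smaller distance between two sequences would be read as greater similarity. `fb_curve` returns the points that would be connected to draw a four-base curve.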

Original language: English
Pages (from-to): 113-133
Number of pages: 21
Journal: Acta Biotheoretica
Volume: 66
Issue number: 2
Publication status: Published - 1 Jun 2018
Externally published: Yes

Keywords

  • DNA sequences
  • Similarity
  • Sinusoidal function
  • Tangent function
