Multifractal complexity analysis-based dynamic media text categorization models by natural language processing with BERT

Yeliz Karaca, Yu Dong Zhang, Shui Hua Wang, Ahu Dereli Dursun

Research output: Chapter in Book or Report/Conference proceeding › Chapter › peer-review

1 Citation (Scopus)

Abstract

Fractals, being essentially mathematical constructs, are forms that embody the fundamental features of dynamism, self-organization, self-similarity and complexity. Lexical items and parts of sentences can be understood as the constituents of schemata, each with a particular pattern made up of interacting elements. Multifractal methods are among the best-known means of detecting and analyzing self-repeating patterns, and they have numerous applications in many areas, including computational linguistics. The predominance of properties such as self-similarity, irregularity and vagueness in texts adds to the challenge of conveying meaning clearly and accurately. The ever-increasing amount of text data in different categories also contributes to this inherent complexity, because such data are unstructured, noisy and nonstandard. To address this challenge and complexity, this study aimed to ensure regularity and self-similarity within the complex digital media texts that comprise the dataset by means of multifractal methods (multifractal Bayesian, multifractal regularization and multifractal wavelet shrinkage), and to attain accurate classification and categorization of the words within those texts by means of Bidirectional Encoder Representations from Transformers (BERT) as the Natural Language Processing (NLP) method. The steps of the proposed integrative method are as follows. First, regularity enhancement was attained by applying the three multifractal methods to the text dataset, which generated three new datasets containing the significant, self-similar and regular attributes. Subsequently, BERT was applied to the original text dataset and to the three new datasets for classification. In this way, accurate word detection within the text for category classification was ensured for the analyses.
The BERT analysis results for the original text dataset and the three new datasets were compared, and the best result was achieved with the multifractal Bayesian method. Through this integrated scheme, we have set out the significance of the behavioral patterns of fractals while highlighting the distinctive quality of BERT, owing to its classification accuracy and its adaptability to integrated methodologies.
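
The comparison step described above (score one classifier on the original dataset and on each regularized variant, then keep the best) might be sketched as follows. This is only an illustration under stated assumptions: `keyword_classifier` is a hypothetical stand-in for a fine-tuned BERT classifier, the function names are invented here, and the toy samples are made up for the example rather than taken from the chapter's dataset.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, str]  # (text, gold category label)


def accuracy(classify: Callable[[str], str], samples: List[Sample]) -> float:
    """Fraction of samples whose predicted category matches the gold label."""
    if not samples:
        raise ValueError("cannot score an empty dataset")
    correct = sum(1 for text, label in samples if classify(text) == label)
    return correct / len(samples)


def compare_variants(
    classify: Callable[[str], str],
    variants: Dict[str, List[Sample]],
) -> Tuple[str, Dict[str, float]]:
    """Score the same classifier on every dataset variant; return the best one."""
    scores = {name: accuracy(classify, samples) for name, samples in variants.items()}
    best = max(scores, key=scores.get)
    return best, scores


def keyword_classifier(text: str) -> str:
    # Hypothetical placeholder: a real pipeline would call a BERT model here.
    return "sports" if "match" in text.lower() else "politics"


# Toy, made-up data: one "original" set and one hypothetical regularized variant.
variants = {
    "original": [
        ("The match ended late", "sports"),
        ("Vote count delayed", "politics"),
        ("Team wins cup", "sports"),
    ],
    "regularized": [
        ("The match ended late", "sports"),
        ("Vote count delayed", "politics"),
        ("Team wins cup match", "sports"),
    ],
}

best_name, scores = compare_variants(keyword_classifier, variants)
```

Keeping the classifier behind a plain `Callable[[str], str]` means the same harness could score any model on any number of dataset variants; it says nothing about how the multifractal regularization itself is computed.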

Original language: English
Title of host publication: Multi-Chaos, Fractal and Multi-Fractional Artificial Intelligence of Different Complex Systems
Publisher: Elsevier
Pages: 95-115
Number of pages: 21
ISBN (Electronic): 9780323900324
ISBN (Print): 9780323886161
Publication status: Published - 1 Jan 2022
Externally published: Yes

Keywords

  • Automatic and effective classification
  • BERT
  • Complex textual analysis of lexical items
  • Complexity
  • Fractals
  • Hidden patterns
  • Hölder regularity
  • Multi-fractal complexity
  • Multifractal analysis
  • Self-similarity and irregularity
