Multifractal complexity analysis-based dynamic media text categorization models by natural language processing with BERT

Yeliz Karaca; Yu Dong Zhang; Shui Hua Wang; Ahu Dereli Dursun

doi:10.1016/B978-0-323-90032-4.00012-2

Research output: Chapter in Book or Report/Conference proceeding › Chapter › peer-review

2 Citations (Scopus)

Abstract

Fractals, being essentially mathematical constructs, are forms that embody the fundamental features of dynamism, self-organization, self-similarity and complexity. Lexical items and parts of sentences can be understood as the constituents of schemata with a particular pattern made up of interacting elements. Multifractal methods are among the best-known means of detecting and analyzing self-repeating patterns, and they have numerous applications in many areas, including computational linguistics. The predominance of properties like self-similarity, irregularity and vagueness in texts adds to the challenge of conveying meaning clearly and accurately. The ever-increasing amount of text data in different categories also contributes to this inherent complexity, since such data are unstructured, noisy and nonstandard. To address this challenge, this study aims to ensure regularity and self-similarity within the complex digital media texts that comprise the dataset by means of multifractal methods (multifractal Bayesian, multifractal regularization and multifractal wavelet shrinkage), and to attain accurate classification and categorization of the words within those texts with Bidirectional Encoder Representations from Transformers (BERT) as the Natural Language Processing (NLP) method. The steps of the proposed integrative method are as follows. First, regularity enhancement was attained by applying the three multifractal methods to the text dataset, which generated three new datasets retaining the significant, self-similar and regular attributes. Subsequently, BERT was applied to the original text dataset as well as to the three new datasets for classification. In this way, accurate word detection within the text was ensured for the category classification. The BERT results for the original dataset and the new datasets were compared, and the best result was achieved with the multifractal Bayesian method. Through this integrated scheme, we have highlighted the significance of the behavioral patterns of fractals while setting forth the distinctive quality of BERT, owing to its classification accuracy and its adaptability to integrated methodologies.
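
The regularity-enhancement step can be illustrated with a minimal sketch. The abstract does not say how texts are encoded as numeric series or how the multifractal variants are parameterized, so the code below swaps in plain one-level Haar wavelet shrinkage with soft thresholding as a stand-in; it is not the chapter's multifractal wavelet shrinkage. The function names and the `threshold` parameter are hypothetical, and the input is assumed to be an even-length numeric series already derived from a text.

```python
import numpy as np


def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length 1-D series."""
    x = np.asarray(signal, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # pairwise averages (scaled)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # pairwise differences (scaled)
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed samples."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


def soft_threshold(coeffs, threshold):
    """Shrink coefficients toward zero by `threshold`, zeroing the small ones."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)


def wavelet_shrink(signal, threshold):
    """Smooth a series by shrinking its Haar detail coefficients.

    Assumption: this generic shrinkage only stands in for the multifractal
    wavelet shrinkage named in the abstract, whose details are not given there.
    """
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))
```

With a threshold of zero the series is reconstructed unchanged, and with a very large threshold each adjacent pair collapses to its mean. In a pipeline of the kind the abstract describes, the smoothed series would form one of the regularized datasets passed to the classifier alongside the original.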

Original language	English
Title of host publication	Multi-Chaos, Fractal and Multi-Fractional Artificial Intelligence of Different Complex Systems
Publisher	Elsevier
Pages	95-115
Number of pages	21
ISBN (Electronic)	9780323900324
ISBN (Print)	9780323886161
DOIs	https://doi.org/10.1016/B978-0-323-90032-4.00012-2
Publication status	Published - 1 Jan 2022
Externally published	Yes

Keywords

Automatic and effective classification
BERT
Complex textual analysis of lexical items
Complexity
Fractals
Hidden patterns
Hölder regularity
Multi-fractal complexity
Multifractal analysis
Self-similarity and irregularity

Access to Document

10.1016/B978-0-323-90032-4.00012-2

Cite this

@inbook{51913ee0b75846849aa9646847510c9f,
title = "Multifractal complexity analysis-based dynamic media text categorization models by natural language processing with BERT",


keywords = "Automatic and effective classification, BERT, Complex textual analysis of lexical items, Complexity, Fractals, Hidden patterns, H{\"o}lder regularity, Multi-fractal complexity, Multifractal analysis, Self-similarity and irregularity",
author = "Karaca, Yeliz and Zhang, {Yu Dong} and Wang, {Shui Hua} and Dursun, {Ahu Dereli}",
year = "2022",
month = jan,
day = "1",
doi = "10.1016/B978-0-323-90032-4.00012-2",
language = "English",
isbn = "9780323886161",
pages = "95--115",
booktitle = "Multi-Chaos, Fractal and Multi-Fractional Artificial Intelligence of Different Complex Systems",
publisher = "Elsevier",
}

Multifractal complexity analysis-based dynamic media text categorization models by natural language processing with BERT. / Karaca, Yeliz; Zhang, Yu Dong; Wang, Shui Hua et al.
Multi-Chaos, Fractal and Multi-Fractional Artificial Intelligence of Different Complex Systems. Elsevier, 2022. p. 95-115.

