CROA-based feature selection with BERT model for detecting the offensive speech in Twitter data

R. J. Anandhi, V. S. Anusuya Devi, B. S. Kiruthika Devi*, Balasubramanian Prabhu Kavin, Gan Hong Seng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Online hate speech has flourished on social networking sites due to the widespread availability of mobile devices and other Web technologies. Extensive research has shown that online exposure to hate speech has real-world effects on marginalized communities. Methods for automatically identifying hate speech have therefore attracted significant research attention. Hate speech can target any demographic, though some populations are more vulnerable than others. Relying solely on progressive learning is insufficient for automatic hate speech identification, since it needs access to large amounts of labelled data to train a model. Inaccurate statistics on hate speech and preconceived notions have long been the biggest obstacles in hate speech research. This research provides a novel strategy for meeting these needs by combining a transfer-learning attitude-based BERT (Bidirectional Encoder Representations from Transformers) with a coral reefs optimization-based approach (CROA). CROA is a feature selection (FS) optimization strategy that mimics coral behaviours during reef settlement and growth: each candidate solution to the problem is treated as a coral trying to establish itself in the reef, the population is refined at each stage by the specialized operators of the coral reefs optimization algorithm, and finally the best solution is selected. We also use a fine-tuning method based on transfer learning to assess BERT's ability to recognize hostile contexts in social media messages. The paper evaluates the proposed approach on Twitter datasets tagged for racist, sexist, homophobic, or otherwise offensive content. The results show that our strategy achieves 5%–10% higher precision and recall than other approaches.
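The coral-reefs-optimization loop described in the abstract (candidate solutions as corals, refinement by spawning/settlement operators, selection of the best survivor) can be sketched in miniature. This is not the paper's implementation: the reef size, operator rates, and especially the fitness function are illustrative placeholders — a real system would score each feature mask by the held-out accuracy of the downstream BERT classifier, which is far too heavy for a sketch.

```python
import random

# Toy sketch of coral-reefs-optimization (CRO) style feature selection.
# A "coral" is a binary mask over features. The fitness function below is a
# hypothetical stand-in that rewards masks matching an assumed "ideal" mask;
# in the paper, fitness would come from classifier performance on the
# selected features.

N_FEATURES = 12
IDEAL = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]  # hypothetical target mask

def fitness(mask):
    # Count of positions agreeing with the assumed ideal mask.
    return sum(m == t for m, t in zip(mask, IDEAL))

def random_coral():
    return [random.randint(0, 1) for _ in range(N_FEATURES)]

def crossover(a, b):
    # "Broadcast spawning": one-point crossover of two parent corals.
    cut = random.randint(1, N_FEATURES - 1)
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.1):
    # "Brooding": small asexual mutation of a single coral.
    return [1 - m if random.random() < rate else m for m in mask]

def cro_feature_selection(reef_size=20, generations=50, seed=0):
    random.seed(seed)
    reef = [random_coral() for _ in range(reef_size)]
    for _ in range(generations):
        # Produce larvae by spawning (crossover) and brooding (mutation).
        larvae = [crossover(random.choice(reef), random.choice(reef))
                  for _ in range(reef_size // 2)]
        larvae += [mutate(random.choice(reef)) for _ in range(reef_size // 4)]
        # Settlement: a larva displaces a random resident only if fitter.
        for larva in larvae:
            i = random.randrange(reef_size)
            if fitness(larva) > fitness(reef[i]):
                reef[i] = larva
        # Depredation: reinitialize the worst coral to preserve diversity.
        worst = min(range(reef_size), key=lambda i: fitness(reef[i]))
        reef[worst] = random_coral()
    # The surviving fittest coral is the selected feature mask.
    return max(reef, key=fitness)

best = cro_feature_selection()
print(best, fitness(best))
```

The settle-only-if-fitter rule is what distinguishes CRO-style reefs from a plain genetic algorithm: the reef acts as a fixed-capacity habitat where weak larvae simply fail to attach.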

Original language: English
Article number: 1122
Journal: Journal of Autonomous Intelligence
Volume: 7
Issue number: 3
Publication status: Published - 2024

Keywords

  • Twitter
  • bidirectional encoder representations from transformers
  • coral reefs optimization
  • hate speech detection
  • natural language processing
