CROA-based feature selection with BERT model for detecting the offensive speech in Twitter data

R. J. Anandhi, V. S. Anusuya Devi, B. S. Kiruthika Devi*, Balasubramanian Prabhu Kavin, Gan Hong Seng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Online hate speech has flourished on social networking sites due to the widespread availability of mobile devices and other Web technologies. Extensive research has shown that online exposure to hate speech has real-world effects on marginalized communities. Methods for automatically identifying hate speech have therefore attracted significant research attention. Hate speech can target any demographic, though some populations are more vulnerable than others. Relying solely on progressive learning is insufficient for automatic hate speech identification, since it needs access to large amounts of labelled data to train a model. Inaccurate statistics on hate speech and preconceived notions have long been the biggest obstacles in hate speech research. This research provides a novel strategy for meeting these needs by combining a transfer-learning attitude-based BERT (Bidirectional Encoder Representations from Transformers) with a coral reefs optimization-based approach (CROA). CROA is a feature selection (FS) optimization strategy that mimics coral behaviours during reef settlement and growth: each candidate solution to the problem is treated as a coral trying to establish itself in the reef, the population is refined at each stage by the specialized operators of the coral reefs optimization algorithm, and finally the best solution is selected. We also use a fine-tuning method based on transfer learning to assess BERT's ability to recognize hostile contexts in social media messages. The paper evaluates the proposed approach on Twitter datasets tagged for racist, sexist, homophobic, or otherwise offensive content. The results show that our strategy achieves 5%–10% higher precision and recall than other approaches.
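The coral-reefs-optimization loop described in the abstract (candidate solutions as corals, refinement by spawning/settlement operators, selection of the best survivor) can be sketched in miniature. This is not the paper's implementation: the reef size, operator rates, and especially the fitness function are illustrative placeholders — a real system would score each feature mask by the held-out accuracy of the downstream BERT classifier, which is far too heavy for a sketch.

```python
import random

# Toy sketch of coral-reefs-optimization (CRO) style feature selection.
# A "coral" is a binary mask over features. The fitness function below is a
# hypothetical stand-in that rewards masks matching an assumed "ideal" mask;
# in the paper, fitness would come from classifier performance on the
# selected features.

N_FEATURES = 12
IDEAL = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]  # hypothetical target mask

def fitness(mask):
    # Count of positions agreeing with the assumed ideal mask.
    return sum(m == t for m, t in zip(mask, IDEAL))

def random_coral():
    return [random.randint(0, 1) for _ in range(N_FEATURES)]

def crossover(a, b):
    # "Broadcast spawning": one-point crossover of two parent corals.
    cut = random.randint(1, N_FEATURES - 1)
    return a[:cut] + b[cut:]

def mutate(mask, rate=0.1):
    # "Brooding": small asexual mutation of a single coral.
    return [1 - m if random.random() < rate else m for m in mask]

def cro_feature_selection(reef_size=20, generations=50, seed=0):
    random.seed(seed)
    reef = [random_coral() for _ in range(reef_size)]
    for _ in range(generations):
        # Produce larvae by spawning (crossover) and brooding (mutation).
        larvae = [crossover(random.choice(reef), random.choice(reef))
                  for _ in range(reef_size // 2)]
        larvae += [mutate(random.choice(reef)) for _ in range(reef_size // 4)]
        # Settlement: a larva displaces a random resident only if fitter.
        for larva in larvae:
            i = random.randrange(reef_size)
            if fitness(larva) > fitness(reef[i]):
                reef[i] = larva
        # Depredation: reinitialize the worst coral to preserve diversity.
        worst = min(range(reef_size), key=lambda i: fitness(reef[i]))
        reef[worst] = random_coral()
    # The surviving fittest coral is the selected feature mask.
    return max(reef, key=fitness)

best = cro_feature_selection()
print(best, fitness(best))
```

The settle-only-if-fitter rule is what distinguishes CRO-style reefs from a plain genetic algorithm: the reef acts as a fixed-capacity habitat where weak larvae simply fail to attach.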

Original language: English
Article number: 1122
Journal: Journal of Autonomous Intelligence
Volume: 7
Issue number: 3
Publication status: Published - 2024

Keywords

  • Twitter
  • bidirectional encoder representations from transformers
  • coral reefs optimization
  • hate speech detection
  • natural language processing
