TY - JOUR
T1 - A Natural Language Processing (NLP) Evaluation on COVID-19 Rumour Dataset Using Deep Learning Techniques
AU - Fatima, Rubia
AU - Samad Shaikh, Naila
AU - Riaz, Adnan
AU - Ahmad, Sadique
AU - El-Affendi, Mohammed A.
AU - Alyamani, Khaled A.Z.
AU - Nabeel, Muhammad
AU - Ali Khan, Javed
AU - Yasin, Affan
AU - Latif, Rana M. Amir
N1 - Publisher Copyright:
© 2022 Rubia Fatima et al.
PY - 2022
Y1 - 2022
N2 - Context and Background: Since December 2019, the coronavirus (COVID-19) epidemic has sparked considerable alarm among the general public and significantly affected societal attitudes and perceptions. Apart from the disease itself, many people suffer from anxiety and depression caused by the disease and the constant threat of an outbreak. Owing to the fast propagation of the virus and of misleading/fake information, public discourse shifts, resulting in significant confusion in some regions. Rumours are unproven facts or stories that spread and promote sentiments of prejudice, hatred, and fear. Objective. The study's objective is to propose a novel solution for detecting fake news using state-of-the-art machine learning and deep learning models, and to analyse which models perform best at detecting it. Method. In this study, we adopted a COVID-19 rumour dataset that incorporates rumours from news websites and tweets, together with information about those rumours. The data are analysed using Natural Language Processing (NLP) and Deep Learning (DL) approaches, and the effectiveness of the ML and DL algorithms is assessed using accuracy, precision, recall, and F1 score. Results. The dataset, adopted from the source cited in the paper, comprises 9,200 comments from Google and 34,779 Twitter posts filtered for phrases connected with COVID-19-related fake news. Experiment 1. The dataset was assessed using three criteria: veracity, stance, and sentiment. Each criterion has its own labels, and we applied the DL algorithms to each one separately. Two models were used in this experiment: (i) LSTM and (ii) Temporal Convolutional Networks (TCN). The TCN model performed better on every evaluation metric, so we used it for the practical implementation. Experiment 2. In the second experiment, we used several state-of-the-art deep learning models: (i) Simple RNN; (ii) LSTM + Word Embedding; (iii) Bidirectional + Word Embedding; (iv) LSTM + CNN-1D; and (v) BERT. We evaluated these models on all three label sets, i.e., veracity, stance, and sentiment. Based on this second evaluation, BERT outperformed all the other models compared.
AB - Context and Background: Since December 2019, the coronavirus (COVID-19) epidemic has sparked considerable alarm among the general public and significantly affected societal attitudes and perceptions. Apart from the disease itself, many people suffer from anxiety and depression caused by the disease and the constant threat of an outbreak. Owing to the fast propagation of the virus and of misleading/fake information, public discourse shifts, resulting in significant confusion in some regions. Rumours are unproven facts or stories that spread and promote sentiments of prejudice, hatred, and fear. Objective. The study's objective is to propose a novel solution for detecting fake news using state-of-the-art machine learning and deep learning models, and to analyse which models perform best at detecting it. Method. In this study, we adopted a COVID-19 rumour dataset that incorporates rumours from news websites and tweets, together with information about those rumours. The data are analysed using Natural Language Processing (NLP) and Deep Learning (DL) approaches, and the effectiveness of the ML and DL algorithms is assessed using accuracy, precision, recall, and F1 score. Results. The dataset, adopted from the source cited in the paper, comprises 9,200 comments from Google and 34,779 Twitter posts filtered for phrases connected with COVID-19-related fake news. Experiment 1. The dataset was assessed using three criteria: veracity, stance, and sentiment. Each criterion has its own labels, and we applied the DL algorithms to each one separately. Two models were used in this experiment: (i) LSTM and (ii) Temporal Convolutional Networks (TCN). The TCN model performed better on every evaluation metric, so we used it for the practical implementation. Experiment 2. In the second experiment, we used several state-of-the-art deep learning models: (i) Simple RNN; (ii) LSTM + Word Embedding; (iii) Bidirectional + Word Embedding; (iv) LSTM + CNN-1D; and (v) BERT. We evaluated these models on all three label sets, i.e., veracity, stance, and sentiment. Based on this second evaluation, BERT outperformed all the other models compared.
UR - http://www.scopus.com/inward/record.url?scp=85138658161&partnerID=8YFLogxK
U2 - 10.1155/2022/6561622
DO - 10.1155/2022/6561622
M3 - Article
C2 - 36156967
AN - SCOPUS:85138658161
SN - 1687-5265
VL - 2022
JO - Computational Intelligence and Neuroscience
JF - Computational Intelligence and Neuroscience
M1 - 6561622
ER -