Identifying Bangla deceptive news using machine learning and deep learning algorithms

Abstract

The vast majority of people today rely on internet-based resources, including news published on websites and shared on social media platforms. With the growing number of content creators, online media portals, and news portals, it has become nearly impossible to verify the veracity of news headlines or to assess them thoroughly. The overwhelming majority of fraudulent headlines contain misleading or false information. They attract more views and shares from people of all ages by using clickbait titles built on fictitious terms or false claims. These false and misleading headlines disrupt the lives of ordinary people and mislead them in numerous ways. We used recent Bangla news articles to build a model that can accurately determine the reliability of a news item. To detect fake Bangla news stories, we trained our machine learning and deep learning models on approximately 10,000 news articles. We used the BNLP and BLTK toolkits for a range of Bangla natural language processing tasks, and bn_w2v_wiki, a word embedding model for the Bangla language, to represent words as vectors. The Synthetic Minority Oversampling Technique (SMOTE) was used to correct the class imbalance in our dataset. We applied both machine learning and deep learning algorithms to the training portion of the dataset. Our deep learning model, an LSTM, performs best, with an accuracy of 91%. Our Random Forest and Support Vector Machine models also perform well enough to compete with the LSTM in predicting fake news. The other machine learning algorithms evaluated are Logistic Regression (LR), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), bagging, and boosting. Furthermore, we developed a website that takes Bangla news text as input and classifies it using our trained model.
We believe our study will go a long way towards establishing a foundation for research on Bangla, a low-resource language, and will open new doors for future study.

Description

Cataloged from PDF version of thesis.
Includes bibliographical references (pages 53-55).
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023.

Type

Thesis