Effect of Corpora on Classification of Fake News using Naive Bayes Classifier

  • Farzana Islam Adiba Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
  • Tahmina Islam Department of Computer Science and Engineering, Jahangirnagar University, Dhaka, Bangladesh
  • M Shamim Kaiser Institute of Information Technology, Jahangirnagar University, Dhaka, Bangladesh
  • Mufti Mahmud Department of Computing & Technology, Nottingham Trent University, Nottingham, UK
  • Muhammad Arifur Rahman Associate Professor, Department of Physics, Jahangirnagar University, Dhaka, Bangladesh
Keywords: Fake news, Social media, Machine learning, Natural language processing, Naive Bayes

Abstract

In today's world, online platforms such as websites and social media (e.g., Facebook, Twitter, LinkedIn, YouTube and Instagram) are among the main sources of news. However, whether through a lack of proper knowledge or the deliberate activity of some people, fake news is spreading more than ever. People in general struggle to decide which news to trust and which to discard, and malicious actors take advantage of this situation by spreading false news and misleading the public. Natural Language Processing, one of the major branches of Machine Learning, has produced a remarkable wealth of research, yet this progress brings new challenges. In this work, a Naive Bayes classifier, a Bayesian machine learning algorithm, is applied to identify fake news. We show how, beyond the choice of algorithm, richer corpora can help improve performance. A dataset collected from an open source is used to classify whether a news item is authentic or not. Initially, we achieved a classification accuracy of about 87%, which is higher than previously reported accuracy, and then 92% with the same Naive Bayes algorithm using enriched corpora.
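
As an illustration of the idea only, the following minimal sketch implements a multinomial Naive Bayes text classifier with Laplace smoothing in plain Python and trains it on a small and on an enlarged corpus. It is not the implementation used in this work: the class name `NaiveBayes`, the whitespace tokenizer, the smoothing parameter `alpha` and all example headlines are assumptions invented for the illustration, and the toy data says nothing about the reported accuracies.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplistic tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in tokenize(doc):
                if w in self.vocab:  # words never seen in training are skipped
                    count = self.word_counts[label][w]
                    score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy headlines, for illustration only.
base_docs = [
    "shocking miracle cure doctors hate",
    "aliens secretly control the election",
    "government confirms annual budget report",
    "university publishes peer reviewed study",
]
base_labels = ["fake", "fake", "real", "real"]

# Additional invented documents standing in for an enriched corpus.
extra_docs = [
    "vaccine hoax exposed by anonymous insider",
    "health ministry publishes vaccine trial results",
]
extra_labels = ["fake", "real"]

small_model = NaiveBayes().fit(base_docs, base_labels)
enriched_model = NaiveBayes().fit(base_docs + extra_docs,
                                  base_labels + extra_labels)

print(small_model.predict("miracle cure shocking"))
print("hoax" in small_model.vocab, "hoax" in enriched_model.vocab)
print(enriched_model.predict("hoax exposed by insider"))
```

In this sketch the word "hoax" is absent from the smaller vocabulary, so the smaller model has no evidence for it, whereas the model trained on the enlarged corpus can use it. This is one plausible way in which richer corpora could help the same algorithm.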

Published
2020-10-30