Enhancing email management and filtering through naive bayes based spam detection : a proposed email application solution

Abstract

Spam emails make up almost half of global mail traffic and take up a large amount of space in users' inboxes. Moreover, malware and viruses are often embedded in these emails, either as attachments or as phishing links within disguised messages. Conversely, important emails are sometimes flagged as spam and sent to the spam folder, where they go completely unnoticed; even emails from educational institutions can meet this fate. Our goal is to build a machine learning-based email management system that not only sorts and organizes emails into user-preferred categories but also includes an improved spam detection system, so that important emails do not end up in the spam folder. Our proposed model detects and blocks spam emails by understanding their context. The handling of emails containing newsletters, advertisements, and updates can be improved by a feature that uses machine learning to filter them according to individual user preferences. Unwanted emails will be detected and blocked from entering the user's inbox, consequently saving space. We plan to build the proposed system using the Naive Bayes algorithm, a probabilistic technique grounded in Bayes' theorem, which we use to assess how relevant an email is to the user's needs and to filter out spam for improved email classification. An important benefit of this method for spam filtering is its adaptability to individual users: as more feedback is collected from a user, its predictions improve. This study explores the identification of spam and non-spam emails using the Naive Bayes algorithm. Thus, our system learns more about the user's preferences over time and can optimize its functionality accordingly.
We also aim to build a web application that makes the whole process of identifying and separating emails smoother for the user.
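
The Naive Bayes approach described in the abstract, including its ability to adapt as user feedback arrives, can be illustrated with a minimal sketch. This is not the thesis's implementation: the class name NaiveBayesSpamFilter, the whitespace tokenizer, the add-one (Laplace) smoothing, and the toy training emails are all assumptions made for illustration.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Illustrative multinomial Naive Bayes filter (hypothetical, not the thesis code)."""

    LABELS = ("spam", "ham")

    def __init__(self):
        # Per-label word frequencies and per-label email counts.
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = {label: 0 for label in self.LABELS}

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace tokenization is enough for a sketch.
        return text.lower().split()

    def update(self, text, label):
        """Learn from one labeled email, e.g. a user marking a message as ham."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def log_score(self, text, label):
        """Return log P(label) + sum of log P(word | label), with add-one smoothing."""
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        for word in self.tokenize(text):
            count = self.word_counts[label][word]
            # The extra +1 in the denominator reserves mass for unseen words.
            score += math.log((count + 1) / (total_words + len(vocab) + 1))
        return score

    def predict(self, text):
        """Pick the label with the higher posterior score."""
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


# Toy training data (invented for this example).
spam_filter = NaiveBayesSpamFilter()
spam_filter.update("win free money now", "spam")
spam_filter.update("free prize claim now", "spam")
spam_filter.update("meeting schedule for thesis review", "ham")
spam_filter.update("lecture notes attached for class", "ham")

before_feedback = spam_filter.predict("free webinar")
# Simulated user feedback: this kind of email is wanted, so it is labeled ham.
spam_filter.update("free webinar from the university", "ham")
after_feedback = spam_filter.predict("free webinar")
```

Because the model stores only counts, each piece of user feedback is a cheap incremental update, which is one way the per-user adaptability mentioned above could be realized.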

Description

Cataloged from PDF version of thesis.
Includes bibliographical references (pages 59-60).
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science, 2023.

Type

Thesis