M. Prasad,
Rajarao PBV,
P. Kiran Sree,
P. Gowthami,
K. Ajita Lakshmi,
B. Subrahmanyam,
- Associate Professor, Department of Computer Science Engineering, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
- Associate Professor, Department of Computer Science Engineering, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
- Professor and Head, Department of Computer Science Engineering, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
- Student, Department of Computer Science Engineering, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
- Assistant Professor, Department of Electronics and Communication Engineering, Shri Vishnu Engineering College for Women (A), Bhimavaram, Andhra Pradesh, India
- Assistant Professor, Department of MCA, BVC Institute of Technology and Science (A), Batlapalem, Andhra Pradesh, India
Abstract
Social media platforms such as X (formerly Twitter), Instagram, and Facebook are now widely used for various purposes. Through these platforms, we share our opinions, ideas, and feelings. Generally, the data obtained from the internet is constructive; however, a significant proportion of it is toxic. The datasets are therefore filtered to remove noise. The study begins with the upload and preprocessing of a toxic comment dataset, in which the text is cleaned by eliminating stop words and special symbols to produce a standardized corpus. A count vectorizer is then applied to capture word occurrences and construct a feature matrix for training supervised algorithms, including support vector machine, logistic regression, naive Bayes, random forest, decision tree, and k-nearest neighbors. These algorithms are systematically implemented, and each is assessed by computing its accuracy in separating toxic from non-toxic comments. The examination concludes with an accuracy graph that visually contrasts the performance of the supervised algorithms, which helps identify the most effective model for online toxic comment classification. The multiheaded model covers the toxic, severe-toxic, obscene, threat, and insult categories, with toxicity prediction based on confusion metrics. The practical implication of this study lies in equipping online platforms with a robust tool to automatically detect and manage toxic comments, contributing to a safer and more constructive digital environment.
Keywords: Toxic comments, support vector machine (SVM), k-nearest neighbors (KNN), multiheaded model, supervised learning
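The pipeline summarized in the abstract (text cleaning, count vectorization, supervised training, accuracy measurement) can be sketched in a few lines. The following is only an illustrative sketch and not the authors' implementation: it uses plain Python and a small multinomial naive Bayes classifier, which is one of the six algorithms named above. The stop-word list, the toy comments, and every function name are assumptions made up for this example; in practice a library such as scikit-learn and a full labeled dataset would likely be used.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list (assumption, for illustration only).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "i", "this", "that", "and", "to", "of"}

def clean(text):
    """Lowercase the text, replace special symbols with spaces, drop stop words."""
    tokens = re.sub(r"[^a-z\s]", " ", text.lower()).split()
    return [t for t in tokens if t not in STOP_WORDS]

def count_vectorize(docs):
    """Build a sorted vocabulary and a document-term count matrix."""
    tokenized = [clean(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {t: i for i, t in enumerate(vocab)}
    matrix = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for t in doc:
            row[index[t]] += 1
        matrix.append(row)
    return vocab, matrix

def train_naive_bayes(matrix, labels):
    """Fit multinomial naive Bayes with add-one (Laplace) smoothing."""
    n_features = len(matrix[0])
    model = {}
    for c in set(labels):
        rows = [r for r, y in zip(matrix, labels) if y == c]
        totals = [sum(col) for col in zip(*rows)]  # per-word counts in class c
        denom = sum(totals) + n_features
        log_prior = math.log(len(rows) / len(labels))
        log_likelihood = [math.log((n + 1) / denom) for n in totals]
        model[c] = (log_prior, log_likelihood)
    return model

def predict(model, vocab, text):
    """Return the class with the highest log-posterior; unseen words are ignored."""
    index = {t: i for i, t in enumerate(vocab)}
    counts = Counter(clean(text))
    scores = {}
    for c, (log_prior, log_likelihood) in model.items():
        scores[c] = log_prior + sum(
            n * log_likelihood[index[t]] for t, n in counts.items() if t in index
        )
    return max(scores, key=scores.get)

# Toy data invented for this sketch: label 1 = toxic, 0 = non-toxic.
TRAIN = [
    ("You are an idiot and a loser", 1),
    ("I hate you, stupid idiot!", 1),
    ("Shut up, you stupid loser", 1),
    ("Thank you for this helpful article", 0),
    ("Great explanation, very helpful", 0),
    ("I agree, this is a great point", 0),
]
TEST = [("What a stupid idiot", 1), ("Very helpful, thank you", 0)]

vocab, matrix = count_vectorize([text for text, _ in TRAIN])
model = train_naive_bayes(matrix, [label for _, label in TRAIN])

# Accuracy: fraction of held-out comments classified correctly.
accuracy = sum(predict(model, vocab, text) == label for text, label in TEST) / len(TEST)
print(f"accuracy on toy test set: {accuracy:.2f}")
```

Repeating the training and accuracy steps for each of the other classifiers would give the per-model accuracies that the accuracy graph compares. For the multiheaded setting, one such binary classifier could be trained per category.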
[This article belongs to Journal of Mobile Computing, Communications & Mobile Networks]
M. Prasad, Rajarao PBV, P. Kiran Sree, P. Gowthami, K. Ajita Lakshmi, B. Subrahmanyam. A Supervised Learning Approach for Toxic Comment Detection on Social Media Platforms. Journal of Mobile Computing, Communications & Mobile Networks. 2024; 11(02):7-14. Available from: https://journals.stmjournals.com/jomccmn/article=2024/view=155358

Journal of Mobile Computing, Communications & Mobile Networks
| Volume | 11 |
| Issue | 02 |
| Received | 03/05/2024 |
| Accepted | 09/05/2024 |
| Published | 05/07/2024 |