Md. Nuzmul Hossain Nahid, Md. Abdul Based
- Researcher, Department of CSE, Dhaka International University, Dhaka, Bangladesh
- Professor & Chairman, Department of CSE, Dhaka International University, Dhaka, Bangladesh
Abstract
Diabetes is a condition marked by excessively high blood glucose levels. Early detection allows it to be managed better, leading to a longer life and improved health. Machine learning models are useful tools for diagnosing diabetes, especially when trained on appropriate and relevant datasets. In this study, a combination of ensemble methods and nine distinct machine learning algorithms was used to develop a predictive model for diabetes diagnosis based on a publicly accessible dataset. Among the models tested, the Random Forest algorithm performed best, achieving the highest prediction accuracy of 99.75%. This result highlights the effectiveness of ensemble-based approaches in enhancing diagnostic precision and the potential of machine learning to support clinical decision-making for diabetes detection. A comparison with existing studies highlights the strength of our approach. Additionally, a user-friendly web application has been developed using the best-performing model, providing users with diabetes predictions and relevant educational videos.
Keywords: Diabetes, Machine Learning, Random Forest, Accuracy, Web Application
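The comparison described in the abstract, in which several models are scored and the most accurate one is selected, can be illustrated with a minimal sketch. It is not the authors' code. The labels, the predictions, and the three model names below are hypothetical placeholders chosen for illustration, and the helper functions `accuracy` and `rank_models` are assumptions rather than anything named in the article.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equally long")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_models(
    y_true: List[int], predictions: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Score every model on the same held-out labels, best accuracy first."""
    scores = {name: accuracy(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical held-out labels (1 = diabetic, 0 = non-diabetic).
# These are illustrative values, not the dataset used in the study.
y_test = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1]

# Hypothetical predictions from three of the compared model families.
model_predictions = {
    "random_forest": [1, 0, 0, 1, 1, 0, 1, 0, 1, 1],
    "k_nearest_neighbors": [1, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    "logistic_regression": [1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
}

ranking = rank_models(y_test, model_predictions)
best_name, best_accuracy = ranking[0]

for name, score in ranking:
    print(f"{name}: {score:.2%}")
```

Scoring every model against the same held-out labels keeps the accuracies directly comparable. In a real comparison, the predictions would come from models trained on the study's dataset rather than from hand-written lists.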
[This article belongs to Research & Reviews: A Journal of Bioinformatics ]
Md. Nuzmul Hossain Nahid, Md. Abdul Based. Enhanced Diabetes Prediction: A Comparative Study of Machine Learning Models. Research & Reviews: A Journal of Bioinformatics. 2025; 12(02):1-10. Available from: https://journals.stmjournals.com/rrjobi/article=2025/view=215788

Research & Reviews: A Journal of Bioinformatics
| Volume | 12 |
| Issue | 02 |
| Received | 27/05/2025 |
| Accepted | 17/06/2025 |
| Published | 17/07/2025 |
| Publication Time | 51 Days |