This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author’s request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.
Kalyani Neve,
Dr. Padma Mishra,
Karishma Chaudhari,
Dr I. D. Paul,
- Assistant Professor, MCA Dept, G H Raisoni College of Engineering and Management, Jalgaon, India
- Associate Professor, Thakur Institute of Management Studies, Career Development & Research, India
- Assistant Professor, G H Raisoni College of Engineering and Management, Jalgaon, India
- Assistant Professor, Mechanical Engineering S.S.G.B, Bhusawal, Maharashtra, India
Abstract
Cardiovascular disease (CVD) remains a major global cause of morbidity and mortality, so early and accurate risk prediction is essential for prompt intervention and individualized treatment. This study introduces a hybrid transformer-based model that combines unstructured clinical narratives, structured data, and patient-specific lifestyle characteristics. By using transformer architectures and natural language processing, the model captures contextual, temporal, and patient-specific information, which supports a more complete view of disease progression. A dedicated explainability module supports clinical interpretability and transparency. The method builds on earlier work in time-aware LSTMs, hybrid modeling, self-attention transformers, and automated machine learning [1–20], and combines deep learning, statistical modeling, and graph-based temporal embeddings to improve prediction accuracy and clinical relevance. On benchmark datasets, the model outperformed both conventional and state-of-the-art methods, achieving a Precision@5 of 78.65%, an AUC of 0.86, and an F1-score of 0.76. Its consistent performance across comorbid conditions and demographic groups suggests that it generalizes well and is practical to deploy. By integrating multiple data modalities for actionable and equitable CVD risk prediction, this hybrid framework illustrates a direction for predictive analytics in healthcare and contributes to precision medicine.
Keywords: Electronic Health Records (EHR), Hybrid Transformer Model, Temporal Data Modeling, Attention Mechanisms, ICD
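The abstract reports Precision@5, AUC, and F1-score without stating how they were computed. The following is a minimal sketch of the standard definitions of these three metrics, assuming binary labels and one risk score per item; the function names, the 0.5 decision threshold, and the toy data are illustrative assumptions and are not taken from the study.

```python
def precision_at_k(scores, labels, k=5):
    """Fraction of positives among the k highest-scored items."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def f1_score(preds, labels):
    """Harmonic mean of precision and recall for binary predictions."""
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(scores, labels):
    """Probability that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): eight items with model scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

p_at_5 = precision_at_k(scores, labels, k=5)
f1 = f1_score(preds, labels)
roc_auc = auc(scores, labels)
print(p_at_5, f1, roc_auc)
```

On this toy data the three values come out to 0.6, about 0.667, and 0.8125. Precision@5 depends only on the top of the ranking, F1 depends on the chosen threshold, and AUC depends on the full ordering, which is why the three numbers are usually reported together.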
Kalyani Neve, Dr. Padma Mishra, Karishma Chaudhari, Dr I. D. Paul. Combining Unstructured and Structured Clinical Data in a Hybrid Transformer Model to Enhance Cardiovascular Analytics and Clinical Decision-Making. International Journal of Bioinformatics and Computational Biology. 2026; 04(01):-. Available from: https://journals.stmjournals.com/ijbcb/article=2026/view=236368
References
- Johnson, A. E., Pollard, T. J., Shen, L., et al. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035. DOI: 10.1038/sdata.2016.35.
- Huang, K., Altosaar, J., & Ranganath, R. (2019). ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. arXiv preprint arXiv:1904.05342.
- Rajkomar, A., Oren, E., Chen, K., et al. (2018). Scalable and accurate deep learning with electronic health records. npj Digital Medicine, 1(1), 18. DOI: 10.1038/s41746-018-0029-1.
- Baytas, I. M., Xiao, C., Zhang, X., Wang, F., et al. (2017). Patient Subtyping via Time-Aware LSTM Models. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
- Wang, S., et al. (2019). BiteNet: Bidirectional Encoder Network for Future Clinical Event Prediction. IEEE Transactions on Neural Networks and Learning Systems.
- Peng, J., et al. (2020). Transforming clinical event prediction with self-attention mechanism-based transformers. Journal of Biomedical Informatics.
- Zhang, Z., et al. (2021). Hybrid models for healthcare prediction: combining ICD codes with clinical narratives. Journal of Medical Systems. DOI: 10.1007/s10916-021-01630-x.
- Rajkomar, A., et al. (2018). Scalable and accurate deep learning with electronic health records. npj Digital Medicine. DOI: 10.1038/s41746-018-0029-1.
- Lipton, Z. C., Kale, D. C., Elkan, C., & Wetzel, R. (2016). Learning to diagnose with LSTM recurrent neural networks. arXiv preprint. DOI: 10.48550/arXiv.1511.03677.
- Choi, E., Bahadori, M. T., Schuetz, A., Stewart, W. F., & Sun, J. (2016). Doctor AI: Predicting clinical events via recurrent neural networks. Machine Learning for Healthcare Conference.
- Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems. DOI: 10.48550/arXiv.1706.03762.
- Miotto, R., Wang, F., Wang, S., Jiang, X., & Dudley, J. T. (2018). Deep learning for healthcare: Review, opportunities, and challenges. Briefings in Bioinformatics. DOI: 10.1093/bib/bbx044.
- Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. F., & van der Schaar, M. (2019). Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLOS ONE. DOI: 10.1371/journal.pone.0213653.
- Stevens, D., Lane, D. A., Harrison, S. L., Lip, G. Y. H., & Kolamunnage-Dona, R. (2021). Modelling of longitudinal data to predict cardiovascular disease risk: A methodological review. BMC Medical Research Methodology. DOI: 10.1186/s12874-021-01288-4.
- Vishnu Vardhana Reddy Karna, et al. (2024). A Comprehensive Review on Heart Disease Risk Prediction using Machine Learning and Deep Learning Algorithms. Archives of Computational Methods in Engineering.
- Lee, S., et al. (2021). Attention-based models for cardiovascular disease prediction: A hybrid approach. IEEE Journal of Biomedical and Health Informatics.
- Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation. DOI: 10.1162/neco.1997.9.8.1735.
- Esteban, C., et al. (2016). Predicting clinical events by combining static and dynamic data in LSTM models. PLOS ONE. DOI: 10.1371/journal.pone.0146251.
- Zhang, Y., et al. (2020). THIGE: Temporal Heterogeneous Interaction Graph Embedding for Health Data Analysis. Neural Networks and Applications.
- Cai, Y., et al. (2024). Artificial intelligence in the risk prediction models of cardiovascular disease and development of an independent validation screening tool: A systematic review. BMC Medicine. DOI: 10.1186/s12916-024-02835-z.
- Deepa, R., et al. (2024). Early prediction of cardiovascular disease using machine learning: Unveiling risk factors from health records. AIP Advances. DOI: 10.1063/1.5128374.
- Miotto, R., et al. (2016). Deep Patient: An unsupervised representation to predict the future of patients from electronic health records. Scientific Reports.
- Sudlow et al., “UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age,” PLoS Med., vol. 12, no. 3, p. e1001779, Mar. 2015. Available: https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access
- B. Kannel et al., “An Investigation of Coronary Heart Disease in Families: The Framingham Offspring Study,” Am. J. Epidemiol., vol. 110, no. 3, pp. 281–290, Sep. 1979. Available: https://biolincc.nhlbi.nih.gov/studies/framcohort/.
- Gerela, P., Mishra, P. N., & Vipat, R. (2022). Study on data visualization: It’s importance in education sector. International Journal of Health Sciences, 6(S3), 6298–6305. https://doi.org/10.53730/ijhs.v6nS3.7393
- Mishra, P. N., Gerala, P., & Maitra, S. (2022). Study on artificial intelligence applications uses in agriculture. International Journal of Health Sciences, 6(S2), 9162–9173. https://doi.org/10.53730/ijhs.v6nS2.7391
- Basavaraj, G. N., Ainapure, B., Sowmya, M. R., Sandeep, C., Mishra, P. N., Lakkimsetty, N. R., Dakulagi, V., & Shaik, F. (2025). Machine Learning-enhanced Direction-of-Arrival Estimation for Coherent and Non-Coherent Sources. Engineering, Technology & Applied Science Research, 15(2), 20647–20652. https://doi.org/10.48084/etasr.9494
- Mishra, P., Gaikwad, V., Dhawan, A., Bagul, R., Shaikh, A., & Singh, R. (2025). Precision agriculture meets AI: Predicting nutritional crop outcomes from genomic data. International Journal of Environmental Sciences, 11(14S), 207–218. https://www.theaspd.com/ijes.php
- Ahmad, E., Dash, B., Tripathi, A. et al. Hybrid CNN and image processing framework for precise characterization of cracks in concrete structures. Asian J Civ Eng (2025). https://doi.org/10.1007/s42107-025-01535-0
- Vinita Gaikwad, Anamika Dhawan, P Mishra, M. Kumarasamy, “AI-Enabled Early Detection of Fetal Gestational Age and CNS Anomalies in the First Trimester through Ultrasound to Support Rural Doctors in India,” SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 7, pp. 174-183, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I7P113
| Volume | 04 |
| Issue | 01 |
| Received | 29/12/2025 |
| Accepted | 29/01/2026 |
| Published | 29/01/2026 |
| Publication Time | 31 Days |
