Combining Unstructured and Structured Clinical Data in a Hybrid Transformer Model to Enhance Cardiovascular Analytics and Clinical Decision-Making

Notice

This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author’s request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.

Year: 2026 | Volume: 04 | Issue: 01 | Page:
    By

  • Kalyani Neve
  • Dr. Padma Mishra
  • Karishma Chaudhari
  • Dr I. D. Paul

  1. Assistant Professor, MCA Dept, G H Raisoni College of Engineering and Management, Jalgaon, India
  2. Associate Professor, Thakur Institute of Management Studies, Career Development & Research, India
  3. Assistant Professor, G H Raisoni College of Engineering and Management, Jalgaon, India
  4. Assistant Professor, Mechanical Engineering S.S.G.B, Bhusawal, Maharashtra, India

Abstract

Cardiovascular disease (CVD) remains a major global cause of morbidity and mortality, so early and accurate risk prediction is essential for prompt intervention and individualized treatment. This study introduces a hybrid transformer-based model that combines unstructured clinical narratives, structured data, and customized lifestyle characteristics. Using transformer architectures and natural language processing, the model captures contextual, temporal, and patient-specific insights, enabling a comprehensive understanding of disease progression. A dedicated explainability module supports clinical interpretability and transparency. Building on advances in time-aware LSTMs, hybrid modeling, self-attention transformers, and automated machine learning from previous works [1–20], our method combines insights from deep learning, statistical modeling, and graph-based temporal embeddings to improve prediction accuracy and clinical relevance. On benchmark datasets, the model outperformed both conventional and cutting-edge methods, obtaining a Precision@5 of 78.65%, an AUC of 0.86, and an F1-score of 0.76. Its strong performance across comorbid conditions and demographic groups demonstrates its generalizability and practicality. By integrating multiple data modalities for actionable and equitable CVD risk prediction, this hybrid framework advances precision medicine and illustrates a direction for predictive analytics in healthcare.

Keywords: Electronic Health Records (EHR), Hybrid Transformer Model, Temporal Data Modeling, Attention Mechanisms, ICD
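
For readers less familiar with the three metrics reported in the abstract, the following is a minimal sketch of how Precision@k, F1-score, and AUC are commonly computed for a binary outcome. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the abstract does not state how Precision@5 was defined in the study (for example, per patient or over a ranked cohort).

```python
from typing import Sequence


def precision_at_k(y_true: Sequence[int], scores: Sequence[float], k: int = 5) -> float:
    """Fraction of the k highest-scored items whose true label is 1."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / k


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 8 patients, 1 = CVD event, 0 = no event.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(precision_at_k(y_true, scores, k=5))  # 0.6
print(f1_score(y_true, y_pred))             # 0.75
print(auc(y_true, scores))                  # 0.6875
```

Precision@k and AUC depend only on how the scores rank patients, whereas the F1-score also depends on the chosen decision threshold, which is one reason the three are usually reported together.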

How to cite this article:
Kalyani Neve, Dr. Padma Mishra, Karishma Chaudhari, Dr I. D. Paul. Combining Unstructured and Structured Clinical Data in a Hybrid Transformer Model to Enhance Cardiovascular Analytics and Clinical Decision-Making. International Journal of Bioinformatics and Computational Biology. 2026; 04(01):-.
How to cite this URL:
Kalyani Neve, Dr. Padma Mishra, Karishma Chaudhari, Dr I. D. Paul. Combining Unstructured and Structured Clinical Data in a Hybrid Transformer Model to Enhance Cardiovascular Analytics and Clinical Decision-Making. International Journal of Bioinformatics and Computational Biology. 2026; 04(01):-. Available from: https://journals.stmjournals.com/ijbcb/article=2026/view=236368


References

  1. Johnson, A. E., Pollard, T. J., Shen, L., et al. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035. DOI: 10.1038/sdata.2016.35.
  2. Huang, K., Altosaar, J., & Ranganath, R. (2019). ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission. arXiv preprint arXiv:1904.05342.
  3. Rajkomar, A., Oren, E., Chen, K., et al. (2018). Scalable and accurate deep learning with electronic health records. npj Digital Medicine, 1(1), 18. DOI: 10.1038/s41746-018-0029-1.
  4. Baytas, I. M., Xiao, C., Zhang, X., Wang, F., et al. (2017). Patient Subtyping via Time-Aware LSTM Models. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
  5. Wang, S., et al. (2019). BiteNet: Bidirectional Encoder Network for Future Clinical Event Prediction. IEEE Transactions on Neural Networks and Learning Systems.
  6. Peng, J., et al. (2020). Transforming clinical event prediction with self-attention mechanism-based transformers. Journal of Biomedical Informatics.
  7. Zhang, Z., et al. (2021). Hybrid models for healthcare prediction: combining ICD codes with clinical narratives. Journal of Medical Systems. DOI: 10.1007/s10916-021-01630-x.
  8. Rajkomar, A., et al. (2018). Scalable and accurate deep learning with electronic health records. npj Digital Medicine. DOI: 10.1038/s41746-018-0029-1.
  9. Lipton, Z. C., Kale, D. C., Elkan, C., & Wetzel, R. (2016). Learning to diagnose with LSTM recurrent neural networks. arXiv preprint. DOI: 10.48550/arXiv.1511.03677.
  10. Choi, E., Bahadori, M. T., Schuetz, A., Stewart, W. F., & Sun, J. (2016). Doctor AI: Predicting clinical events via recurrent neural networks. Machine Learning for Healthcare Conference.
  11. Vaswani, A., Shazeer, N., Parmar, N., et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems. DOI: 10.48550/arXiv.1706.03762.
  12. Miotto, R., Wang, F., Wang, S., Jiang, X., & Dudley, J. T. (2018). Deep learning for healthcare: Review, opportunities, and challenges. Briefings in Bioinformatics. DOI: 10.1093/bib/bbx044.
  13. Alaa, A. M., Bolton, T., Di Angelantonio, E., Rudd, J. F., & van der Schaar, M. (2019). Cardiovascular disease risk prediction using automated machine learning: A prospective study of 423,604 UK Biobank participants. PLOS ONE. DOI: 10.1371/journal.pone.0213653.
  14. Stevens, D., Lane, D. A., Harrison, S. L., Lip, G. Y. H., & Kolamunnage-Dona, R. (2021). Modelling of longitudinal data to predict cardiovascular disease risk: A methodological review. BMC Medical Research Methodology. DOI: 10.1186/s12874-021-01288-4.
  15. Vishnu Vardhana Reddy Karna, et al. (2024). A Comprehensive Review on Heart Disease Risk Prediction using Machine Learning and Deep Learning Algorithms. Archives of Computational Methods in Engineering.
  16. Lee, S., et al. (2021). Attention-based models for cardiovascular disease prediction: A hybrid approach. IEEE Journal of Biomedical and Health Informatics.
  17. Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation. DOI: 10.1162/neco.1997.9.8.1735.
  18. Esteban, C., et al. (2016). Predicting clinical events by combining static and dynamic data in LSTM models. PLOS ONE. DOI: 10.1371/journal.pone.0146251.
  19. Zhang, Y., et al. (2020). THIGE: Temporal Heterogeneous Interaction Graph Embedding for Health Data Analysis. Neural Networks and Applications.
  20. Cai, Y., et al. (2024). Artificial intelligence in the risk prediction models of cardiovascular disease and development of an independent validation screening tool: A systematic review. BMC Medicine. DOI: 10.1186/s12916-024-02835-z.
  21. Deepa, R., et al. (2024). Early prediction of cardiovascular disease using machine learning: Unveiling risk factors from health records. AIP Advances. DOI: 10.1063/1.5128374.
  22. Miotto, R., et al. (2016). Deep Patient: An unsupervised representation to predict the future of patients from electronic health records. Scientific Reports.
  23. Sudlow et al., “UK Biobank: An Open Access Resource for Identifying the Causes of a Wide Range of Complex Diseases of Middle and Old Age,” PLoS Med., vol. 12, no. 3, p. e1001779, Mar. 2015. Available: https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access
  24. W. B. Kannel et al., “An Investigation of Coronary Heart Disease in Families: The Framingham Offspring Study,” Am. J. Epidemiol., vol. 110, no. 3, pp. 281–290, Sep. 1979. Available: https://biolincc.nhlbi.nih.gov/studies/framcohort/
  25. Gerela, P., Mishra, P. N., & Vipat, R. (2022). Study on data visualization: It’s importance in education sector. International Journal of Health Sciences, 6(S3), 6298–6305. https://doi.org/10.53730/ijhs.v6nS3.7393
  26. Mishra, P. N., Gerala, P., & Maitra, S. (2022). Study on artificial intelligence applications uses in agriculture. International Journal of Health Sciences, 6(S2), 9162–9173. https://doi.org/10.53730/ijhs.v6nS2.7391
  27. Basavaraj, G. N., Ainapure, B., Sowmya, M. R., Sandeep, C., Mishra, P. N., Lakkimsetty, N. R., Dakulagi, V., & Shaik, F. (2025). Machine Learning-enhanced Direction-of-Arrival Estimation for Coherent and Non-Coherent Sources. Engineering, Technology & Applied Science Research, 15(2), 20647–20652. https://doi.org/10.48084/etasr.9494
  28. Mishra, P., Gaikwad, V., Dhawan, A., Bagul, R., Shaikh, A., & Singh, R. (2025). Precision agriculture meets AI: Predicting nutritional crop outcomes from genomic data. International Journal of Environmental Sciences, 11(14S), 207–218. https://www.theaspd.com/ijes.php
  29. Ahmad, E., Dash, B., Tripathi, A. et al. Hybrid CNN and image processing framework for precise characterization of cracks in concrete structures. Asian J Civ Eng (2025). https://doi.org/10.1007/s42107-025-01535-0
  30. Vinita Gaikwad, Anamika Dhawan, P Mishra, M. Kumarasamy, “AI-Enabled Early Detection of Fetal Gestational Age and CNS Anomalies in the First Trimester through Ultrasound to Support Rural Doctors in India,” SSRG International Journal of Electronics and Communication Engineering, vol. 12, no. 7, pp. 174–183, 2025. Crossref, https://doi.org/10.14445/23488549/IJECE-V12I7P113

Ahead of Print Subscription Review Article
Volume 04, Issue 01
Received 29/12/2025
Accepted 29/01/2026
Published 29/01/2026
Publication Time 31 Days

