Predicting Dielectric Constants of Polymers Using Molecular Structural Descriptors and Explainable Machine Learning: A Data-Driven Approach

Notice

This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author’s request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.

Year: 2026 | Volume: 14 | Issue: 03 | Page:
By

  • C Sridhathan
  • R Arangasamy
  • Geetha Prahalad
  • Mohandass G
  • NMG Kumar
  • M Ram Prasad Reddy

  1. Professor, Department of Electronics and Communication Engineering, KCG College of Technology, Karapakkam, Chennai, Tamil Nadu, India
  2. Professor, Department of Electrical and Electronics Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
  3. Professor, Department of Electronics and Communication Engineering, School of Engineering, Mohan Babu University, Tirupati, Andhra Pradesh, India
  4. Professor, Department of Biomedical Engineering, Saveetha School of Engineering, Saveetha Institute of Medical and Technical Sciences, Saveetha University, Chennai, Tamil Nadu, India
  5. Professor, Department of Electronics and Communication Engineering, School of Engineering, Mohan Babu University, Tirupati, Andhra Pradesh, India
  6. Professor, Department of Electrical and Electronics Engineering, Aditya College of Engineering, Madanapalle, Andhra Pradesh, India

Abstract

Accurate prediction of the dielectric constant of polymeric materials is fundamental to the rational design of advanced electronic components, energy storage capacitors, flexible substrates, and high-frequency communication circuits. Conventional approaches to identifying suitable polymer dielectrics rely on extensive experimental synthesis and characterisation, which are both time-consuming and resource-intensive. In this work, an explainable machine learning framework is developed to predict the dielectric constant of polymers directly from molecular structural descriptors derived from first-principles calculations. The study uses only the Khazana Polymer Dataset, a publicly available, CC0-licensed repository of 1,073 polymers with density functional theory (DFT)-computed dielectric constants, band gaps, and atomisation energies, to construct and validate the predictive model. Eight structural descriptors (band gap, atomisation energy, unit cell volume, atom count per repeat unit, electronegativity variance, oxygen-to-carbon ratio, halogen presence, and a metal-element flag) are extracted from CIF crystal structure files using the pymatgen library. A gradient boosting regression (GBR) model, optimised via five-fold cross-validated grid search, achieves R² = 0.931, RMSE = 0.43, and MAPE = 4.7% on the held-out test set, outperforming random forest, support vector regression, and linear regression baselines on all three metrics. SHAP (SHapley Additive exPlanations) feature attribution identifies band gap and atomisation energy as the dominant predictors of dielectric behaviour, yielding physically interpretable and chemically actionable structure–property design rules. The proposed framework offers a scalable, fully reproducible pathway for large-scale polymer dielectric screening and data-driven inverse design.
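The modelling step summarised in the abstract (a gradient boosting regressor tuned by five-fold cross-validated grid search and scored with R², RMSE, and MAPE on a held-out test set) can be sketched with scikit-learn as follows. This is a minimal illustration and not the authors' code. The data are synthetic placeholders rather than Khazana records, and the hyperparameter grid, the 80/20 train/test split, the CV scoring metric, and the helper names `make_synthetic_data` and `fit_and_score` are all assumptions introduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import (
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split

# The eight descriptors named in the abstract (column order is an assumption).
FEATURES = [
    "band_gap", "atomisation_energy", "cell_volume", "atoms_per_repeat_unit",
    "electronegativity_variance", "o_to_c_ratio", "has_halogen", "has_metal",
]


def make_synthetic_data(n_samples=300, seed=0):
    """Return placeholder (X, y) arrays. These are NOT Khazana data."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_samples, len(FEATURES)))
    X[:, 0] = rng.uniform(1.0, 9.0, size=n_samples)  # toy band gap values
    # Toy target: falls as the band gap grows, plus a small noise term.
    noise = rng.normal(0.0, 0.1, size=n_samples)
    y = 2.0 + 8.0 / X[:, 0] + 0.5 * X[:, 5] + noise
    return X, y


def fit_and_score(X, y, seed=0):
    """Tune a GBR with 5-fold grid search and score it on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed  # split ratio is assumed
    )
    param_grid = {  # hypothetical grid; the abstract does not list one
        "n_estimators": [100, 200],
        "max_depth": [2, 3],
        "learning_rate": [0.05, 0.1],
    }
    search = GridSearchCV(
        GradientBoostingRegressor(random_state=seed),
        param_grid,
        cv=5,
        scoring="neg_root_mean_squared_error",  # assumed selection metric
    )
    search.fit(X_train, y_train)
    y_pred = search.predict(X_test)
    metrics = {
        "r2": r2_score(y_test, y_pred),
        "rmse": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        "mape_percent": 100.0 * mean_absolute_percentage_error(y_test, y_pred),
    }
    return search.best_estimator_, metrics


if __name__ == "__main__":
    X, y = make_synthetic_data()
    model, metrics = fit_and_score(X, y)
    print(metrics)
    # Impurity-based importances, used here as a simple stand-in for SHAP.
    for name, importance in zip(FEATURES, model.feature_importances_):
        print(f"{name}: {importance:.3f}")
```

The printed importances are scikit-learn's built-in impurity-based values, not SHAP attributions. With the `shap` package installed, `shap.TreeExplainer(model).shap_values(X)` would be the likely route to the per-feature attributions the abstract describes.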

Keywords: Polymer dielectrics; Dielectric constant prediction; Molecular structural descriptors; Explainable machine learning; Gradient boosting regression; SHAP; Khazana dataset; Structure–property relationships; Band gap; Polymer informatics

How to cite this article:
C Sridhathan, R Arangasamy, Geetha Prahalad, Mohandass G, NMG Kumar, M Ram Prasad Reddy. Predicting Dielectric Constants of Polymers Using Molecular Structural Descriptors and Explainable Machine Learning: A Data-Driven Approach. Journal of Polymer & Composites. 2026; 14(03):-.
How to cite this URL:
C Sridhathan, R Arangasamy, Geetha Prahalad, Mohandass G, NMG Kumar, M Ram Prasad Reddy. Predicting Dielectric Constants of Polymers Using Molecular Structural Descriptors and Explainable Machine Learning: A Data-Driven Approach. Journal of Polymer & Composites. 2026; 14(03):-. Available from: https://journals.stmjournals.com/jopc/article=2026/view=243072




Ahead of Print | Subscription | Original Research
Volume 14, Issue 03
Received: 20/03/2026
Accepted: 02/05/2026
Published: 06/05/2026
Publication Time: 47 Days

