This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author’s request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.
Dhananjay M Kanade, Shirish S. Sane
- Research Scholar, Department of Computer Engineering, K. K. Wagh Institute of Engineering Education and Research, Nashik, Savitribai Phule Pune University, Pune, Maharashtra, India
- Professor and Research Guide, Department of Computer Engineering, Gokhale Education Society’s R H Sapat College of Engineering, Management Studies and Research, Nashik, Maharashtra, India
Abstract
The rapid growth and adoption of modern database systems have created opportunities for researchers, industries, and organizations to extract meaningful knowledge and make data-driven decisions. While this progress has enabled the discovery of valuable patterns and trends, it has also made safeguarding individual privacy harder. Removing direct identifiers such as names, social security numbers, or Aadhaar numbers is no longer sufficient, because adversaries can often combine quasi-identifiers such as gender, date of birth, and postal code to re-identify individuals with high accuracy. To address these vulnerabilities, the field of Privacy-Preserving Data Publishing (PPDP) has emerged, offering techniques that attempt to balance data utility against strong privacy guarantees. This paper examines prominent PPDP models, including k-anonymity, ℓ-diversity, and t-closeness, and reviews newer strategies such as β-likeness and disassociation. Each method’s strengths, weaknesses, and real-world applicability are critically assessed. The paper also highlights the inherent trade-offs between data protection and usability, underlining the need for adaptive, efficient, and context-aware solutions for secure data sharing.
Keywords: Data Anonymization, Data Utility, Privacy-Preserving Data Publishing (PPDP), General Data Protection Regulation (GDPR).
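As a minimal illustration of two of the models named in the abstract, the sketch below measures k-anonymity and distinct ℓ-diversity on a toy table. It is not the authors' method. The records, column names, generalized values, and helper functions are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return groups

def k_anonymity(records, quasi_identifiers):
    """Largest k for which the table is k-anonymous (smallest class size)."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in groups.values())

def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values in any class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len({r[sensitive] for r in group}) for group in groups.values())

# Hypothetical, already-generalized toy data (assumption, not from the paper).
records = [
    {"gender": "F", "birth_year": "198*", "postal": "4220**", "diagnosis": "flu"},
    {"gender": "F", "birth_year": "198*", "postal": "4220**", "diagnosis": "asthma"},
    {"gender": "M", "birth_year": "199*", "postal": "4110**", "diagnosis": "flu"},
    {"gender": "M", "birth_year": "199*", "postal": "4110**", "diagnosis": "flu"},
]
QI = ["gender", "birth_year", "postal"]

k = k_anonymity(records, QI)
l = distinct_l_diversity(records, QI, "diagnosis")
print(f"k = {k}, distinct l = {l}")
```

In this toy table every quasi-identifier combination occurs twice, so the data is 2-anonymous, yet the second group carries a single diagnosis value. An adversary who places a person in that group learns the diagnosis anyway, which is the kind of gap ℓ-diversity and t-closeness are meant to close.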
Dhananjay M Kanade, Shirish S. Sane. An Effective Privacy Preservation Technique for Enhancing Data Usability. Recent Trends in Parallel Computing. 2025; 12(03):-. Available from: https://journals.stmjournals.com/rtpc/article=2025/view=232411

Recent Trends in Parallel Computing
| Volume | 12 |
| Issue | 03 |
| Received | 12/05/2025 |
| Accepted | 10/09/2025 |
| Published | 17/11/2025 |
| Publication Time | 189 Days |