Implementing Machine Learning in Data Classification

Year : 2025 | Volume : 03 | Issue : 02 | Page : 15-22
By Monalisa Hati, Khurram Rashid

  1. Assistant Professor, Department of Computer Science and Engineering, Amity School of Engineering and Technology, Amity University Mumbai, Maharashtra, India
  2. Student, Department of Computer Science and Engineering, Amity School of Engineering and Technology, Amity University Mumbai, Maharashtra, India

Abstract

Data classification is an essential part of artificial intelligence (AI) and soft computing: it turns raw data into knowledge that underpins applications such as fraud detection, medical diagnostics, and natural language processing. This study discusses the challenges and the state of the art in data classification with respect to scalability, noise handling, and feature selection optimization. It reviews classical classification methods, such as decision trees, support vector machines (SVM), and ensemble learning algorithms, alongside modern deep learning architectures such as convolutional neural networks (CNN) and recurrent neural networks (RNN). Soft computing methods, such as fuzzy logic and genetic algorithms, are also reviewed to assess how they can improve performance on noisy, incomplete, or high-dimensional data. The study describes how AI and soft computing can be merged in hybrid models that combine neural networks with fuzzy systems to improve classification accuracy and interpretability. The methodology covers the most popular frameworks used for model development (TensorFlow, PyTorch, and MATLAB) along with hyperparameter tuning strategies such as grid search, random search, and Bayesian optimization. Evaluation metrics such as accuracy, precision, recall, F1 score, and AUC-ROC are applied to use cases including facial recognition, medical imaging, and financial fraud detection to demonstrate the effect of the proposed techniques. In the results, hybrid methods outperformed conventional models on noisy and complex datasets and gave the models greater breadth and adaptability. The case studies support these improvements in classification accuracy and robustness. Future work includes automating feature selection, exploring additional hybrid approaches, and addressing ethical issues such as fairness and transparency in classification systems.
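
The evaluation metrics named in the abstract can be illustrated with a minimal sketch for the binary case. This is not the authors' implementation: the function names, the label convention (1 = positive class, 0 = negative class), and the toy labels and scores are assumptions made only for illustration. AUC-ROC is computed here in its pairwise form, as the fraction of positive/negative pairs in which the positive example receives the higher score (ties count half).

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 score from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]
    print(classification_metrics(y_true, y_pred))
    print(auc_roc(y_true, scores))
```

The pairwise AUC is quadratic in the number of examples, which is fine for a sketch; on real datasets a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.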

Keywords: TensorFlow, F1 score, SVM, CNN, RNN

[This article belongs to International Journal of Data Structure Studies]

How to cite this article:
Monalisa Hati, Khurram Rashid. Implementing Machine Learning in Data Classification. International Journal of Data Structure Studies. 2025; 03(02):15-22.
How to cite this URL:
Monalisa Hati, Khurram Rashid. Implementing Machine Learning in Data Classification. International Journal of Data Structure Studies. 2025; 03(02):15-22. Available from: https://journals.stmjournals.com/ijdss/article=2025/view=232908


Regular Issue | Subscription | Original Research
Volume : 03
Issue : 02
Received : 04/04/2025
Accepted : 08/04/2025
Published : 21/05/2025
Publication Time : 47 Days