Evaluation of Ensemble and Deep Learning Classifiers on CSE-CIC-IDS2018 Dataset for Intelligent NIDS











Kaushik Datta

April 26, 2023




A Network Intrusion Detection System (NIDS) plays an active role in preventing cyber attacks by detecting threats early, before they start affecting the targeted information services. Over the years, many intrusion detection systems (IDS) have been developed using signature-based or rule-based approaches to prevent unauthorised access to networks or computer devices. However, the ever-growing landscape of cyber attacks in recent years has motivated researchers to design and develop more accurate IDS using modern Machine Learning (ML) methods that identify attacks through anomaly detection. The development of an intelligent NIDS depends heavily on a rich, up-to-date dataset that contains relevant attributes and real-world cyber attack scenarios. A variety of datasets are available for this purpose; KDDCUP99, NSLKDD, ISCX2012, CICIDS2017, CICIDS2018 and Kyoto are among the most popular and widely used. This paper reports our observations on the performance of two well-known Ensemble Learning classifiers, namely Random Forest and XGBoost, and of a Deep Neural Network classifier on the CSE-CIC-IDS2018 dataset, which is relatively new and covers many contemporary cyber attacks. Their performance is evaluated using multiple metrics, including the Precision-Recall curve, which has been shown to be more useful for imbalanced datasets such as CSE-CIC-IDS2018.
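The remark about Precision-Recall curves on imbalanced data can be illustrated with a minimal, standard-library sketch. It is not the paper's evaluation code: the toy labels, the scores, and the helper names `precision_recall_points`, `average_precision` and `accuracy` are all assumptions made for illustration, and the data are not drawn from CSE-CIC-IDS2018.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    y_true: Sequence[int], scores: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) per distinct score, highest first.

    A sample is predicted as attack (1) when its score >= threshold.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive (attack) sample")
    # Sort by descending score so thresholds sweep from strict to lax.
    ranked = sorted(zip(scores, y_true), key=lambda p: p[0], reverse=True)
    points = []
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((threshold, tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points: List[Tuple[float, float, float]]) -> float:
    """Step-wise area under the precision-recall curve."""
    ap, prev_recall = 0.0, 0.0
    for _, precision, recall in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 95 benign flows (0) and 5 attack flows (1).
y_true = [0] * 95 + [1] * 5

# A trivial model that labels every flow benign still scores high accuracy,
# yet it detects no attacks at all (recall on the attack class is zero).
all_benign = [0] * 100
trivial_accuracy = accuracy(y_true, all_benign)
trivial_recall = sum(t == 1 and p == 1 for t, p in zip(y_true, all_benign)) / 5

# Hypothetical scores from a better model: most attacks are ranked on top.
scores = [0.1] * 90 + [0.6] * 5 + [0.9, 0.9, 0.9, 0.9, 0.5]
points = precision_recall_points(y_true, scores)
ap = average_precision(points)

print(f"trivial accuracy={trivial_accuracy:.2f}, trivial recall={trivial_recall:.2f}")
for threshold, precision, recall in points:
    print(f"threshold={threshold:.1f} precision={precision:.3f} recall={recall:.3f}")
print(f"average precision={ap:.3f}")
```

On this toy set the all-benign model reaches 95% accuracy while detecting none of the attacks, which is why accuracy alone can mislead on imbalanced traffic. The precision-recall points expose the trade-off on the minority (attack) class directly.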





Volume : 13 | Issue : 1 | Received : February 28, 2023 | Accepted : April 6, 2023 | Published : April 26, 2023
Keywords: Network Intrusion Detection System, CSE-CIC-IDS2018 dataset, Ensemble Learning, Multilayer Perceptron, Random Forest, XGBoost, Deep Neural Network.









Current Trends in Information Technology


ISSN: 2249-4707





















