Evaluation of Ensemble and Deep Learning Classifiers on CSE-CIC-IDS2018 Dataset for Intelligent NIDS

[{“box”:0,”content”:”


By Kaushik Datta

April 26, 2023 at 12:39 pm

Abstract

A Network Intrusion Detection System (NIDS) plays an active role in preventing cyber attacks by detecting threats early, before they start affecting the targeted information services. Over the years, many intrusion detection systems (IDS) have been developed using signature-based or rule-based approaches to prevent unauthorised access to networks or computer devices. However, the ever-growing landscape of cyber attacks in recent years has motivated present-day researchers to design and develop more accurate IDS using modern Machine Learning (ML) methods, which identify attacks through anomaly detection. The development of an intelligent NIDS depends heavily on a rich, up-to-date dataset that consists of relevant attributes and real-world cyber attack scenarios. A variety of datasets are available for this purpose, of which KDDCUP99, NSLKDD, ISCX2012, CICIDS2017, CICIDS2018 and Kyoto are among the most popular and widely used. This paper reports our observations on the performance of two well-known Ensemble Learning classifiers, namely Random Forest and XGBoost, and of a Deep Neural Network classifier on the CSE-CIC-IDS2018 dataset, which is relatively new and covers many contemporary cyber attacks. Their performance is evaluated using multiple metrics, including the Precision-Recall curve, which has been shown to be more useful for imbalanced datasets such as CSE-CIC-IDS2018.
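The remark about imbalanced data can be illustrated with a small sketch. The labels and scores below are made up for illustration and are not taken from CSE-CIC-IDS2018 or from the paper's experiments, and the helper names `accuracy`, `precision_recall_points` and `average_precision` are hypothetical. Assuming 5 attack flows among 100, the sketch shows that a detector labelling everything as benign still reaches high accuracy, and how Precision-Recall points can be obtained by sweeping a threshold over classifier scores.

```python
from typing import List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision_recall_points(
    y_true: List[int], scores: List[float]
) -> List[Tuple[float, float]]:
    """Return (precision, recall) pairs, one per distinct score threshold,
    ordered from the highest threshold to the lowest."""
    total_pos = sum(y_true)  # number of attack samples (label 1)
    points = []
    for thr in sorted(set(scores), reverse=True):
        # A sample is flagged as an attack when its score reaches the threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((tp / (tp + fp), tp / total_pos))
    return points


def average_precision(points: List[Tuple[float, float]]) -> float:
    """Step-wise area under the PR curve: precision weighted by recall gain."""
    ap, prev_recall = 0.0, 0.0
    for precision, recall in points:
        ap += precision * (recall - prev_recall)
        prev_recall = recall
    return ap


# Illustrative, made-up data: 5 attacks (label 1) and 95 benign flows (label 0).
y_true = [1] * 5 + [0] * 95
# Hypothetical classifier scores: most attacks score high, two benign flows too.
scores = [0.9, 0.9, 0.7, 0.7, 0.2] + [0.7, 0.7] + [0.1] * 93

# A trivial detector that never raises an alert.
baseline_accuracy = accuracy(y_true, [0] * len(y_true))
baseline_recall = 0 / sum(y_true)  # it finds none of the attacks

points = precision_recall_points(y_true, scores)
ap = average_precision(points)

print("all-benign accuracy:", baseline_accuracy)
print("all-benign recall:", baseline_recall)
print("PR points:", points)
print("average precision:", round(ap, 4))
```

Accuracy barely separates the trivial detector from a useful one here, whereas precision and recall only count how the attack class is handled, so they expose the difference directly.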


Volume: 13 | Issue: 1 | Received: February 28, 2023 | Accepted: April 6, 2023 | Published: April 26, 2023

This article belongs to Current Trends in Information Technology (ctit).

Keywords: Network Intrusion Detection System, CSE-CIC-IDS2018 dataset, Ensemble Learning, Multilayer Perceptron, Random Forest, XGBoost, Deep Neural Network.



Regular Issue Subscription Article

Current Trends in Information Technology

ISSN: 2249-4707

Editors Overview

ctit maintains an Editorial Board of practicing researchers from around the world, to ensure manuscripts are handled by editors who are experts in the field of study.

“},{“box”:4,”content”:””},{“box”:1,”content”:”

Kaushik Datta, Variable Energy Cyclotron Centre, Department of Atomic Energy, Government of India


Full Text

https://storage.googleapis.com/journals-stmjournals-com-wp-media-to-gcp-offload/2023/04/09e3df42-evaluation-of-ensemble-and-deep-learning-classifiers-on-cse-cic-ids2018-dataset-for-int.pdf

”},{“box”:2,”content”:”

”},{“box”:6,”content”:”“}]