Detecting Phishing Websites Using Hybrid Methodologies

Year: 2024 | Volume: 15 | Issue: 03 | Page: –

Gargi Deshpande, Shreyas Katkar, Tanvi Kangane, Ayush Kumar Giri, Diksha Kale

  1. Student, Department of Computer Engineering, Pillai College of Engineering, Dr. K. M. Vasudevan Pillai Campus, Plot No. 10, Sector 16, New Panvel East, Panvel, Navi Mumbai, Maharashtra, India
  2. Student, Department of Computer Engineering, Pillai College of Engineering, Dr. K. M. Vasudevan Pillai Campus, Plot No. 10, Sector 16, New Panvel East, Panvel, Navi Mumbai, Maharashtra, India
  3. Student, Department of Computer Engineering, Pillai College of Engineering, Dr. K. M. Vasudevan Pillai Campus, Plot No. 10, Sector 16, New Panvel East, Panvel, Navi Mumbai, Maharashtra, India
  4. Student, Department of Computer Engineering, Pillai College of Engineering, Dr. K. M. Vasudevan Pillai Campus, Plot No. 10, Sector 16, New Panvel East, Panvel, Navi Mumbai, Maharashtra, India
  5. Assistant Professor, Department of Computer Engineering, Pillai College of Engineering, Dr. K. M. Vasudevan Pillai Campus, Plot No. 10, Sector 16, New Panvel East, Panvel, Navi Mumbai, Maharashtra, India



In the digital era, theft of personal information has become a widespread and increasingly severe crime. Cybercriminals use deceptive strategies, and phishing websites are one of their main methods for stealing confidential data. These fake websites imitate legitimate ones and trick users into revealing sensitive personal and financial information, which has led to a rise in fraud cases. To address this escalating threat, this paper proposes a hybrid approach to phishing website detection. The approach preprocesses the datasets and analyzes attributes such as IP addresses, URL length, and web traffic statistics to differentiate phishing websites from genuine ones. Feature extraction is performed using deep learning methods such as Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU). A hybrid model that integrates machine learning with Transformer and GRU components is used, and it performs better than traditional methods such as Long Short-Term Memory (LSTM), Naïve Bayes, and Support Vector Machines (SVM). The study aims to identify phishing URLs and to determine the most effective machine learning method by comparing each algorithm’s accuracy, false positive rate, and false negative rate. The proposed method was evaluated on 156,422 malicious sites and 392,924 legitimate sites and was found to detect malicious URLs more effectively than other recent methods.
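As a rough illustration of two ideas named in the abstract, the Python sketch below extracts a few URL attributes (URL length and whether the host is a raw IP address) and computes accuracy, false positive rate, and false negative rate from predicted labels. The function names `lexical_features` and `detection_metrics`, the choice of attributes, and the toy labels are assumptions made for this example; they are not the authors' feature set, model, or results.

```python
import ipaddress
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict:
    """Extract a few simple URL attributes (hypothetical feature set)."""
    host = urlsplit(url).hostname or ""
    try:
        # A raw IP address in place of a domain name is a common phishing cue.
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False
    return {
        "url_length": len(url),
        "host_is_ip": host_is_ip,
        "num_dots_in_host": host.count("."),
    }


def detection_metrics(y_true, y_pred) -> dict:
    """Accuracy, FPR and FNR for binary labels (1 = phishing, 0 = legitimate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of legitimate sites wrongly flagged as phishing.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Share of phishing sites that were missed.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


if __name__ == "__main__":
    print(lexical_features("http://192.168.0.1/login"))
    # Toy labels for illustration only; these are not the paper's data.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(detection_metrics(y_true, y_pred))
```

Reporting the false positive and false negative rates alongside accuracy matters when the classes are imbalanced, as they are in the evaluation set described above, because accuracy alone can hide a high miss rate on the smaller class.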

Keywords: Phishing, Legitimate, CNN-GRU, LightGBM, Length Normalization, Uniform Encoding, Embedding Layer, Softmax

