Web Security in the Digital Age: Artificial Intelligence Solution for Malicious Website Classification

Sujatha Krishna (College of Computing and Information Sciences, University of Technology and Applied Sciences, Oman), Rajesh Natarajan (College of Computing and Information Sciences, University of Technology and Applied Sciences, Oman), Francesco Flammini (University of Applied Sciences and Arts of Southern Switzerland, Switzerland), Badria Sulaiman Alfurhood (College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Saudi Arabia), V. Janhavi (Vidyavardhaka College of Engineering, India), and Shashi Kant Gupta (Institute of Engineering and Technology, Chitkara University, India)
Copyright: © 2025 | Pages: 25
DOI: 10.4018/IJSWIS.369823

Abstract

The hazards associated with cybersecurity attacks have grown with our increasing reliance on digital technologies. To overcome the limitations of existing detection methods, we propose an ensemble strategy that employs multiple machine learning models. The proposed approach detects an additional 141 dangerous URLs, outperforming the single model currently in use by 6%. For this investigation, we gathered a collection of malicious and benign URLs. Principal Component Analysis (PCA) is applied for feature extraction. The most useful features are then selected and classified using the Social Spider optimised Densely-connected Convolutional Caps net (SSO-DCN). The proposed technology incorporates the most effective detection frameworks and automated methods for identifying rogue web pages, including those that distribute malware. Compared with existing approaches, the proposed method achieves a mean accuracy of 96.30%, precision of 96.90%, F1-measure of 97.20%, recall of 96.90%, FPR of 3.50%, and FNR of 3.90%.

Introduction

Because the World Wide Web is the foundation for global interaction, trade, and data sharing in the quickly changing digital age, strong online security has become essential. The demand for sophisticated and adaptable security solutions grows as the frequency of online attacks rises (Narsimha et al., 2022). The detection and categorization of hazardous web pages that endanger consumers is one of the major issues in the field. The growing interplay between consumer applications and vital communications in industries such as communication, banking, education, and the military requires further attention in light of today's sophisticated information technology (IT; Schmitt, 2023). Attackers and cybercriminals have more avenues of access in several critical infrastructure sectors because of the virtualization of information assets. Numerous strategies are used in internet crimes, such as systems splitting, phishing, domain name system (DNS) poisoning, online fraud, malicious program attacks, spam, scams, and blackmail. On a phishing site, the attacker frequently imitates a legitimate URL in an attempt to gain access to the targeted system (Alawadhi et al., 2022). Phishing, spam, drive-by downloads, clickjacking, attacks that rely on plugins or scripts, and malicious advertising are examples of hostile attacks online. Phishing attempts typically take the form of false communications that appear to originate from a reliable source (Utku & Can, 2022). In a continually evolving internet, it is difficult to cover every threat. To overcome the drawbacks of the URL blacklist technique, cybersecurity experts have proposed using machine learning (ML), in the form of a classification model, to identify harmful URLs (Ha et al., 2023). Discriminating characteristics or rules form the foundation of this approach. By identifying such characteristics, the method enables the machine to learn how to differentiate between harmful and benign URLs (Zhang et al., 2022).
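To make the idea of discriminating characteristics concrete, the following is a minimal sketch of turning a URL into a small set of lexical features that a classifier could learn from. The function name `url_features` and the six features chosen here are illustrative assumptions; they are not the feature set used in this article.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few hypothetical lexical features from a URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Total number of characters in the URL.
        "url_length": len(url),
        # Number of dots in the host name (a rough proxy for subdomain depth).
        "num_dots": host.count("."),
        # Number of digit characters anywhere in the URL.
        "num_digits": sum(ch.isdigit() for ch in url),
        # Whether an '@' character appears anywhere in the URL.
        "has_at_symbol": "@" in url,
        # Whether the host looks like a dotted IPv4 address rather than a name.
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        # Whether the scheme is HTTPS.
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in ("https://example.com/", "http://192.168.0.1/login@x"):
        print(example, url_features(example))
```

Feature dictionaries of this kind would typically be converted into numeric vectors before being passed to a classification model; which features are actually informative depends on the dataset.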
ML relies heavily on discriminative rules or feature selection to find attributes that help identify harmful websites. In current research, an ML model is ordinarily used with features developed from the URL or the site's content. The main focus of this investigation is phishing site detection (Saxena et al., 2022). ML models perform differently depending on the feature selection methods used. Drawing on the gathered dataset and the feature analysis employed in earlier research, we propose an enhanced attribute set to improve web detection performance (Aslam et al., 2022). Recognizing websites that use malicious code, rather than only detecting malicious websites, helps to anticipate cyber threats in advance (Chakraborty et al., 2023). To address these challenges, recent studies have explored advanced techniques for enhancing URL classification. For instance, deep learning (DL) approaches, including neural networks, are increasingly employed to improve the precision of harmful website detection (Zhang & Yan, 2024). Additionally, combining multiple ML models or using ensemble techniques has shown promise in reducing errors and increasing reliability. These methods focus on refining the identification process by leveraging new algorithms that can adapt to evolving threats and offer more efficient solutions for real-time web security (Tran & Sovilj, 2024). The main goal of this research is to improve online security by creating and assessing a sophisticated ML-based method for classifying dangerous URLs. This involves an ensemble strategy that integrates multiple ML models to accurately identify and classify hazardous websites, including phishing, defacement, malware, and other malicious types. This research applies the social spider optimized densely connected convolutional caps net (SSO-DCN) model for improved detection performance.
The ultimate objective is to reduce false positive rates (FPRs) while increasing the detection accuracy, precision, recall, and F1 score of malicious URL identification.
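The evaluation measures named above can all be derived from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the article's experiments. It assumes every denominator is non-zero.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts.

    tp: malicious URLs correctly flagged as malicious
    fp: benign URLs wrongly flagged as malicious
    tn: benign URLs correctly passed as benign
    fn: malicious URLs wrongly passed as benign
    Assumes every denominator below is non-zero.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
        # False positive rate: share of benign URLs wrongly flagged.
        "fpr": fp / (fp + tn),
        # False negative rate: share of malicious URLs missed (1 - recall).
        "fnr": fn / (fn + tp),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

Reducing the FPR means fewer benign sites are blocked by mistake, while a lower FNR means fewer malicious sites slip through; the two often trade off against each other when a decision threshold is adjusted.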
