Introduction
The World Wide Web is the foundation for global interaction, trade, and data sharing in a rapidly changing digital age, so strong online security has become essential. The demand for sophisticated and adaptable security solutions grows as the frequency of online attacks rises (Narsimha et al., 2022). Detecting and categorizing hazardous web pages that endanger users is one of the major challenges in the field. The growing interplay between consumer applications and vital communications in industries such as communication, banking, education, and the military requires further attention in light of today's sophisticated information technology (IT; Schmitt, 2023). Because information assets have been virtualized, attackers and cybercriminals have more avenues of access in several critical infrastructure sectors. Internet crimes use numerous strategies, such as systems splitting, phishing, domain name system poisoning, online fraud, malicious program attacks, spam, scams, and blackmail. On a phishing site, the attacker frequently disguises a malicious URL as a legitimate one in an attempt to gain access to the targeted system (Alawadhi et al., 2022). Phishing, spam, drive-by download, clickjacking, attacks that rely on plugins or scripts, and advertising-based attacks are examples of hostile activity online. Phishing attempts typically take the form of false communications that appear to originate from a reliable source (Utku & Can, 2022). Because the internet evolves continually, it is difficult for a fixed list of known threats to cover every case. To overcome the drawbacks of the URL blacklist technique, cybersecurity experts have proposed using machine learning (ML), in the form of a classification model, for harmful URL identification (Ha et al., 2023). Discriminating characteristics or rules form the foundation of this approach. By identifying such characteristics, the method enables the machine to learn how to differentiate between harmful and benign URLs (Zhang et al., 2022).
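As an illustration of what "discriminating characteristics" of a URL can look like, the following is a minimal sketch of lexical feature extraction using only the Python standard library. The function name `extract_url_features` and the specific features chosen are assumptions made for illustration; they are not the feature set used in this study.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple lexical features of a URL.

    The feature set is hypothetical and only illustrates the idea of
    turning a URL into attributes that a classifier could learn from.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when no network location is present
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # A raw IPv4 address in place of a domain name (format check only)
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
    }


if __name__ == "__main__":
    for example in ("https://example.com/", "http://192.168.0.1/login"):
        print(example, extract_url_features(example))
```

Each dictionary can then be converted into a numeric vector and passed to any classification model; which features are actually informative depends on the dataset and the feature selection method.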
In this procedure, ML relies heavily on discriminative rules or feature selection to find attributes that help identify harmful websites. In current research, an ML model is typically trained on features developed from the URL or from the site's content. The main focus of this research is phishing site detection (Saxena et al., 2022). ML models perform differently depending on the feature selection methods used. Drawing on the gathered dataset and the feature analysis employed in earlier research, we propose an enhanced attribute set to improve web detection performance (Aslam et al., 2022). Rather than only detecting malicious websites, recognizing websites that use malicious code helps to anticipate cyber threats in advance (Chakraborty et al., 2023). To address these challenges, recent studies have explored advanced techniques for enhancing URL classification. For instance, deep learning (DL) approaches, including neural networks, are increasingly employed to improve the precision of harmful website detection (Zhang & Yan, 2024). Additionally, combining multiple ML models through ensemble techniques has shown promise in reducing errors and increasing reliability. These methods refine the identification process by leveraging new algorithms that can adapt to evolving threats and offer more efficient solutions for real-time web security (Tran & Sovilj, 2024). The main goal of this research is to improve online security by creating and assessing a sophisticated ML-based method for classifying dangerous URLs. This involves an ensemble strategy that integrates multiple ML models to accurately identify and classify hazardous websites, including phishing, defacement, malware, and other malicious types. This research applies the social spider optimized densely connected convolutional capsule network (SSO-DCN) model for improved detection performance.
The ultimate objective is to reduce false positive rates (FPRs) while increasing the detection accuracy, precision, recall, and F1 score of malicious URL identification.
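The evaluation measures named above follow directly from confusion-matrix counts. The sketch below computes them for the binary case (malicious versus benign), which is a simplification of the multi-class setting described in this article; the function name `classification_metrics` and its parameter names are assumptions made for illustration.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts.

    tp: malicious URLs correctly flagged     fp: benign URLs wrongly flagged
    tn: benign URLs correctly passed         fn: malicious URLs missed
    A ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # false positive rate
    }


if __name__ == "__main__":
    # Hypothetical counts, not results from this study
    print(classification_metrics(tp=40, fp=10, tn=45, fn=5))
```

A lower `fpr` means fewer benign URLs are blocked by mistake, while higher `recall` means fewer malicious URLs slip through; the two often have to be traded off when choosing a decision threshold.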