The researchers [52] proposed a data mining and static analysis approach for eliminating XSS vulnerabilities. The approach seeks to discover and remove harmful links from the source code, and it outperforms the upgraded n-gram model. After discussing the subclasses of XSS attacks, the paper briefly addresses the risks and concerns posed by XSS.

However, this approach cannot adequately defend against mutation-based (mXSS) and DOM-based XSS attacks.

The authors [53] proposed combining machine-learning classifiers with an upgraded n-gram approach to protect social networking platforms from XSS attacks.

If the features and training examples are insufficient, malicious pages may go unrecognized, which makes training this model difficult.
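As a rough illustration of the n-gram idea behind [53], the sketch below counts character trigrams in a page and scores it against trigrams drawn from a known XSS payload. The function names, scoring rule, and the suspicious-trigram set are invented for the example and are not taken from the paper.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Slide a window of size n over the string and count each n-gram."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def ngram_score(page, suspicious_ngrams):
    """Fraction of the page's n-grams that also appear in a suspicious set."""
    grams = char_ngrams(page.lower())
    if not grams:
        return 0.0
    hits = sum(count for gram, count in grams.items() if gram in suspicious_ngrams)
    return hits / sum(grams.values())

# Suspicious trigrams drawn from one known XSS payload (illustrative only)
suspicious = set(char_ngrams("<script>alert(1)</script>".lower()))

print(ngram_score("<p>hello world</p>", suspicious))                       # low
print(ngram_score("<script>alert(document.cookie)</script>", suspicious))  # higher
```

A real classifier would feed such scores (over many n-gram sets) into the learned model rather than thresholding a single ratio.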

The researchers [54] proposed a method for preventing cross-site scripting that combines an artificial neural network, a multilayer perceptron (MLP), with dynamic feature extraction. The approach outperforms other machine-learning algorithms in their comparison.

However, it has not been tested against XSS attacks on real-world web applications in production use.
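To make the MLP pipeline concrete, here is a minimal forward pass of a one-hidden-layer perceptron in pure Python. The weights, biases, and the three input features are invented for illustration; in [54] they would come from training on dynamically extracted features.

```python
import math

def relu(v):
    return [max(0.0, x) for x in v]

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def dense(x, W, b):
    """One fully connected layer: y = Wx + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def mlp_predict(features, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer perceptron; output is P(malicious)."""
    hidden = relu(dense(features, W1, b1))
    return sigmoid(dense(hidden, W2, b2)[0])

# Made-up weights; a trained model would learn these
W1 = [[1.2, -0.4, 0.8], [0.5, 0.9, -0.7]]
b1 = [0.1, -0.2]
W2 = [[1.5, 1.1]]
b2 = [-1.0]

# Hypothetical features: [script-tag count, scaled URL length, event-handler count]
print(mlp_predict([1.0, 0.3, 1.0], W1, b1, W2, b2))
```

The single sigmoid output turns the network into a binary malicious/benign classifier; a real deployment would add a training loop (e.g. backpropagation) over labeled payloads.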

The authors [55] proposed a technique for distinguishing legitimate web page content from injected data. This machine-learning-based approach is specific to banking websites, and the model is trained on data from the DOM tree.

This approach takes more time because features must be extracted from the web page before it is sent back to the server it originated from.
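The kind of DOM-tree features such a model might train on can be sketched with the standard-library HTML parser. The specific features below (tag counts, event-handler attributes, approximate nesting depth) are assumptions for illustration, not the paper's feature set.

```python
from html.parser import HTMLParser

class DOMFeatureExtractor(HTMLParser):
    """Walk the DOM and collect simple structural features (illustrative)."""
    def __init__(self):
        super().__init__()
        self.tag_counts = {}
        self.event_attrs = 0
        self.max_depth = 0
        self._depth = 0

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] = self.tag_counts.get(tag, 0) + 1
        # Attributes like onerror= / onload= are classic injection vectors
        self.event_attrs += sum(1 for name, _ in attrs if name.startswith("on"))
        # Approximate depth: void elements such as <img> are not special-cased
        self._depth += 1
        self.max_depth = max(self.max_depth, self._depth)

    def handle_endtag(self, tag):
        self._depth = max(0, self._depth - 1)

extractor = DOMFeatureExtractor()
extractor.feed("<div><img src=x onerror=alert(1)><p>hi</p></div>")
print(extractor.tag_counts, extractor.event_attrs, extractor.max_depth)
```

The resulting numbers would form one row of the training matrix; injected content tends to shift counts such as `event_attrs` relative to the clean page.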

The researchers [56] proposed a hybrid solution for preventing XSS in web applications. They claim their method is the first of its kind because it combines a metaheuristic, the Genetic Algorithm (GA), with a machine-learning framework; this combination distinguishes their methodology. Alongside the GA, they used a threat-intelligence model, reinforcement learning, and statistical inference to protect against XSS attacks.

However, this strategy has not undergone proof-of-concept testing on real-world, mission-critical web applications.
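A minimal sketch of how a Genetic Algorithm can sit beside a learning pipeline is shown below, here evolving a feature-selection bitmask under a made-up fitness function. The population size, operators, weights, and fitness are all illustrative assumptions, not the hybrid design of [56].

```python
import random

random.seed(0)

def fitness(mask, weights):
    """Hypothetical fitness: reward useful selected features, penalize subset size."""
    return sum(w for bit, w in zip(mask, weights) if bit) - 0.1 * sum(mask)

def genetic_algorithm(weights, pop_size=20, generations=50):
    n = len(weights)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, weights), reverse=True)
        survivors = pop[:pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child[random.randrange(n)] ^= 1      # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=lambda m: fitness(m, weights))

# Weights stand in for each feature's usefulness to an XSS classifier (made up)
weights = [0.9, 0.05, 0.8, 0.02, 0.7, 0.01]
best = genetic_algorithm(weights)
print(best, fitness(best, weights))
```

In a hybrid scheme the fitness would instead be the validation accuracy of the downstream classifier trained on the selected features.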

The authors [57] presented RLXSS, a reinforcement-learning-based method for detecting cross-site scripting attacks that uses both an adversarial model and a retraining model. The method employs XSS detection tools such as SafeDog and XSSChop alongside a dueling deep Q-network (DDQN), an escape technique, and a reward mechanism. Adversarial samples obtained from the adversarial model are fed into the retraining model so that it can be optimized against them.

However, this approach does not work against mXSS attacks, which typically employ filter-safe payloads that mutate into unsafe payloads after filtering.
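The escape idea can be illustrated with a toy example: transformations that keep a payload recognizable to a browser but slip past a naive substring filter. The filter and the three actions below are contrived for illustration and are far simpler than the DDQN-driven mutations in RLXSS.

```python
import random
import urllib.parse

random.seed(1)

# Hypothetical escape actions an RL agent might choose between to evade a filter
def mixed_case(p):
    """Randomly uppercase characters (HTML tag names are case-insensitive)."""
    return "".join(c.upper() if random.random() < 0.5 else c for c in p)

def url_encode(p):
    """Percent-encode the whole payload (useful in URL contexts)."""
    return urllib.parse.quote(p, safe="")

def insert_tab(p):
    """Break up the blacklisted substring with whitespace."""
    return p.replace("<script", "<script\t")

def naive_filter(payload):
    """Toy filter: flags only the exact lowercase substring."""
    return "<script>" in payload

payload = "<script>alert(1)</script>"
for action in (mixed_case, url_encode, insert_tab):
    mutated = action(payload)
    print(action.__name__, "evades filter:", not naive_filter(mutated))
```

In RLXSS the reward mechanism would score each mutation by whether the real detectors (e.g. SafeDog, XSSChop) still flag it, steering the agent toward successful escapes.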

The authors [58] proposed a deep-learning approach to cross-site scripting identification in which the original data is first decoded, and the word2vec algorithm is then used to learn the characteristics of XSS payloads. The resulting vectors are fed into an LSTM neural network model. Finally, tenfold cross-validation is used to compare the proposed method against the ADTree and AdaBoost methods.

This approach is ineffective against DOM-based XSS attacks.
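The decoding and tokenization front end of such a pipeline might look like the standard-library sketch below; the token pattern is an assumption, and the word2vec embedding and LSTM stages are omitted.

```python
import html
import re
import urllib.parse

def decode(payload):
    """Undo common encodings layered onto XSS payloads (URL, then HTML entities)."""
    return html.unescape(urllib.parse.unquote(payload))

def tokenize(payload):
    """Split a payload into word2vec-style tokens: tags, attributes, words, symbols."""
    return re.findall(r"</?\w+|\w+=|\w+|[<>\"'();=/]", payload)

raw = "%3Cscript%3Ealert(%22xss%22)%3C/script%3E"
decoded = decode(raw)
print(decoded)
print(tokenize(decoded))
```

Each token sequence would then be mapped to vectors by word2vec and padded to a fixed length before entering the LSTM.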

The authors [59] proposed a supervised machine-learning method for detecting potentially hazardous links before they execute on the victim's computer. Their solution uses a Linear Support Vector Machine classifier to detect blind XSS attacks and to differentiate the primary characteristics of reflected and stored XSS attacks. During feature extraction, JavaScript events, which attackers use to inject malicious payloads, were executed. A linearly separable dataset was used for testing, and Mutillidae, a freely available vulnerable website, was used to mimic a blind XSS attack.

However, this approach is limited in its handling of DOM-based and mXSS attacks.
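A linear SVM's decision rule reduces to the sign of w·x + b over the extracted features. The sketch below shows that shape with invented weights and a tiny feature set; the actual features in [59], including executed JavaScript events, are richer, and the weights would come from training.

```python
import re

# Invented weights; a trained Linear SVM would learn these from labeled payloads
WEIGHTS = {"script_tag": 2.1, "event_handler": 1.7, "js_scheme": 1.4, "length": 0.01}
BIAS = -1.5

def features(payload):
    """Tiny stand-in feature set (the paper's feature set is richer)."""
    p = payload.lower()
    return {
        "script_tag": p.count("<script"),
        "event_handler": len(re.findall(r"\bon\w+\s*=", p)),  # onerror=, onload=, ...
        "js_scheme": p.count("javascript:"),
        "length": len(p),
    }

def svm_decision(payload):
    """Linear SVM decision value: positive -> flag as malicious."""
    x = features(payload)
    return sum(WEIGHTS[k] * x[k] for k in WEIGHTS) + BIAS

print(svm_decision("<a href='/home'>home</a>"))   # negative: benign
print(svm_decision("<script>alert(1)</script>"))  # positive: malicious
```

Because the dataset used in [59] is linearly separable, a single hyperplane of this form can classify it without error once the weights are fitted.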

The authors [60] proposed a model for the detection of XSS that makes use of a metaheuristic approach known as a Genetic Algorithm.

This approach has not been tested on real-world, mission-critical web applications.