Opera parser monster eats unicode

Whilst writing my own parser I found weird things in Opera’s JavaScript parser. I was testing what the various browsers allow with unicode escapes, and it turns out Opera is more lax than the others. My discovery began with the following code:

try {eval("\\u0066\\u0061\\u006c\\u0073\\u0065");} catch(e) {alert(e);}

What do you expect the undefined variable to be? It’s a unicode-encoded “false”, hehe, so on Firefox we can have a variable called “false” if we use unicode escapes. But what about Opera? Well, it’s actually looking for a variable called “false5”. Why? Because Opera’s JavaScript parser seems to be off by one when using eval with unicode escapes: after decoding the final \u0065 it steps the parser back a character and re-reads the last hex digit, so a “5” is added onto the end of the identifier.
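For reference, here is what those five escapes decode to, and what a spec-conformant engine does with the eval today (a quick sketch runnable in Node; the hand-rolled decoder is mine, not Opera’s code, and modern engines reject the escaped keyword outright rather than treating it as an identifier like the Firefox of the time):

```javascript
// The five escapes from the post decode to the keyword "false".
// A quick hand-rolled decoder (not Opera's code) to show the mapping:
const src = "\\u0066\\u0061\\u006c\\u0073\\u0065";
const decoded = src.replace(/\\u([0-9a-fA-F]{4})/g,
  (_, hex) => String.fromCharCode(parseInt(hex, 16)));
console.log(decoded); // "false"

// Today's engines enforce the spec rule that unicode escapes may not
// spell a reserved word, so the eval from the post is a SyntaxError:
try {
  eval("\\u0066\\u0061\\u006c\\u0073\\u0065");
} catch (e) {
  console.log(e.name); // "SyntaxError"
}
```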

Pretty cool, so what else can we do? Well, Opera seems a bit more lax than the other browsers when it comes to unicode escapes, for example this is perfectly legal:


Pretty nuts, right? You can use an incorrect unicode escape and the backslash simply gets ignored. Another example:


And finally I leave you with this: you can make \u become uu inside an eval statement:

window.__defineGetter__("uu",function() { alert(1) });eval("\\u");
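For comparison, a spec-conformant engine treats a bare \u as an invalid escape and throws before any identifier lookup could ever fire a getter, so the trick above is Opera-specific (a quick Node check; window.__defineGetter__ doesn’t exist there, so only the eval half is shown):

```javascript
// In a conforming engine a lone "\u" is an invalid unicode escape,
// so the eval throws instead of ever looking up an identifier "uu":
try {
  eval("\\u");
} catch (e) {
  console.log(e.name); // "SyntaxError"
}
```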

3 Responses to “Opera parser monster eats unicode”

  1. Chris Weber writes:

    Gaz I don’t follow – If Opera thinks \\u0066\\u0061\\u006c\\u0073\\u0065 is really ‘false5’ – where does the ‘5’ come from? I see what you’re saying, but that would mean that Opera sees the last code point as \\u0065 + 5. I mean, it found the ‘e’ right, which is the \\u0065, but it sounds like you’re saying it found a \\u006 followed by an extra 5.

  2. Chris Weber writes:

    BTW I added a nice little to my comment and got a nice PHP error from spambot or something – try it out 🙂

  3. Gareth Heyes writes:

    Hey Chris, yeah, so Opera does parse the \u0065, but their decoder puts the parser back a character, so the “5” comes from the previous unicode escape (I spend way too much time studying js 🙂 )