The JSON specification is now wrong
Monday, 25 July 2011
ES5 has decided, for whatever reason, to treat \u2028 and \u2029 (line/paragraph separators) as new lines in JavaScript, which brings it in line with the regex “\s” character class. The JSON specification (to my knowledge) wasn’t changed to match. Although it mentions escaping characters within strings, escaping these two isn’t a requirement. This means raw \u2028 and \u2029 characters can break an entire JSON feed, since the string will contain a new line and the JavaScript parser will bail out.
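A minimal sketch of the behaviour described above, not taken from the post. Whether a raw separator is accepted inside a string literal depends on the engine, so this uses a single-line comment instead (an assumption made for illustration): the separator ends the comment, and the code after it runs.

```javascript
// U+2028 ends the single-line comment, so the 2 after it is evaluated.
var afterLineSep = eval("// comment\u20282");
// U+2029 behaves the same way.
var afterParaSep = eval("// comment\u20292");
// Both characters are also matched by the \s character class.
var matchedByWhitespace = /\s/.test("\u2028") && /\s/.test("\u2029");
```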
Another interesting fact is that Crockford’s regex in the JSON specification is also wrong: correct at the time, but wrong now =)
var text = '{"abc":"abc\u2029aa"}';
// Validity test from the JSON specification, followed by eval
var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
    text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
    eval('(' + text + ')');
The text passes the test, since the test doesn’t account for line/paragraph separators, so it reaches eval, which raises a syntax error because a new line is encountered inside the string.
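One way to see the gap, as a sketch: the helpers below wrap the test from the snippet above and add an explicit check for the two separators. The names passesSpecTest and hasRawSeparator are invented here and are not part of the specification.

```javascript
// The validity test from the snippet above, wrapped in a function.
function passesSpecTest(text) {
    return !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
        text.replace(/"(\\.|[^"\\])*"/g, '')));
}

// Hypothetical extra check: true if a raw U+2028 or U+2029 is present.
function hasRawSeparator(text) {
    return /[\u2028\u2029]/.test(text);
}

var text = '{"abc":"abc\u2029aa"}';
// The separator sits inside a string, which the test strips out first.
var passes = passesSpecTest(text);
// Only treat the text as safe for eval if no raw separator is present.
var safeToEval = passes && !hasRawSeparator(text);
```

With the escaped form of the same text (a literal backslash followed by u2029), hasRawSeparator returns false and the text is accepted.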
This is also true of most native JSON parsers in various browsers, for example the following:
eval("("+JSON.stringify({a:'a\u2029a'})+")")
Will bail out because the paragraph separator isn’t escaped.
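To make the mismatch concrete, here is a small sketch (the variable names are made up for illustration). It only inspects the output of JSON.stringify and does not eval it, since whether the eval throws depends on the engine.

```javascript
var json = JSON.stringify({a: 'a\u2029a'});
// The separator is emitted raw, not as the six-character escape \u2029.
var hasRaw = json.indexOf('\u2029') !== -1;
var hasEscape = json.indexOf('\\u2029') !== -1;
// JSON itself accepts the raw character inside a string.
var roundTrip = JSON.parse(json).a === 'a\u2029a';
```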
No. 1 — July 25th, 2011 at 10:59 pm
Did you really expect STANDARDS GROUPS to pay attention to annoying little things like “details.”
No. 2 — July 26th, 2011 at 6:11 am
Do you know if this is implemented anywhere? I suppose this is the unescaped form of \u2028 it mentions? So as long as it’s escaped e.g. it shouldn’t cause a break.
No. 3 — July 26th, 2011 at 6:12 am
Oh stupid thing took my escaping literally 🙂 re:
e.g.
No. 4 — July 26th, 2011 at 7:15 am
People shouldn’t be relying much on eval() for parsing JSON, so I’d suppose the more important question is how will the native JSON parser handle this.
No. 5 — July 26th, 2011 at 7:46 am
@radi
Eval was to demonstrate the issue. If you output a JSON feed containing a raw \u2028 or \u2029 character, the JSON script will fail.
@chris
Yeah if the para/line sep are escaped with \u2028 or \u2029 then it should be ok
No. 6 — July 26th, 2011 at 8:51 am
Will it be possible to perform XSS using this, for example if a website uses AJAX/JSON to display data on the site?
No. 7 — July 26th, 2011 at 9:13 am
@Daniel
Nope that’s why I posted it, it’s a minor issue. You can only raise syntax errors
No. 8 — July 26th, 2011 at 9:43 am
@chris
Stupid wordpress is missing the character, entities shouldn’t work within a script block unless the page is served as XHTML
No. 9 — August 17th, 2011 at 6:39 pm
uhm??… that’s weird
however ….
var s = JSON.parse('"\u2028\u2029"');
alert(JSON.stringify(s).replace(/\u2028|\u2029/g, "what you gonna put here?"));
No. 10 — August 18th, 2011 at 9:35 am
@WebReflection
You do not put anything there; the intention is to bail out the page serving the JSON. A server-side JSON parser would likely leave 0x2028/0x2029 in place, and when the page is served a syntax error would occur, since the string will be broken by a new line.
No. 11 — August 18th, 2011 at 2:44 pm
as written in JSONH github page:
output.replace(
    /\u2028|\u2029/g,
    function (m) {
        return "\\u202" + (m === "\u2028" ? "8" : "9");
    })
should drop the problem, let me know if it doesn’t, thanks
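Wrapped up as a function, the replacement above could look like the sketch below. stringifyForScript is a hypothetical name, not part of JSONH, and the round trip through JSON.parse is just one way to check that the data is unchanged.

```javascript
// Hypothetical wrapper: stringify, then escape the two separators.
function stringifyForScript(value) {
    return JSON.stringify(value).replace(
        /\u2028|\u2029/g,
        function (m) {
            return "\\u202" + (m === "\u2028" ? "8" : "9");
        });
}

var output = stringifyForScript({a: 'a\u2028b\u2029c'});
// No raw separators remain, and parsing gives back the same string.
var clean = !/[\u2028\u2029]/.test(output);
var same = JSON.parse(output).a === 'a\u2028b\u2029c';
```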
No. 12 — August 18th, 2011 at 2:49 pm
@WebReflection
Looks good thanks