The JSON specification is now wrong

ES5 has decided, for whatever reason, to treat \u2028 and \u2029 (the line and paragraph separators) as new lines in JavaScript, which brings them in line with the regex “\s” character class. The JSON specification (to my knowledge) wasn’t changed, so although it mentions escaping characters within strings, it doesn’t require these two to be escaped. This means we’re left with raw \u2028 and \u2029 characters that can break entire JSON feeds: when the feed is run as JavaScript, the string contains a new line and the JavaScript parser bails out.
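To make the mismatch concrete, here is a small sketch showing that a JSON parser accepts the two separators unescaped inside a string, and that the “\s” class matches them. The variable names are mine and purely illustrative; the separators are built with String.fromCharCode so the snippet itself contains no raw U+2028/U+2029.

```javascript
// Build the separators from their code points (illustrative names).
var LS = String.fromCharCode(0x2028); // line separator
var PS = String.fromCharCode(0x2029); // paragraph separator

// A JSON document whose string value contains both characters unescaped.
var rawJson = '{"abc":"abc' + LS + 'aa' + PS + '"}';

// JSON.parse accepts it: JSON only requires escaping the quote, the
// backslash and control characters below U+0020 inside strings.
var parsed = JSON.parse(rawJson);

// The regex \s class matches both separators.
var matchedByWhitespaceClass = /\s/.test(LS) && /\s/.test(PS);
```

So the same text is fine as JSON, and it is only when it is treated as JavaScript source that the separators become a problem.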

Another interesting fact is that Crockford’s regex in the JSON specification is also wrong: it was correct at the time, but it is wrong now =)


var text = '{"abc":"abc\u2029aa"}';
var my_JSON_object = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
    text.replace(/"(\\.|[^"\\])*"/g, ''))) &&
    eval('(' + text + ')');

This will reach the eval, since the test doesn’t account for line/paragraph separators, and the eval will then raise a syntax error because a new line is encountered inside the string.
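One way to patch the check, as a rough sketch rather than anything taken from the specification, is to turn raw separators into their escape sequences before running the test and the eval. The function name parseJsonWithEval is hypothetical.

```javascript
// Hypothetical helper: the eval-based check above, with the separators
// escaped first so the evaluated source never contains a raw new line.
function parseJsonWithEval(text) {
  // Replace raw U+2028/U+2029 with the six-character sequences
  // \u2028 and \u2029 (0x2028 and 0x2029 print as four hex digits).
  var escaped = text.replace(/[\u2028\u2029]/g, function (ch) {
    return '\\u' + ch.charCodeAt(0).toString(16);
  });

  // Same test as the original: strip the strings, then reject any
  // character that cannot appear in JSON outside a string.
  var looksLikeJson = !(/[^,:{}\[\]0-9.\-+Eaeflnr-u \n\r\t]/.test(
    escaped.replace(/"(\\.|[^"\\])*"/g, '')));

  // Like the original, this returns false when the test fails.
  return looksLikeJson && eval('(' + escaped + ')');
}
```

A separator outside a string becomes a backslash sequence that the test rejects, which is what you want since JSON does not allow it as whitespace anyway. Inside a string it evaluates back to the original character, so the parsed value is unchanged. Without that escaping step, the raw separator goes straight into the eval.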

This is also true of most native JSON parsers in various browsers, for example the following:
eval("("+JSON.stringify({a:'a\u2029a'})+")")

Will bail out because the paragraph separator isn’t escaped.
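If the output of JSON.stringify is going to be embedded in or served as JavaScript, a thin wrapper can escape the two characters afterwards. This is a minimal sketch; stringifyForScript is a made-up name, not a built-in.

```javascript
// Hypothetical wrapper around JSON.stringify that escapes the
// line and paragraph separators in the serialized output.
function stringifyForScript(value) {
  var json = JSON.stringify(value);
  // JSON.stringify returns undefined for values such as functions.
  if (typeof json !== 'string') {
    return json;
  }
  return json
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

var output = stringifyForScript({a: 'a\u2029a'});
var roundTripped = JSON.parse(output);
```

The result is still valid JSON and parses back to the same value, because \u2028 and \u2029 are ordinary JSON string escapes.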

12 Responses to “The JSON specification is now wrong”

  1. BrianMB writes:

    Did you really expect STANDARDS GROUPS to pay attention to annoying little things like “details”?

  2. Chris Weber writes:

    Do you know if this is implemented anywhere? I suppose this is the unescaped form of \u2028 it mentions? So as long as it’s escaped e.g. 
 it shouldn’t cause a break.

  3. Chris Weber writes:

    Oh stupid thing took my escaping literally :) re:
    e.g. 


  4. radi writes:

    People shouldn’t be relying much on eval() for parsing JSON, so I’d suppose the more important question is how will the native JSON parser handle this.

  5. Gareth Heyes writes:

    @radi

    Eval was to demonstrate the issue. If you output a raw JSON feed containing an unescaped U+2028 or U+2029 character, then the JSON script will fail.

    @chris

    Yeah if the para/line sep are escaped with \u2028 or \u2029 then it should be ok

  6. Daniel writes:

    Will it be possible to perform XSS using this, for example if a website uses AJAX/JSON to display data on the site?

  7. Gareth Heyes writes:

    @Daniel

    Nope that’s why I posted it, it’s a minor issue. You can only raise syntax errors

  8. Gareth Heyes writes:

    @chris

    Stupid wordpress is missing the character, entities shouldn’t work within a script block unless the page is served as XHTML

  9. WebReflection writes:

    uhm??… that’s weird

    however ….

    var s = JSON.parse('"\u2028\u2029"');
    alert(JSON.stringify(s).replace(/\u2028|\u2029/g, "what you gonna put here?"));

  10. Gareth Heyes writes:

    @WebReflection

    You do not put anything there; the intention is to bail out the page serving the JSON. A server-side JSON parser would likely leave 0x2028/0x2029 as they are, and when the page is served a syntax error would occur since the string will be broken by a new line.

  11. WebReflection writes:

    as written in JSONH github page:

    output.replace(
    /\u2028|\u2029/g,
    function (m) {
    return "\\u202" + (m === "\u2028" ? "8" : "9");
    })

    should drop the problem, let me know if it doesn’t, thanks

  12. Gareth Heyes writes:

    @WebReflection

    Looks good thanks