XSS Zones

One of the impossible problems of the web is how you protect a site that has a persistent XSS hole yet requires JavaScript to function. I thought about this for a while and worked out that you could create an XSS zone wherever you expect user input. Declaring a zone is tricky because if you have a start and end marker, the attacker can reproduce that marker in their own markup and break out of the zone.

Let’s take a look at some code as a reference:-

<!-- Begin XSS zone -->
Hello I am a twitter..I mean webgoat
<!-- End XSS zone -->

So we define an XSS zone where we allow untrusted input. The idea of the zone is that JavaScript, events and CSS are locked down or disabled, stopping a worm or evil input from spreading or causing information disclosure. But now the attacker has a new string to inject, <!-- End XSS zone --><img src=1 onerror=alert(1)>, and they have broken out of the zone.
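
To make the breakout concrete, the rendered markup would end up looking something like this, with the payload sitting after a forged end marker and therefore outside the zone:-

<!-- Begin XSS zone -->
Hello <!-- End XSS zone --><img src=1 onerror=alert(1)>
<!-- End XSS zone -->

Everything after the injected end marker, the image payload included, now executes with no restrictions.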

The solution to this problem is to randomize the zone name:-

<!-- Begin XSS zone 9cb3c2fd7ef861d762471c90de049603806e315eea3daf13e0b8faadd6b9e85db09afeab9430d8f58d1f7e8551745ec3f3961be932b79247f56c12db5c7e2e8d -->
Evil content
<!-- End XSS zone 9cb3c2fd7ef861d762471c90de049603806e315eea3daf13e0b8faadd6b9e85db09afeab9430d8f58d1f7e8551745ec3f3961be932b79247f56c12db5c7e2e8d -->

Before the HTML is rendered, the browser looks for the XSS zone name; when it finds the first zone marker, it continues parsing the HTML until the matching end marker is found. Any zone markers inside are ignored. The random zone name is generated on every request, and the markers are removed from the markup before rendering.
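
On the server side this is easy to emit: generate a fresh random key per request and wrap the untrusted output with it. Here is a minimal sketch in Node.js; wrapXssZone and its default configuration string are illustrative names of mine, not a proposed API:-

const crypto = require('crypto');

// A fresh, unguessable key per request; 64 random bytes gives a
// 128-character hex key like the example above.
function makeZoneKey() {
  return crypto.randomBytes(64).toString('hex');
}

// Wrap untrusted content between matching zone markers. The key must
// never be predictable or reused, otherwise the attacker can forge an
// end marker and break out of the zone.
function wrapXssZone(untrustedHtml, config) {
  config = config || 'javascript=no css=no urls=sameorigin';
  const key = makeZoneKey();
  return '<!-- Begin XSS zone ' + key + ' ' + config + ' -->\n' +
         untrustedHtml +
         '\n<!-- End XSS zone ' + key + ' -->';
}

console.log(wrapXssZone('Hello I am a twitter..I mean webgoat'));

The only secret is the key itself, so the scheme stands or falls on the key being fresh and unguessable on every request.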

XSS zones would require an inbuilt JavaScript/HTML/CSS sandbox which would only allow harmless markup. Any input that is accepted from the user would have to be declared with an XSS zone, and the feature would have an ON/OFF switch somewhere, maybe an HTTP header. The zones themselves could have simple configurable attributes, e.g. javascript=false css=false html=true urls=sameorigin.
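
For example, the switch could be a response header along these lines (the header name here is purely illustrative, nothing like it exists):-

X-XSS-Zones: on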

Defining a zone could be done manually, by generating a random name and outputting the HTML comment, or the server-side language could detect where variables are used and output the zone names automatically. Using HTML comments is interesting because they act as both an HTML and a JavaScript comment, enabling a nice fallback.
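
To see the fallback in action: browsers treat <!-- as a single-line comment inside JavaScript (a legacy quirk), so a zone marker that ends up in a script context is harmless there too. A sketch, with the key shortened for readability:-

<script>
<!-- Begin XSS zone 9cb3c2fd7ef861d762471c90de049603 -- also a single-line JavaScript comment
alert('the rest of the script still runs');
// the end marker comments out the same way -->
</script>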

One final way to enable XSS zones would be to use the browser itself: similar to how Firebug and the IE developer toolbar allow you to select DIVs and other elements, an advanced user could select an area of a site that they decide requires an XSS zone. The browser would then monitor that section of the site and automatically add a random XSS zone to the markup.

Configuration of zones

Zones should always follow the format of <!-- Begin XSS zone RANDOMKEY CONFIGURATION_DATA --> and end with <!-- End XSS zone RANDOMKEY -->

The configuration should be simple and precise. The following commands should be supported:
javascript=no|yes|domains*
css=no|yes|domains*
urls=sameorigin|domains|proxied*

* The domain list should be a whitelist only; no global wildcards allowed.
* Proxied should route all URLs through a proxy service that fetches image data or follows links without sending cookie information, with each request pre-checked by a malware scanner.
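
Putting the format together, a fully configured zone would look something like this (key shortened for readability):-

<!-- Begin XSS zone 9cb3c2fd7ef861d762471c90de049603 javascript=no css=no urls=sameorigin -->
Hello I am a twitter..I mean webgoat
<!-- End XSS zone 9cb3c2fd7ef861d762471c90de049603 -->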

That was the basis of my idea; if you like it, implement it.

6 Responses to “XSS Zones”

  1. Eli Grey writes:

    Random keys don’t solve the problem. A reasonable implementation of the standard would be live (in that it actively waits for “XSS zones” to be established in the DOM, either by the markup parser or JavaScript) and would use the DOM rather than just searching for comment node strings. This fails because comment nodes are still nodes, so you can just read their nodeValue property to get the random key using JavaScript.

  2. Gareth Heyes writes:

    @Eli

    The random keys are not for protecting access inside the DOM but are used as a means of extracting the XSS zone contents. Once the contents are extracted, the comments are removed. In other words, the HTML is parsed before it is rendered in the browser.

  3. Eitan Adler writes:

    This has actually been discussed before. The method you describe won’t function well due to some quirks of HTML processing.

    The solution the WHATWG came up with is the concept of the “srcdoc” and “sandbox” attributes.

    Some details: http://dev.w3.org/html5/spec-author-view/the-iframe-element.html
    Some discussion on the topic:
    http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-July/015349.html

    Overview of the idea: http://www.htmlfive.net/whats-next-in-html-episode-2-whos-been-peeing-in-my-sandbox/

  4. kuza55 writes:

    I saw some dude from Mozilla talking about this ages ago; it was even on sla.ckers.org.

    One of the things that stuck in my mind was that there isn’t one safe subset of html that will fit all cases, e.g. form tags, yes or no? iframes, yes or no?

    A simpler solution IMO would be a system like the one Wisec made way back when that separates html and data, and uses JS to fill the data in using the DOM API; you can make whatever fine grained policies you like that way without needing to modify all the browsers in the world.

  5. AnonAcademic writes:

    This kind of approach has been proposed in the academic literature. For related schemes, take a look at Noncespaces (http://www.cs.ucdavis.edu/~hchen/paper/ndss09.pdf) and Document Structure Integrity (http://webblaze.cs.berkeley.edu/papers/2009/nadji-saxena-song.pdf).

    One challenge is that it requires modifications to both browsers and web sites.

  6. Gareth Heyes writes:

    @kuza55

    Yeah, separation of content/data is the best way. I was trying to figure out a usable solution for existing sites to prevent persistent XSS.

    Sirdarckcat wrote ACS, which allows you to whitelist URLs and scripts client side using the plaintext element (which is quite clever). I remember Wisec’s one from ages ago too.

    Overall we need something in the middle that is easy to implement: it should provide sites with a means to protect elements on a page, and let users define their own areas for sites that haven’t configured any.

    @AnonAcademic

    Random namespaces are a pretty cool idea, although I think they’re far from practical: the namespace would have to be large enough to prevent a brute-force injection, which adds overhead to every HTML element/attribute and increases the page size dramatically.