Regex HTML Sanitisation can work

Dear Pádraic Brady,

I have not received any emails with exploits, and I am disappointed: I want my HTML regex sanitiser to be broken, please. Apparently you can find 2-5 vulnerabilities per solution, so please execute XSS in my regex. Thanks! I’ll be very impressed if you do, and I promise to dedicate a blog post to you.

HTML Regex sandbox

Please don’t stop there, though :) I also have a JavaScript sandbox built on regular expressions that you can try to bypass.
JavaScript Regex sandbox

Thanks very much

Kind Regards
Gareth

10 Responses to “Regex HTML Sanitisation can work”

  1. Paul Stone writes:

    ‘expression’ seems to be allowed inside CSS styles. Is that intended?

  2. Rob Zienert writes:

    The gauntlet: it has been thrown.

  3. Gareth Heyes writes:

    @PaulStone

    Hi, please provide a vector :)

  4. Paul Stone writes:

    xxx

    works in IE8 in quirks/IE7 mode.

  5. Paul Stone writes:

    <html><div style="x:expression(this.x?0:alert(this.x='xss'))">xxx</div>

  6. Paul Stone writes:

    Well, you get the idea (a preview for comments would be nice :). Plain expression(alert('xss')) would also work, but the version above doesn’t DoS the browser with endless alerts.

  7. Gareth Heyes writes:

    @Paul

    Hey Paul, I tried the vector you describe in IE8 and the opening tag is removed, so the output is xxx</div>. I’ll check IE7 in my VM.

  8. Paul Stone writes:

    Ah, it appears to be a bug in your code when running HTMLReg/CSSReg on Firefox 4 RC. The HTML gets sanitised when running on 3.6 or another browser.

  9. Gareth Heyes writes:

    @Paul

    Tried on FF 4 RC and the output is: xxx</div>. Please could you provide a screenshot and a pastebin of the code? Thanks!

  10. Padraic Brady writes:

    Gareth,

    I picked up your gauntlet, looked under it, and found something naughty. Could you supply an email address since I’m not big on making things public? Mine is ****

    Also, your reply failed to clarify that HTMLReg is a JavaScript library that parses input using the browser DOM (which is not a process driven by JS regular expressions). Comparing oranges to apples doesn’t produce a good argument. Granted, while sipping on my Guinness last night I could have put PHP in a larger bold font throughout my article ;).