[langsec-discuss] stackoverflow's HTML sanitizer bypassed

travis+ml-langsec at subspacefield.org
Mon Feb 23 03:27:22 UTC 2015

This post is part of a series
<http://danlec.com/blog/hacking-stackoverflow-com> describing the 33
security vulnerabilities I reported to stackoverflow.com
<http://stackoverflow.com/> from 2009-2013. This particular exploit was
reported and fixed in 2009.


I think the flaw here is that HTML is a Chomsky Type 2 grammar
(context-free grammar) <http://en.wikipedia.org/wiki/Context-free_grammar>
and RegEx is a Chomsky Type 3 grammar (regular grammar)
<http://en.wikipedia.org/wiki/Regular_grammar>. Since a Type 2 grammar
is strictly more expressive than a Type 3 grammar (see the Chomsky
hierarchy <http://en.wikipedia.org/wiki/Chomsky_hierarchy>), a regular
expression cannot recognize arbitrarily nested markup, so you can't
possibly make this work. But many will try, some will claim success,
and others will find the fault and totally mess you up.
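
To make the Type 2 vs. Type 3 point concrete, here is a minimal Python
sketch. It is not Stack Overflow's sanitizer and not the exploit from the
blog post; the pattern DEPTH2 and the function names are hypothetical,
and real HTML involves far more than nested <b> tags. The regex uses only
regular-language features (no backreferences) and is hand-unrolled to
handle two levels of nesting. Any such fixed pattern has some depth it
cannot handle, while a simple counter handles every depth.

```python
import re

# Hypothetical pattern: a <b> element whose content is plain text or
# <b> elements that themselves contain only plain text (depth <= 2).
DEPTH2 = re.compile(r"<b>(?:[^<>]|<b>[^<>]*</b>)*</b>")


def regex_accepts(html: str) -> bool:
    """Accept only if the whole string fits the fixed-depth pattern."""
    return DEPTH2.fullmatch(html) is not None


def counter_accepts(html: str) -> bool:
    """Accept if <b> and </b> are balanced at any nesting depth.

    The unbounded counter is exactly the memory a finite automaton lacks.
    """
    depth = 0
    for closing in re.findall(r"<(/?)b>", html):
        depth += -1 if closing else 1
        if depth < 0:  # a close tag appeared before its open tag
            return False
    return depth == 0


if __name__ == "__main__":
    for sample in ("<b>a<b>c</b></b>", "<b><b><b>x</b></b></b>"):
        print(sample, regex_accepts(sample), counter_accepts(sample))
```

Unrolling the pattern one more level only moves the failure to depth
four, which is the gap an attacker goes looking for.
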
Split a packed field and I am there; parse a line of text and you will find me.
