[langsec-discuss] stackoverflow's HTML sanitizer bypassed

travis+ml-langsec at subspacefield.org
Mon Feb 23 03:27:22 UTC 2015


Quote:
This post is part of a series
<http://danlec.com/blog/hacking-stackoverflow-com> describing the 33
security vulnerabilities I reported
to stackoverflow.com <http://stackoverflow.com/> from 2009-2013. This
particular exploit was reported and fixed in 2009.
http://danlec.com/blog/hacking-stackoverflow-com-s-html-sanitizer

Funny:
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

Accurate:
I think the flaw here is that HTML is a Chomsky Type 2 grammar
(context free grammar) <http://en.wikipedia.org/wiki/Context-free_grammar>
and RegEx is a Chomsky Type 3 grammar (regular grammar)
<http://en.wikipedia.org/wiki/Regular_grammar>. Since a Type 2 grammar
is fundamentally more complex than a Type 3 grammar (see the Chomsky
hierarchy <http://en.wikipedia.org/wiki/Chomsky_hierarchy>), you can't
possibly make this work. But many will try, some will claim success and
others will find the fault and totally mess you up.
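
To make the quoted argument concrete, here is a minimal sketch in Python.
It assumes a toy language of plain text and nested <b>...</b> pairs, which
is far simpler than real HTML and is not the stackoverflow.com sanitizer.
The names TEXT, DEPTH1, DEPTH2, TOP, regex_balanced, counter_balanced and
nested are all made up for illustration. A pattern built from plain regular
operators has to spell out every nesting level by hand, so it stops working
one level past wherever the author stopped; a recognizer that keeps a depth
counter (a stack, in the general case) does not have that limit.

```python
import re

# Toy language (an assumption for illustration): text plus nested <b>...</b>.
TEXT = r"[^<]*"
# One hand-written layer per nesting level; this pattern stops at depth 2.
DEPTH1 = rf"<b>{TEXT}</b>"
DEPTH2 = rf"<b>{TEXT}(?:{DEPTH1}{TEXT})*</b>"
TOP = re.compile(rf"{TEXT}(?:{DEPTH2}{TEXT})*")


def regex_balanced(s: str) -> bool:
    """Accept only input whose <b> nesting is at most two levels deep."""
    return TOP.fullmatch(s) is not None


def counter_balanced(s: str) -> bool:
    """Track nesting depth with a counter, so any depth is handled."""
    depth = 0
    for match in re.finditer(r"</?b>", s):
        if match.group() == "<b>":
            depth += 1
        else:
            depth -= 1
            if depth < 0:  # a close tag with no matching open tag
                return False
    return depth == 0


def nested(n: int) -> str:
    """Build a well-formed input nested n levels deep."""
    return "<b>" * n + "x" + "</b>" * n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, regex_balanced(nested(n)), counter_balanced(nested(n)))
```

Adding a DEPTH3 layer only moves the failure to depth 4. A filter that
misjudges where an element ends can be made to disagree with the browser
about what the markup means, and that kind of disagreement is what an
attacker goes looking for.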
-- 
http://www.subspacefield.org/~travis/
Split a packed field and I am there; parse a line of text and you will find me.





