[langsec-discuss] LangSec Workshop at IEEE SPW 2014, Sun May 18, 2014

travis+ml-langsec at subspacefield.org
Tue Nov 26 05:57:00 UTC 2013


On Fri, Nov 22, 2013 at 06:54:35PM +0100, Peter Bex wrote:
> Partly out of the same frustration you have with the security industry's
> focus on "today's hottest exploits",

It's understandable.  On an emotional level:
The threats are sexy David-and-Goliath stories.  Anyone can empathize
with the position of the anonymous Internet user.  Fewer people can
empathize with the people defending the systems.  There's no surprise
involved; nobody makes a movie about people who design systems that
work as expected.

That may not make you feel any better.  But on a comprehension level:
To understand the cleverness of an exploit you just need to understand
how the software works.  To understand the general solution, you need
to understand everything the exploit requires, the general case, the
root cause, the solution, and how that interacts with how people write
code.  Any topic with lots of subtle complexity and sub-cases tends to
be rather hard to master.  However, you'll find this POV makes up at
least half of the material at defense-oriented conferences like OWASP
AppSec USA and USENIX Security.  Example:

http://videos.2012.appsecusa.org/

> I wrote an article a while ago
> which discusses the ways in which injection attacks can be prevented
> properly, and how to *pervasively* prevent such attacks in software:
> http://www.more-magic.net/posts/structurally-fixing-injection-bugs.html

Nice!  To reinforce this approach, here are some similar thoughts (minus
a working solution) that I posted (without having seen your article):

From: travis+ml-baha at subspacefield.org
To: baha at lists.bitrot.info
Subject: a fix for XSS
Message-ID: <20121019035950.GP19213 at subspacefield.org>

When htmlspecialchars is not enough:

http://php.vrana.cz/context-aware-html-escaping.php

Here is an example where the parser (the web browser) has multiple
contexts, and the escaping that works in one (e.g. the HTML body)
won't work in another (e.g. JavaScript).
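
To make the context problem concrete, here is a minimal sketch in
Python, with html.escape standing in for htmlspecialchars.  The
payloads and the doit() handler name are made up for illustration, and
html.unescape is only a stand-in for the decoding a browser performs on
attribute values before it runs an event handler.

```python
import html

# HTML body context: escaping the markup characters is enough.
payload = "<script>alert(1)</script>"
body = "<p>" + html.escape(payload) + "</p>"

# URL attribute context: the payload has no HTML special characters,
# so html.escape returns it unchanged and the link stays dangerous.
url_payload = "javascript:alert(1)"
link = '<a href="' + html.escape(url_payload) + '">click</a>'

# Event-handler attribute context: the quote is escaped to &#x27;, but
# attribute values are entity-decoded before the script runs.
js_payload = "'); alert(1); //"
handler = "doit('" + html.escape(js_payload) + "')"
seen_by_js = html.unescape(handler)  # stand-in for the browser's decoding

print(body)
print(link)
print(seen_by_js)
```

The same escaping function is safe in the first case and useless in the
other two, which is why the escaper has to know which context it is
writing into.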

Behold the many contexts of HTML escaping:

https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
http://stackoverflow.com/questions/1911022/what-are-all-the-html-escaping-contexts

Now one obvious fix is, "use templates" and some sort of context-aware
escaping.  A template renderer should be able to track the parser state
of most browsers accurately enough to do a better job than most
coders.  Here's an example of Perl templating:

http://sam.tregar.com/html_template.html

This has the disadvantage of creating a templating language, and we
don't need more languages.  Django, I'm looking at you.

However, a different method might be to generate the HTML from code,
so that the context is known as each piece of HTML is generated, such
as this:

http://perldoc.perl.org/CGI.html
    use CGI;
    my $q = CGI->new;                      # create the CGI object
    print $q->header,                      # create the HTTP header
          $q->start_html('hello world'),   # start the HTML
          $q->h1('hello world'),           # level 1 header
          $q->end_html;                    # end the HTML

Which has helper functions like this:
    Code                           Generated HTML
    ----                           --------------
    h1()                           <h1>
    h1('some','contents');         <h1>some contents</h1>
    h1({-align=>left});            <h1 align="LEFT">
    h1({-align=>left},'contents'); <h1 align="LEFT">contents</h1>

The brilliant part of this is that you're simply nesting the values,
so to make bold, italicized text, you'd do:

b(i("some text"))

But unfortunately, I think that this simply converts 'some text' to
something like "<i>some text</i>" which is passed to the bold
function.  So the bold function can't escape the special characters
without breaking nested tags.
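
A small Python sketch of that dilemma, using hypothetical
string-returning helpers (these are not CGI.pm's actual functions, just
an illustration of the behavior guessed at above):

```python
import html

def i(text):
    # Returns a plain string, so the markup and the data are now mixed.
    return "<i>" + text + "</i>"

def b_unescaped(text):
    # Trusts its argument: nesting works, but so does injected markup.
    return "<b>" + text + "</b>"

def b_escaped(text):
    # Escapes its argument: data is safe, but nested tags are destroyed.
    return "<b>" + html.escape(text) + "</b>"

print(b_unescaped(i("<pre>")))      # <b><i><pre></i></b>
print(b_escaped(i("some text")))    # <b>&lt;i&gt;some text&lt;/i&gt;</b>
```

Once i() has flattened its result into a string, the outer function can
no longer tell which angle brackets came from the helper and which came
from the data.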

But, if the italicize function returned an object of some kind that
was rendered into a string only at the very last minute, then one
could maintain a tree of objects, and one could automatically escape
any strings.  So b(i(x)) will return a bold object containing an
italic object containing something else, but if x = "<pre>" then the
end result will be rendered with HTML escaping; no PRE tag will
actually be generated/emitted.
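
Here is one way the deferred rendering could look, as a minimal Python
sketch.  The Element class and the b()/i() helpers are names invented
for this example, and attributes are left out to keep it short.

```python
import html

class Element:
    """A tag plus its children; nothing is a string until render()."""

    def __init__(self, tag, *children):
        self.tag = tag
        self.children = children

    def render(self):
        parts = []
        for child in self.children:
            if isinstance(child, Element):
                parts.append(child.render())           # nested tag: recurse
            else:
                parts.append(html.escape(str(child)))  # data: always escape
        return "<{0}>{1}</{0}>".format(self.tag, "".join(parts))

def b(*children):
    return Element("b", *children)

def i(*children):
    return Element("i", *children)

x = "<pre>"
tree = b(i(x))          # a bold object containing an italic object
print(tree.render())    # <b><i>&lt;pre&gt;</i></b>
```

Because the tree keeps tags and data apart until the last minute, the
renderer can escape every string without touching the nested tags.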

Perhaps one could use your language's type system to enforce this;
that is, instead of accepting strings to send to the client, the
output routine only takes an "HTML Object Tree" which you build up.
Any attempt to include a string into the tree will automatically
escape it; this should prevent most, if not all, XSS attacks.
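
A sketch of that enforcement in Python, under the assumption that a
hypothetical respond() function is the only way to emit output.  The
Text, Element, el and respond names are all invented here.  Python only
checks this at run time; a statically typed language, or a type checker
run over these annotations, could reject the bad call before the code
ever runs.

```python
import html
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Text:
    value: str

    def render(self) -> str:
        return html.escape(self.value)   # data is escaped, always

@dataclass(frozen=True)
class Element:
    tag: str
    children: Tuple["Node", ...] = ()

    def render(self) -> str:
        inner = "".join(child.render() for child in self.children)
        return f"<{self.tag}>{inner}</{self.tag}>"

Node = Union[Text, Element]

def el(tag: str, *children: Union[Node, str]) -> Element:
    # Bare strings are wrapped as Text, so they can never become markup.
    wrapped = tuple(Text(c) if isinstance(c, str) else c for c in children)
    return Element(tag, wrapped)

def respond(tree: Node) -> str:
    # The output boundary refuses anything that is not a tree.
    if not isinstance(tree, (Text, Element)):
        raise TypeError("respond() takes an HTML tree, not "
                        + type(tree).__name__)
    return tree.render()

print(respond(el("b", el("i", "<pre>"))))   # <b><i>&lt;pre&gt;</i></b>
```

Passing a hand-built string such as "<b>raw</b>" to respond() raises
TypeError instead of reaching the client.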

This also fixes several HTML problems, such as the (stupid but rare)
overlapping tags, unmatched tags, etc.

More generally, this approach can enforce generation of
syntactically correct HTML.
-- 
http://www.subspacefield.org/~travis/
Remediating... LIKE A BOSS