[langsec-discuss] composability

travis+ml-langsec at subspacefield.org
Thu Jan 21 03:38:30 UTC 2016


On Fri, Jan 15, 2016 at 10:10:09AM -0500, darren.highfill at us.pwc.com wrote:

> The point identified in this thread from which I haven't been able
  to get away is the question of what effect layering, modularity, and
  composition (called LMC here for convenience) has on the propagation
  of unintended design attributes. Does LMC have a net positive or
  negative effect on security?

I think a more important question is whether there are security
boundaries between those layers.  How you define a security boundary,
as I mentioned in the "Turducken" thread, isn't exactly clear, nor is
the guidance on where to place them.  But I know an MMU-enforced
address space when I see it.

I am having trouble imagining a case where adding security boundaries
decreases security, except the one Guthrey implied, where a
layer/module/system relies on another layer/module/system for a
security-relevant decision, and has a (worst-case
adversary-controlled) choice among them, in which case you get
"weakest link" security = min-security.

> I started to get the mental image of LMC resembling a graphical
  portrayal of integral calculus. As the block size approaches the
  small and large ends of the extreme, the chart (or program) starts
  to look monolithic again.

Do you have an example of how a highly decomposed, modular program
looks monolithic?

I thought Farrow's examples of highly secure programs - presumably
postfix and qmail - were examples of extreme modularity, typically
executed with least privilege and with process (security) boundaries
between the pieces.

It's also worth noting that both were written predominantly in
reaction to a highly monolithic system, sendmail, with a
Turing-complete config file language based on ASCII representations of
modem line noise (or swearing).

But there's a notable other data point - exim - which also postdates
sendmail, is highly modular, but has a reputation (perhaps
deservedly?) for a lack of security.

What I suspect is that the decomposition into smaller pieces is most
beneficial to security when they reside in different security domains
- that is, when interactions between them are limited, and must cross
a security boundary - bonus for least-priv.
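To make that pattern concrete, here is a minimal sketch of the
postfix/qmail-style privilege separation, in Python for brevity,
assuming a Unix host with a "nobody" account and root to start (the
names and framing are mine, not Farrow's):

    import os, pwd

    def drop_privileges(username="nobody"):
        # Irreversibly shed root before touching untrusted input.
        pw = pwd.getpwnam(username)
        os.setgroups([])
        os.setgid(pw.pw_gid)
        os.setuid(pw.pw_uid)

    def parse_untrusted(data):
        # All parsing happens on the unprivileged side of the boundary.
        return data.decode("ascii", "replace")

    if __name__ == "__main__":
        r, w = os.pipe()            # the only channel across the boundary
        if os.fork() == 0:          # child: the unprivileged parser
            os.close(w)
            drop_privileges()       # only succeeds if we started as root
            print(parse_untrusted(os.read(r, 4096)))
            os._exit(0)
        os.close(r)
        os.write(w, b"MAIL FROM:<attacker>")  # privileged side hands data over
        os.close(w)
        os.wait()

A compromise of parse_untrusted() then yields only the "nobody"
account, not the privileged parent.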

> If, in fact, LMC has some effect - positive or negative - this would
  imply a desired optimization of its use. Specifically, we would seek
  an optimal size of LMC - theoretically up to a point of diminishing
  returns where other factors start to overtake the equation.

Curious; what would those factors be?  Possibly the lack of
clearly-defined contracts between layers?  Guthrey alluded to
abstraction - we have abstracted the implementation, but not the
semantics.

I, for one, see the method name "add" and know everything I need to
know.  To add a comment/javadoc that it "adds two (classname)
together" is completely unnecessary.  But not every programmer is as
gifted as I am. ;-)
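(More seriously, the name pins down the interface but not the
semantics.  A contrived sketch - hypothetical counter classes, not
anything from Guthrey's post - where two methods named "add" share a
signature but disagree about behavior:

    class WrapCounter:
        # add() wraps modulo 2**32, C-unsigned style.
        def __init__(self, v=0):
            self.v = v
        def add(self, n):
            self.v = (self.v + n) % 2**32
            return self.v

    class SatCounter:
        # Same name, same signature, but add() saturates at 2**32 - 1.
        def __init__(self, v=0):
            self.v = v
        def add(self, n):
            self.v = min(self.v + n, 2**32 - 1)
            return self.v

    for c in (WrapCounter(2**32 - 1), SatCounter(2**32 - 1)):
        print(type(c).__name__, c.add(1))   # prints 0, then 4294967295

No type signature or method name distinguishes the two; only the
semantic contract does.)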

Or perhaps the factor is when software growth (and variety) on the
more privileged side of the security boundary negates the security on
the less privileged side of that differential, as when firmware,
UEFI/BIOS, and VMM device emulation defeat OS-level protections:

http://www.intelsecurity.com/advanced-threat-research/content/AttackingHypervisorsViaFirmware_bhusa15_dc23.pdf

> Are we saying that this optimal size corresponds to how much code
  one person or another can hold in their head? Do we have any
  research that points to a correlation between bug density and module
  size? What does the module size to bug density curve look like?

I heard somewhere once that the chance of defects grows with the
number of (potential) interactions, which is O(n^2) in the number of
elements - n components have n(n-1)/2 possible pairwise interactions -
while it stands to reason that (implementation) bugs are O(n) in the
number of lines of code.  It is also possible that a flaw is not
related to the size of the source code at all, like timing side
channel attacks.  And in many cases, like SSL certificate chain
validation, the simplest and most logical code is not correct:

https://www.blackhat.com/presentations/bh-dc-09/Marlinspike/BlackHat-DC-09-Marlinspike-Defeating-SSL.pdf
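As an illustration, here is a toy sketch - a fake Cert class with a
stand-in signature check, not real X.509 code - of how the "obvious"
chain walk goes wrong by omitting the basicConstraints check:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Cert:                  # toy certificate, illustration only
        subject: str
        issuer: str
        ca: bool                 # the basicConstraints CA flag

        def signed_by(self, parent):
            # Stand-in for a real cryptographic signature check.
            return self.issuer == parent.subject

    def naive_validate(chain, roots):
        # "Simple and logical": each cert is signed by the next one up,
        # and the chain ends at a trust root.
        links_ok = all(c.signed_by(p) for c, p in zip(chain, chain[1:]))
        return links_ok and chain[-1] in roots
        # Bug: never tests p.ca, so a mere leaf certificate can act as
        # an intermediate CA for any name.

    root = Cert("Root CA", "Root CA", ca=True)
    leaf = Cert("attacker.example", "Root CA", ca=False)  # legitimately bought
    forged = Cert("bank.example", "attacker.example", ca=False)
    print(naive_validate([forged, leaf, root], {root}))   # True -- accepted!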

A possible data set along these lines, which ties back into the
aforementioned security boundary comment, is that the monolithic Linux
kernel has, for 802.11 support:
      30 KLOC of driver code
      80 KLOC of generic support
      1 MB of firmware to be loaded onto the NIC
http://www.fixup.fi/misc/usenix-login-2015/login_oct15_02_kantee.pdf

In 2011 the kernel as a whole had 15 MLOC:
http://arstechnica.com/business/2012/04/linux-kernel-in-2011-15-million-total-lines-of-code-and-microsoft-is-a-top-contributor/

And the tarball size is increasing exponentially:
http://ngc891.blogdns.net/?p=92

So, as this ring-0 software program experiences success and feature
creep, the number of interactions is growing, and yet the reported
security vulnerabilities show an interesting pattern, neither
exponential, linear, nor strictly increasing:

https://www.cvedetails.com/product/47/Linux-Linux-Kernel.html?vendor_id=33

Explanations?

1) This is a sampling based on discovered vulns and date of
   publication, not the date the software was written, so it may
   tell us more about efforts to audit the code than about
   writing it.
2) Kernel programming is different from userland programming, and
   the discipline might tend towards linear rather than exponential
   defect growth.
3) But who really knows; a C implementation flaw is usually pretty
   deadly no matter what.
4) Kernel devs tend to be abnormally good C programmers.

My point is that too much growth in the kernel invalidates security in
the down-security-gradient userland, and that when that pain gets bad
enough there will probably be an incentive to rewrite the layer in a
more modular fashion, with security boundaries such that your file
system driver cannot defeat SELinux (and Tanenbaum will be smiling).
The things stopping it heretofore have been inconvenience, the cost of
rewriting legacy code, and of course, the absolute killer, the
performance of crossing the security boundary (e.g. the TLB switch).
But not only have those costs been designed around, as in L4:
https://en.wikipedia.org/wiki/L4_microkernel_family

One member of that family, seL4, has even been formally verified:
https://sel4.systems/Info/FAQ/#sel4

Final comment:

There are security bugs which cross security layers - like the
sendmail email pipe bug:
http://www.iss.net/security_center/reference/vulntemp/Email_Pipe.htm

This bug floated through secure systems (and the kernel) to hit a
vulnerable sendmail target.  That has more to do with piggybacking on
existing data flows (and their associated filtering/transformations)
than with the security layers themselves.  To Kaminsky's point, it
never redefined the attack surface.  To a certain extent, all
Internet-delivered
attacks have already floated through many layers and immune systems,
but they were already "open" to communication.  Redefining the attack
surface - that is, breaking the (sometimes implicit) security
guarantee of a security boundary - usually involves an exploit, which
costs money. Adding boundaries increases attack cost.  OpenSSL is
actually one of the few programs which has an explicit security
policy:

https://www.openssl.org/docs/fips/SecurityPolicy-2.0.pdf
-- 
http://www.subspacefield.org/~travis/ | if spammer then john at subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977

