[langsec-discuss] Basic results?

Paul Burchard paulburchard at gmail.com
Wed May 28 05:59:42 UTC 2014


Evan,
Thanks again for all the great references.  Here is my initial take on the
underlying theoretical foundations.

In terms of pure language restrictions, the basic one underlying these
approaches is the restriction to bounded recursion (primitive recursive
functions).  This seems like a useful compromise, which allows full
security analysis in time bounded by the recursion bounds at the cost of
reduced computational power.  This is the same type of restriction
mentioned by Peter from general finite state machines to acyclic ones,
which appears to still have substantial applications.

A different approach is to restrict the security questions that may be
asked in exchange for greater (but still not Turing complete) computational
power.  These approaches are characterized by clear security questions but
rather implicit models of computation.  The static analysis tools you
mention (for example to prevent recursion-based DoS) are pessimistic
predictors of certain types of bad behavior; they can be thought of as
defining a reduced computational model that eliminates some perfectly good
code (to continue the example, eliminating any code where tainted inputs
can affect recursion bounds, a sort of relativized bounded recursion).
Another approach of this type pointed out to me by Matt DeMoss called
"secure multi-execution" guarantees that a partially ordered set of
security levels are respected, at a cost of linear time in the number of
security levels (by executing the code once per security level); meanwhile,
only misbehaving code has its behavior altered while conforming code can be
fully Turing complete.

Finally, there is the approach of not restricting the computational power
but using heuristics for answering security questions, which may be
successful in many practical cases but could potentially be at least
exponentially expensive.  You can think of this as making the set of
security questions that can be feasibly answered rather implicit.  The
constraint solvers you mention fall into this type, for example
transforming the security question into a SAT instance (which could be
exponentially expensive).

Let me know if I have misunderstood anything.
 On Mon, May 26, 2014 at 08:55:59PM -0700, Paul Burchard wrote:
> What are known results mitigating the basic computer science problem that
> it's normally expensive or impossible to answer the fundamental language
> security question: what might some code do with the powers it's given?

It depends on what question you're ultimately trying to answer.  While
answering the question,

   "What might this code do *under all possible inputs* with the
    powers it is given?"

is usually intractable, if not undecidable, it is important to
remember that solving the "dual" of the problem is often much easier:
"Does there exist an input that causes this code to [perform a
specific behavior]?"  In other words, it is often much easier to find
*at least one* certificate (i.e., input) that exercises a certain
desired behavior (e.g., halting).  If there are only a limited number
of behaviors about which one is concerned, this approach can often be
tractable, if not even efficient.

I may be side-stepping your actual question here, but my point is that
it can often be fruitful to constrain the set of inputs as opposed to
constraining the language.

There are already many program analysis techniques [1,2] like concolic
execution that provide an "oracle" for answering questions like, "Does
there exist a set of inputs that causes [a specific pointer] to be
used after it is freed?"  Constraint solvers [3,4] allow us to query
for inputs that exercise almost arbitrarily complex behaviors, many of
which are tractable to solve.  There has also been some recent success
at using machine learning to classify the behavioral characteristics
(as opposed to logical flow) of black-box binaries [5].  Liveness and
deadlock can also be relatively efficiently determined using solely
static analysis and model-checking [6,7].
<shameless-self-promotion>My employer, Digital Operatives [8], has
developed a static analysis tool for detecting liveness and performing
arbitrary constraint satisfaction and optimization on
LLVM/IR.</shameless-self-promotion>

Now, back to your actual question.  It's clear that
non-Turing-complete languages can be useful.  For example,
Hofstadter's old BlooP [9] language can do useful computations like
computing factorials, however, it explicitly disallows unbounded
(infinite) loops, so the halting problem is decidable for it.  A
somewhat more recent---but still old---example is the CHARITY
programming language [10], which uses category theory [11] to
guarantee termination.  Finally, you probably want to read the Hume
Report [12] which describes a hierarchical system architecture that
provides layers of non-Turing-completeness for security.

    - Evan

  [1] http://llvm.org/pubs/2008-12-OSDI-KLEE.html
  [2] http://s2e.epfl.ch/
  [3] http://chicory.stanford.edu/PAPERS/STP-ganesh-07.pdf
  [4] http://z3.codeplex.com/
  [5] http://www.blackhat.com/us-13/briefings.html#Saxe
  [6] http://www.cs.utexas.edu/~shmat/shmat_csf09.pdf
  [7]
http://research.microsoft.com/en-us/um/people/lamport/tla/book-02-08-08.pdf
  [8] http://www.digitaloperatives.com/
  [9] http://en.wikipedia.org/wiki/BlooP_and_FlooP
 [10] http://pll.cpsc.ucalgary.ca/charity1/www/home.html
 [11] http://en.wikipedia.org/wiki/Category_theory
 [12] http://www-fp.cs.st-andrews.ac.uk/hume/report/hume-report.pdf

_______________________
Evan A. Sultanik, Ph.D.
Chief Scientist
Digital Operatives, LLC
Mobile: +1 215 919 7234
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.langsec.org/pipermail/langsec-discuss/attachments/20140527/6be283b1/attachment.html>


More information about the langsec-discuss mailing list