[langsec-discuss] Basic results?

Andrew munin at mimisbrunnr.net
Tue May 27 16:37:15 UTC 2014


> is usually intractable, if not undecidable

This is not really true as written. It can be intractable, if not
undecidable, to answer a question like "what might this code do under
all possible inputs", but this is the purpose of abstract
interpretation and approximate static analysis. Sometimes you can
prove something about a program under all inputs! Sometimes that
program is even interesting.

When someone says "it's impossible to answer a question of what might
some code do with the powers it's given" I want to say "no it's quite
easy - it might do anything". You just said you wanted an answer, you
didn't want a PRECISE answer. So how much precision do you want? This
is where the design of analyses gets interesting.

AI techniques have had successes in doing things like proving the
absence of buffer overflow[1] and null pointer[2] errors in software.
Static analysis techniques have also had some success in identifying
programs that leak information[3]. A particular static analysis
technique, type checking, has even been applied to prove that an
implementation of TLS corresponds to a specification of TLS, using
refinement types[4].

1:
http://www.amazon.com/Value-Range-Analysis-Programs-Overflow-Vulnerabilities/dp/1848000162
2: http://www.astree.ens.fr/
3: https://www.ssrg.nicta.com.au/projects/TS/infoflow.pml
4: http://www.mitls.org/wsgi/home

On 05/27/2014 12:05 PM, Evan Sultanik wrote:
> On Mon, May 26, 2014 at 08:55:59PM -0700, Paul Burchard wrote:
>> What are known results mitigating the basic computer science
>> problem that it's normally expensive or impossible to answer the
>> fundamental language security question: what might some code do
>> with the powers it's given?
> 
> It depends on what question you're ultimately trying to answer.
> While answering the question,
> 
> "What might this code do *under all possible inputs* with the 
> powers it is given?"
> 
> is usually intractable, if not undecidable, it is important to 
> remember that solving the "dual" of the problem is often much
> easier: "Does there exist an input that causes this code to
> [perform a specific behavior]?"  In other words, it is often much
> easier to find *at least one* certificate (i.e., input) that
> exercises a certain desired behavior (e.g., halting).  If there are
> only a limited number of behaviors about which one is concerned,
> this approach can often be tractable, if not even efficient.
> 
> I may be side-stepping your actual question here, but my point is
> that it can often be fruitful to constrain the set of inputs as
> opposed to constraining the language.
> 
> There are already many program analysis techniques [1,2] like
> concolic execution that provide an "oracle" for answering questions
> like, "Does there exist a set of inputs that causes [a specific
> pointer] to be used after it is freed?"  Constraint solvers [3,4]
> allow us to query for inputs that exercise almost arbitrarily
> complex behaviors, many of which are tractable to solve.  There has
> also been some recent success at using machine learning to classify
> the behavioral characteristics (as opposed to logical flow) of
> black-box binaries [5].  Liveness and deadlock can also be
> relatively efficiently determined using solely static analysis and
> model-checking [6,7]. <shameless-self-promotion>My employer,
> Digital Operatives [8], has developed a static analysis tool for
> detecting liveness and performing arbitrary constraint satisfaction
> and optimization on LLVM/IR.</shameless-self-promotion>
> 
> Now, back to your actual question.  It's clear that 
> non-Turing-complete languages can be useful.  For example, 
> Hofstadter's old BlooP [9] language can do useful computations
> like computing factorials, however, it explicitly disallows
> unbounded (infinite) loops, so the halting problem is decidable for
> it.  A somewhat more recent---but still old---example is the
> CHARITY programming language [10], which uses category theory [11]
> to guarantee termination.  Finally, you probably want to read the
> Hume Report [12] which describes a hierarchical system architecture
> that provides layers of non-Turing-completeness for security.
> 
> - Evan
> 
> [1] http://llvm.org/pubs/2008-12-OSDI-KLEE.html [2]
> http://s2e.epfl.ch/ [3]
> http://chicory.stanford.edu/PAPERS/STP-ganesh-07.pdf [4]
> http://z3.codeplex.com/ [5]
> http://www.blackhat.com/us-13/briefings.html#Saxe [6]
> http://www.cs.utexas.edu/~shmat/shmat_csf09.pdf [7]
> http://research.microsoft.com/en-us/um/people/lamport/tla/book-02-08-08.pdf
>
> 
[8] http://www.digitaloperatives.com/
> [9] http://en.wikipedia.org/wiki/BlooP_and_FlooP [10]
> http://pll.cpsc.ucalgary.ca/charity1/www/home.html [11]
> http://en.wikipedia.org/wiki/Category_theory [12]
> http://www-fp.cs.st-andrews.ac.uk/hume/report/hume-report.pdf
> 
> _______________________ Evan A. Sultanik, Ph.D. Chief Scientist 
> Digital Operatives, LLC Mobile: +1 215 919 7234
> 
> 
> 
> _______________________________________________ langsec-discuss
> mailing list langsec-discuss at mail.langsec.org 
> https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss
> 


More information about the langsec-discuss mailing list