[langsec-discuss] Computational Power of Restricted Languages

Taylor Hornby havoc at defuse.ca
Tue Nov 11 21:31:46 UTC 2014


I'm doing a LangSec binge and noticing a frequent error in LangSec
material (papers, talks). I'm not sure if it's just an "abuse of
notation" or an actual error, so hopefully someone on this list can
explain what is actually meant. I'm new to LangSec, so please forgive me
if I'm missing something obvious.

Here's one example. Quoting the footnote on page 5 of "Security
Applications of Formal Language Theory"...

    "...HTML5+CSS3 (shown to be UNDECIDABLE by virtue of its ability to
    implement Rule 110)..."

The fact that HTML5+CSS3 can specify computation that is as powerful as
a Turing machine does not mean the language itself is undecidable or
even requires a Turing machine to decide.

For example, the Turing-complete language "brainfuck" is as powerful as
a Turing machine, but the language specifying a brainfuck program is

    Brainfuck -> OPS
    OP -> > | < | + | - | . | ,
    OPS -> OPS OP | OPS Loop
    Loop -> '[' OPS ']'

Similarly, any reasonable encoding of a Turing machine is context-free.
Most generally, the set of all strings is a regular language, and
clearly you can encode Turing machines with that language! The fact that
a language is, say, regular, does not guarantee that its semantics are
not Turing-complete.

So: What is meant by this? Is it abuse of notation for "Not only should
your syntax be regular or context-free, but your semantics should be as

Another claim I'm skeptical of (for the same reason) is that having
a correct parser will make it less likely for input to trigger bugs that
are "behind" the parser. That's not true, because if you have a bug that
is not related to parsing, that bug can be exploited without assuming
anything about the parser, so changing the parser won't affect the
exploitability of that bug. Given that, LangSec seems to boil down to
"Don't write code when you can have it generated for you."

Taylor Hornby
Disclaimer: I'm wrong a lot of the time

More information about the langsec-discuss mailing list