[langsec-discuss] proven-correct parsers

Matthew Wilson diakopter at gmail.com
Thu Jun 2 03:09:47 UTC 2016

argh, I meant "the only reason _not_ to prefer [CFLs over RLs]"

On Wed, Jun 1, 2016 at 8:07 PM, Matthew Wilson <diakopter at gmail.com> wrote:
>> e.g., do we really need a CFL in a given context or would a RL suffice?
> Well, as I understand it, the only reason to prefer CFLs is that
> their (efficient!) parsers are (traditionally!) harder to verify than
> those for regular languages...
> Here's a very hand-wavey speculation: Since a Turing-complete bytecode
> can be encoded into a data stream that is itself described (at a lower
> level) by a regular language, what really matters is how the resulting
> data/AST is interpreted after being "parsed" -- at its highest level.
> Therefore, all programs/functions should be verified as formalized
> parser-interpreters of formal languages, in order to solve the problem
> holistically.  Taken to the extreme, we would then view all machines
> (and their sub-machines) as interfaces whose inputs (both human and
> programmatic) are formally type-checked and implementations are
> formally verified, top to bottom, all the way back to system-wide
> security and resource-consumption properties/constraints.  This would,
> of course, require a total re-think on how to interact with computers,
> since our every input (including shell command-lines and GUI command
> sequences) would need to be type-checked to effectively the same
> extent as every other bit of program code in the system/network. In
> other words, every input language is an embedded DSL of dependent type
> theory, with a verified implementation... and the process of learning
> computer programming transitions to learning how to express all your
> problems in such formalizations.
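The "Turing-complete bytecode over a regular carrier" point can be made concrete. Below is a minimal sketch (the token encoding and interpreter are invented for this illustration, not drawn from any project linked here): every syntactically valid program is matched by a single regex, so the instruction stream is a regular language, yet the interpreter is a two-counter Minsky-style machine, which is Turing-complete. The computational power lives entirely in how the stream is interpreted, not in the syntax class of the carrier.

```python
import re

# Hypothetical instruction tokens: "Ir" increment counter r; "Dr" decrement
# counter r (floored at 0); "Jn" jump to instruction n; "Zrn" jump to
# instruction n if counter r is zero; "H" halt.  r in {0,1}, n in 0..9.
TOKEN = r'I[01]|D[01]|J[0-9]|Z[01][0-9]|H'
# Any concatenation of tokens is a valid program: a regular language.
PROGRAM_RE = re.compile(f'(?:{TOKEN})*$')

def run(code, c0=0, c1=0, max_steps=10_000):
    """Interpret a program drawn from the regular instruction language."""
    if not PROGRAM_RE.match(code):
        raise ValueError("not in the regular instruction language")
    toks = re.findall(TOKEN, code)   # fixed-width, unambiguous tokens
    counters, pc = [c0, c1], 0
    for _ in range(max_steps):
        op = toks[pc]
        if op == 'H':
            return counters
        if op[0] == 'I':
            counters[int(op[1])] += 1; pc += 1
        elif op[0] == 'D':
            counters[int(op[1])] = max(0, counters[int(op[1])] - 1); pc += 1
        elif op[0] == 'J':
            pc = int(op[1])          # unconditional jump
        else:                        # 'Zrn': jump on zero
            pc = int(op[2]) if counters[int(op[1])] == 0 else pc + 1
    raise RuntimeError("step budget exhausted (halting is undecidable)")

# Drain counter 0 into counter 1:
#   0: Z04   1: D0   2: I1   3: J0   4: H
print(run("Z04D0I1J0H", c0=3))  # [0, 3]
```

A regex can certify that the stream is well-formed bytecode, but no amount of syntactic checking bounds what the interpreter then does with it -- which is exactly why verification has to reach the interpreter, not stop at the parser.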
> The artifacts (that I've run across in my secondary/tertiary research)
> that approach this goal are those formalized in dependent
> intuitionistic type theory, including this one:
> http://keldysh.ru/papers/2013/prep2013_73_eng.pdf and
> https://github.com/ilya-klyuchnikov/ttlite   ... and for those whose
> curiosity that paper tickles, here's another project based on the same
> foundational formalization (but building from the opposite direction
> of the implementation stack):
> https://gitlab.com/joaopizani/piware-paper-2015/raw/types2015-postproceedings-submitted/main.pdf
> and https://gitlab.com/joaopizani/piware
> Apologies if my suggestion is too hand-wavey for anyone.  For a far
> more informative and useful explanation of this theory of progress
> toward holistic formalization, see the upcoming book by Edwin Brady:
> https://www.manning.com/books/type-driven-development-with-idris
> Note: The current implementation of Idris is not itself formalized
> (and it doesn't emit proofs) so it doesn't yet fit into the same
> strain of development as the other projects, but it's a superb
> learning environment for principles of formalization in type theory.
> On Wed, Jun 1, 2016 at 6:30 PM, Guilherme Salazar <gmesalazar at gmail.com> wrote:
>> Hey,
>> On Wed, Jun 1, 2016 at 9:57 PM, Matthew Wilson <diakopter at gmail.com> wrote:
>>> I hadn't seen this paper/repo referenced on this list or in any of the
>>> links from langsec.org, so I thought it might be helpful to point it out.
>>> I realize the answer to my question is generally agreed upon by the
>>> list participants...
>>> On second thought, it occurs to me that perhaps you didn't see the
>>> link I included (since you didn't quote it) to JasonGross's
>>> parsing-parses GitHub repo?
>> I did see it; thanks for sharing : )
>> The reason I asked was that, the way I see it, LangSec's core idea is
>> not only to build correct parsers, but also to reason about whether the
>> language the parser parses is what we really need to deal with -- e.g.,
>> do we really need a CFL in a given context or would a RL suffice?
>> Cheers,
>> G. Salazar
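The CFL-vs-RL question above can be illustrated with a toy example (both mini-formats here are invented for the illustration): a flat key/value header block is a regular language, so a one-pass regex recognizer with constant memory suffices; balanced s-expression nesting is the classic language that is context-free but not regular, so its recognizer needs unbounded state -- here a depth counter, the degenerate single-bracket case of a pushdown stack.

```python
import re

# Regular: zero or more "Name: value\n" lines.  A DFA (or regex engine)
# recognizes this in one pass with constant memory.
HEADER_RE = re.compile(r'(?:[A-Za-z-]+: [^\n]*\n)*$')

def is_valid_header_block(text):
    return HEADER_RE.match(text) is not None

def is_balanced_sexpr(text):
    # Context-free, not regular: matched parentheses to arbitrary depth
    # cannot be expressed by any finite-state recognizer.
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:        # closing paren with no open
                return False
    return depth == 0            # every open paren was closed
```

If the input format only ever needs the first shape, choosing it over the second buys a simpler recognizer that is far easier to verify -- which is the LangSec argument for picking the weakest language class that does the job.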
> --
> Sent by an Internet

Sent by an Internet

More information about the langsec-discuss mailing list