[langsec-discuss] TJSON: Tagged JSON with Rich Types

Sven M. Hallberg pesco at khjk.org
Fri Nov 4 12:38:35 UTC 2016


Tony Arcieri <bascule at gmail.com> on Thu, Nov 03 2016:
> On Thu, Nov 3, 2016 at 2:46 AM, Sven M. Hallberg <pesco at khjk.org> wrote:
>
>> >     {"dialpad:A<A<i>>": [["1","2","3"], ["4","5","6"], ["7","8","9"]]}
>>
>> Now this looks definitely context-sensitive. One nested structure on the
>> right of the ':' depending on another to the left. You can no longer get
>> away with a grammar but you'll have all the fun of a type system.
>
> The grammar is certainly still context-free:

Indeed, the grammar for a conveniently chosen context-free superset is
still context-free. The actual language is not.


> But, as you noted, this does add a sort of type system to the language,
> such that it's now possible to express documents which don't
> typecheck.

But more: You'll find you can't parse without "typechecking" anymore.
Surely you agree that parsing an integer is not the same as parsing
base64. Maybe there is an elegant grammatical formalism to describe your
language, but that doesn't change my point: It's a significant
complication over before.
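
To make that concrete, here is a minimal sketch in Python. The tag names
"i" and "b64" and the helper parse_tagged are assumptions chosen for
illustration, not taken from any spec text in this thread. The same
characters yield a different value, or no value at all, depending on the
tag to their left:

```python
import base64


def parse_tagged(tag: str, text: str):
    """Hypothetical helper: parse `text` under the rules selected by `tag`."""
    if tag == "i":
        # Decimal integer syntax.
        return int(text)
    if tag == "b64":
        # Strict base64: characters outside the alphabet are rejected.
        return base64.b64decode(text, validate=True)
    raise ValueError(f"unknown tag: {tag!r}")


as_int = parse_tagged("i", "1234")      # an integer
as_bytes = parse_tagged("b64", "1234")  # three raw bytes

# "-5" is a well-formed integer but not well-formed base64.
minus_five = parse_tagged("i", "-5")
try:
    parse_tagged("b64", "-5")
    b64_accepts_minus_five = True
except ValueError:
    b64_accepts_minus_five = False
```

Which strings are acceptable on the right cannot be decided without first
interpreting the tag on the left.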

For one thing, it's now harder to write a parser that reuses an existing
JSON framework. Before, you could do this:

1) Full recognition by an automaton autogenerated from a CFG.
   (Also yay decidable equivalence.)
2) Interpretation by existing JSON parser.
3) Simple visitor pattern on result to convert tagged strings to their
   native representations.
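
Steps 2 and 3 could be sketched roughly as follows in Python. This is an
illustration under stated assumptions, not the proposal's actual
implementation: it assumes values are strings of the form "tag:body",
the tag table (s, i, b64) is made up for the example, object keys are
left untouched, and step 1 (the generated recognizer) is omitted.

```python
import base64
import json

# Assumed tag table for illustration; not the definitive tag set.
CONVERTERS = {
    "s": str,
    "i": int,
    "b64": lambda body: base64.b64decode(body, validate=True),
}


def convert(node):
    """Step 3: walk the generic JSON tree and convert tagged strings."""
    if isinstance(node, dict):
        return {key: convert(value) for key, value in node.items()}
    if isinstance(node, list):
        return [convert(item) for item in node]
    if isinstance(node, str):
        tag, sep, body = node.partition(":")
        if not sep or tag not in CONVERTERS:
            raise ValueError(f"untagged string or unknown tag: {node!r}")
        return CONVERTERS[tag](body)
    return node  # numbers, booleans and null pass through unchanged


def load_tagged(document: str):
    # Step 1 (recognition by an automaton generated from a CFG) is omitted.
    tree = json.loads(document)  # step 2: the existing JSON parser
    return convert(tree)         # step 3: visitor over the result


result = load_tagged(
    '{"name": "s:dialpad", "row": ["i:1", "i:2", "i:3"], "blob": "b64:AQID"}'
)
```

The point of the sketch is that each string carries its own tag, so the
visitor never needs to look anywhere but at the node in hand.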


> Objects as self-describing product types, so no further type information is
> necessary.

...for parsing. I think your new proposal is less elegant than the
original because it confuses syntax and semantics. Above, you treat as
semantics things that should be syntax (the valid forms of values,
depending on tag), and here you leave out things that could be useful at
a semantic level (expected object structure) because they are
unnecessary for syntax.

To be clear, I wouldn't advocate putting an entire schema into the tag.
I would advocate not putting half of one in it either.


Of course, I just offer my thoughts in the hope that they provide some
insight.

-pesco

