[langsec-discuss] TJSON: Tagged JSON with Rich Types

Sven M. Hallberg pesco at khjk.org
Wed Oct 26 10:19:54 UTC 2016


On Oct 25, 2016, at 5:15 PM, Tony Arcieri <bascule at gmail.com> wrote:
> https://www.tjson.org/

Neat!

You describe the generic form of "<tag>:..." in BNF, but you could also
express all of your higher-level requirements in the grammar itself. Are
there plans to produce a fully grammatical specification?

You make a point of the language being a subset of JSON which "can be
understood by existing JSON parsers". To recognize that subset properly
before handing the input to a generic JSON parser, you need a grammar
for the subset itself.


Jeffrey Goldberg <jeffrey at goldmark.org> on Wed, Oct 26 2016:
> If the UTF8 strings aren't normalized, you will get different hashes
> for visually and semantically identical strings.
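
To illustrate the point, here is a minimal sketch in Python. The choice
of SHA-256 and of NFC as the normal form are assumptions made for the
example, not something TJSON prescribes, and the helper names are
hypothetical.

```python
import hashlib
import unicodedata

# The same visible text, encoded two ways.
composed = "s:caf\u00e9"      # e-acute as the single code point U+00E9
decomposed = "s:cafe\u0301"   # plain e followed by combining acute U+0301


def digest(text: str) -> str:
    """Hash the UTF-8 bytes of text as-is (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def nfc_digest(text: str) -> str:
    """Normalize to NFC first, then hash (NFC is an assumed choice)."""
    return digest(unicodedata.normalize("NFC", text))


# Without normalization the two spellings hash differently;
# after normalization they agree.
hashes_differ_raw = digest(composed) != digest(decomposed)
hashes_match_nfc = nfc_digest(composed) == nfc_digest(decomposed)
```

Whether to normalize on input or to reject non-normalized strings
outright is a separate design decision; the sketch only shows why the
question matters for hashing.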

Along the same lines, beware of surrogate pairs written as \u escapes
inside a string. E.g.:

  "s:\uXXXX\uXXXX"

Here is the relevant piece of ABNF I once wrote for my JSON-like pet
project^1:

  esc-unicode = u (u-basic / u-surro)

  u-surro = u-surro-hi backslash u u-surro-lo
  u-basic = (r0C / rEF) hexdig hexdig hexdig       ; not D...
          / dD r07 hexdig hexdig                   ; D[0-7]..
  u-surro-hi = dD r8B hexdig hexdig                ; D[8-B]..
  u-surro-lo = dD rCF hexdig hexdig                ; D[C-F]..

  ; hex ranges
  r0C = %x30-39 / %x41-43 / %x61-63           ; 0-9 A B C
  rEF = %x45-46 / %x65-66                     ; E F
  r07 = %x30-37                               ; 0-7
  r8B = %x38-39 / %x41-42 / %x61-62           ; 8 9 A B
  rCF = %x43-46 / %x63-66                     ; C D E F

  u = %x75                                    ; u
  dD = %x44 / %x64                            ; D d
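
As a rough illustration, the rules above can be transcribed into a
regular expression. This is a sketch in Python, not the code from my
project. It assumes that hexdig accepts both upper- and lower-case
digits and that the backslash before "u" belongs to the enclosing
escape rule. It checks only the \u escapes in a string body and makes
no attempt to validate the rest of JSON string syntax. All names are
hypothetical.

```python
import re

# Assumption: hexdig matches hex digits in either case.
HEX = r"[0-9A-Fa-f]"

# u-basic: four hex digits denoting a code unit outside D800-DFFF.
U_BASIC = rf"(?:[0-9A-Ca-cEFef]{HEX}{{3}}|[Dd][0-7]{HEX}{{2}})"

# u-surro: high surrogate, then backslash and "u", then low surrogate.
U_SURRO = rf"[Dd][89ABab]{HEX}{{2}}\\u[Dd][C-Fc-f]{HEX}{{2}}"

# A string body (without the surrounding quotes) is a run of:
#   - any character other than a backslash,
#   - a \u escape matching esc-unicode, or
#   - a backslash followed by any character other than "u".
BODY = re.compile(rf"(?:[^\\]|\\u(?:{U_BASIC}|{U_SURRO})|\\[^u])*")


def escapes_well_formed(body: str) -> bool:
    """True if every \\u escape in body is a BMP scalar or a full pair."""
    return BODY.fullmatch(body) is not None
```

With this, escapes_well_formed(r"s:\ud83d\ude00") holds, while a lone
high surrogate such as r"s:\ud83d", or a pair in the wrong order, is
rejected.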

^1: http://khjk.org/log/2012/jun/datalang.html


-pesco

