
Markup previews project: basic parser implemented

This week I worked on the parser implementation and made some progress.

The parser I implemented consists of three parts: a language definition, where all tokens are defined; token functions, each of which processes a particular token; and a scanner, which processes the text, detects tokens and applies these functions.

To define a new token, add a new entry to the tokens array in the language definition, e.g.:

tokens: [
    // token can be defined using "start" and "end" regexps
    { start: '\\[', end: '\\]',
      mode: 'test',
      token: dojox.markup.TestToken
    },
    // or, for simpler cases, it can be defined using "regex"
    { regex: '[-A-Za-z0-9_\\.\\, ]*',
      mode: 'text',
      token: dojox.markup.TextToken
    },
    ...
]

For now, the scanner can process nested tokens as well as more complex cases, such as token definitions with equal "start" and "end", or two tokens with equal "start" and different "end".
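To make the nesting behaviour more concrete, here is a minimal stack-based sketch in plain JavaScript. It is not the actual dojox.markup scanner: the `scan` and `matchAt` functions, the `bracket` and `bold` modes, and the tree shape are all assumptions made for illustration, and the text regex uses `+` instead of `*` so that it never matches an empty string.

```javascript
// Hypothetical definitions modelled on the entries above; the mode names are
// made up and no dojox.markup token constructors are used.
const tokens = [
  { start: '\\[', end: '\\]', mode: 'bracket' },
  { start: '\\*', end: '\\*', mode: 'bold' },      // equal "start" and "end"
  { regex: '[-A-Za-z0-9_., ]+', mode: 'text' },
];

// Return the text matched by `pattern` exactly at `pos`, or null.
function matchAt(source, pattern, pos) {
  const re = new RegExp(pattern, 'y');             // sticky: anchor at lastIndex
  re.lastIndex = pos;
  const m = re.exec(source);
  return m && m[0].length > 0 ? m[0] : null;
}

function scan(source, defs) {
  const root = { mode: 'root', children: [] };
  const stack = [{ node: root, def: null }];       // currently open tokens
  let pos = 0;
  while (pos < source.length) {
    const top = stack[stack.length - 1];
    // Try the "end" of the innermost open token first, so that a token
    // whose "start" equals its "end" is closed instead of reopened.
    if (top.def) {
      const end = matchAt(source, top.def.end, pos);
      if (end !== null) {
        stack.pop();
        pos += end.length;
        continue;
      }
    }
    // Otherwise try every definition in order; the first match wins.
    let matched = false;
    for (const def of defs) {
      const text = matchAt(source, def.regex || def.start, pos);
      if (text === null) continue;
      const node = def.regex
        ? { mode: def.mode, text }
        : { mode: def.mode, children: [] };
      top.node.children.push(node);
      if (!def.regex) stack.push({ node, def });   // start/end tokens stay open
      pos += text.length;
      matched = true;
      break;
    }
    if (!matched) throw new Error('No token matches at position ' + pos);
  }
  if (stack.length > 1) {
    throw new Error('Unclosed "' + stack[stack.length - 1].node.mode + '" token');
  }
  return root;
}
```

Calling `scan('a [b *c* d] e', tokens)` yields a root node whose children are a text node, a bracket node and another text node, with the bold node nested inside the bracket. In this sketch, input it cannot tokenize simply throws an error.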

For now it can't:

  • process incorrectly nested tokens (e.g. {some [text}] )
  • check the "allowed_children" property (which defines which tokens can appear inside a particular token)
  • recover from errors: if anything goes wrong, or it can't detect any token, an error is generated
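As a rough idea of what an "allowed_children" check could look like once it is added, here is a small sketch in plain JavaScript. It assumes that "allowed_children" is an array of mode names and that a token without the property accepts any child; the `findDisallowedChildren` function, the `defsByMode` lookup and the tree shape are hypothetical, not part of dojox.markup.

```javascript
// Hypothetical lookup from mode name to definition. "allowed_children" is
// assumed to list the modes permitted directly inside a token.
const defsByMode = {
  bracket: { allowed_children: ['text', 'bold'] },
  bold: { allowed_children: ['text'] },
};

// Walk a parsed tree of { mode, children } nodes and collect a message for
// every child whose mode is not listed by its parent.
function findDisallowedChildren(node, defs, found = []) {
  const def = defs[node.mode];
  const allowed = def && def.allowed_children;     // undefined: anything goes
  for (const child of node.children || []) {
    if (allowed && !allowed.includes(child.mode)) {
      found.push(child.mode + ' is not allowed inside ' + node.mode);
    }
    findDisallowedChildren(child, defs, found);
  }
  return found;
}
```

An empty result means that every parent accepts all of its children. Collecting messages rather than throwing on the first problem is only one possible design; the real check may well behave differently.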

You can play with it here: soc week#02 dojox.markup tests.

Your comments are welcome.