Notation¶
The syntax of FO and its extensions is specified
using the EBNF notation.
In this notation, [ a ]
zero or one occurrence of a
,
and { a }
represents zero, one, or more occurrences of a
.
The lexicon lists the tokens of the language.
They are specified
using Python Regular Expressions.
For example, an identifier satisfies [^\d\W]\w*\b
.
The regular expressions can be tested online, e.g., using pythex.org.
The tokens satisfying a regular expression are denoted by names in upper case, e.g., ID
.
Non-terminal symbols are specified by production rules and are denoted in CamelCase, e.g., TypeInterpretation
.
In the body of production rules, terminal symbols are denoted:
by referring to tokens in the lexicon, e.g.,
ID
;using string literals, e.g.,
'vocabulary'
;using string interpolation, e.g.,
'is_{CONSTRUCTOR}'
: the terminal symbol is the concatenation ofis_
and aCONSTRUCTOR
Some production rules in FO[Core] are modified or extended in other extensions. To allow more modularity, non-terminal strings can be defined by several production rules:
s ← 'A';
s ← 'B' | 'C';
is equivalent to:
s ← 'A' | 'B' | 'C';
The EBNF syntax is not sufficiently precise to define what is a well-formed FO[·] program. So, we complement it with an informal description of “well-formedness” conditions.