Notation

The syntax of FO and its extensions is specified using the EBNF notation. In this notation, [ a ] zero or one occurrence of a, and { a } represents zero, one, or more occurrences of a.

The lexicon lists the tokens of the language. They are specified using Python Regular Expressions. For example, an identifier satisfies [^\d\W]\w*\b. The regular expressions can be tested online, e.g., using pythex.org.

The tokens satisfying a regular expression are denoted by names in upper case, e.g., ID. Non-terminal symbols are specified by production rules and are denoted in CamelCase, e.g., TypeInterpretation. In the body of production rules, terminal symbols are denoted:

  • by referring to tokens in the lexicon, e.g., ID;

  • using string literals, e.g., 'vocabulary';

  • using string interpolation, e.g., 'is_{CONSTRUCTOR}': the terminal symbol is the concatenation of is_ and a CONSTRUCTOR

Some production rules in FO[Core] are modified or extended in other extensions. To allow more modularity, non-terminal strings can be defined by several production rules:

s ← 'A';
s ← 'B' | 'C';

is equivalent to:

s ← 'A' | 'B' | 'C';

The EBNF syntax is not sufficiently precise to define what is a well-formed FO[·] program. So, we complement it with an informal description of “well-formedness” conditions.