4.9 Lexer modes

Lexer modes change the set of patterns that the lexer will match at a given point in the input. A common use for lexer modes is to match string literals.

Example:

<<
string mystringvalue;
>>

tokenid str;

# String processing
/"/ <<
  mystringvalue = "";
  $mode(string);
>>
string: /[^"]+/ << mystringvalue ~= match; >>
string: /"/ <<
  $mode(default);
  return $token(str);
>>

A lexer mode is defined by placing the mode name and a colon (:) character before a token or pattern statement. That statement then applies only while the named mode is active.

By default, the active lexer mode is named default. A $mode() call within a lexer code block can be used to change lexer modes.

In the above example, when the lexer is in the default mode and sees a double-quote (") character, the lexer code block clears the mystringvalue variable and sets the lexer mode to string. When the lexer next looks for patterns to match against the input, it considers only patterns tagged for the string lexer mode. Each run of non-" characters is appended to the mystringvalue string. A " character ends the string lexer mode and returns to the default lexer mode; its code block also returns the str token, since the token is now complete.

Note that the token name str above could have been string instead, because the namespace for token names is distinct from the namespace for lexer modes.
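
The mode switching walked through in this section can also be pictured outside of the grammar file. The following Python sketch is only an illustration of the idea, not how the generated lexer is implemented: the RULES table, the action names, and the lex() function are all hypothetical, and the whitespace-skipping rule in the default mode is an added assumption so that more than one string can appear in the input. For brevity it takes the first pattern that matches rather than implementing any particular tie-breaking rule.

```python
import re

# Hypothetical rule table: mode name -> list of (pattern, action name).
# Only the rules listed under the active mode are ever tried.
RULES = {
    "default": [
        (re.compile(r'"'), "begin_string"),
        (re.compile(r"\s+"), "skip"),  # assumption: not part of the example grammar
    ],
    "string": [
        (re.compile(r'[^"]+'), "append"),
        (re.compile(r'"'), "end_string"),
    ],
}


def lex(text):
    """Return a list of (token_name, value) pairs found in text."""
    mode = "default"  # the lexer starts in the default mode
    value = ""        # plays the role of mystringvalue
    tokens = []
    pos = 0
    while pos < len(text):
        # Try only the patterns that belong to the active mode.
        for pattern, action in RULES[mode]:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise ValueError(f"no pattern matches at offset {pos} in mode {mode!r}")
        pos = m.end()
        if action == "begin_string":
            value = ""
            mode = "string"           # like $mode(string)
        elif action == "append":
            value += m.group()
        elif action == "end_string":
            mode = "default"          # like $mode(default)
            tokens.append(("str", value))
        # "skip" consumes input and produces no token
    return tokens
```

Calling lex('"hello" "a b"') in this sketch yields two str tokens. The space inside the second string is kept because the whitespace rule belongs to the default mode and is not tried while the string mode is active.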