What does "fragment" mean in ANTLR?

According to The Definitive ANTLR 4 Reference:

Rules prefixed with fragment can be called only from other lexer rules; they are not tokens in their own right.

In practice, they improve the readability of your grammars.

Look at this example:

STRING : '"' (ESC | ~["\\])* '"' ;
fragment ESC : '\\' (["\\/bfnrt] | UNICODE) ;
fragment UNICODE : 'u' HEX HEX HEX HEX ;
fragment HEX : [0-9a-fA-F] ;

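STRING is a lexer rule that uses the fragment rule ESC. UNICODE is used in the ESC rule, and HEX is used in the UNICODE fragment rule. ESC, UNICODE, and HEX never produce tokens on their own and cannot be referenced from parser rules; they can only be referenced from other lexer rules.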
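As a rough analogy outside ANTLR, here is a minimal Python sketch that treats the fragments as named regex pieces spliced into the single pattern that is actually matched. This is not how ANTLR is implemented; the `re`-based modelling and the helper name `is_string_token` are my own assumptions for illustration.

```python
import re

# Fragment-like pieces: plain pattern strings, never matched on their own.
HEX = r"[0-9a-fA-F]"
UNICODE = rf"u{HEX}{{4}}"                 # 'u' HEX HEX HEX HEX
ESC = rf'\\(?:["\\/bfnrt]|{UNICODE})'     # '\\' (["\\/bfnrt] | UNICODE)

# The only "token" pattern; the pieces above are inlined into it.
STRING = re.compile(rf'"(?:{ESC}|[^"\\])*"')


def is_string_token(text: str) -> bool:
    """Hypothetical helper: True if the whole text is one STRING token."""
    return STRING.fullmatch(text) is not None
```

With this sketch, `is_string_token(r'"a\u00e9b"')` is true, while a bare `\u00e9` with no surrounding quotes is rejected, because the escape pieces only exist as parts of STRING.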


A fragment is somewhat akin to an inline function: It makes the grammar more readable and easier to maintain.

A fragment is never counted as a token; it only serves to simplify a grammar.

Consider:

NUMBER: DIGITS | OCTAL_DIGITS | HEX_DIGITS;
fragment DIGITS: '1'..'9' '0'..'9'*;
fragment OCTAL_DIGITS: '0' '0'..'7'+;
fragment HEX_DIGITS: '0x' ('0'..'9' | 'a'..'f' | 'A'..'F')+;

In this example, the lexer always emits a NUMBER token, regardless of whether the input matched "1234", "0xab12", or "0777". The fragments DIGITS, OCTAL_DIGITS, and HEX_DIGITS never appear as token types.
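To make that concrete, here is a small Python sketch of the same idea, assuming we model the lexer with regular expressions: the three fragments are only building blocks, and NUMBER is the only entry in the table of token rules. The names `TOKEN_RULES` and `token_type` are hypothetical and not part of ANTLR.

```python
import re

# Fragment-like pieces mirroring DIGITS, OCTAL_DIGITS and HEX_DIGITS.
DIGITS = r"[1-9][0-9]*"
OCTAL_DIGITS = r"0[0-7]+"
HEX_DIGITS = r"0x[0-9a-fA-F]+"

# Only NUMBER is registered as a token rule; the fragments are not.
TOKEN_RULES = [
    ("NUMBER", re.compile(rf"(?:{DIGITS}|{OCTAL_DIGITS}|{HEX_DIGITS})")),
]


def token_type(text: str):
    """Hypothetical helper: name of the rule matching all of text, else None."""
    for name, pattern in TOKEN_RULES:
        if pattern.fullmatch(text):
            return name
    return None
```

Calling `token_type` on "1234", "0xab12", or "0777" gives "NUMBER" each time; there is no way to get "HEX_DIGITS" back, since it was never a token rule to begin with.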

See item 3
