Handling errors in ANTLR4
Since I've had a little bit of a struggle with the two existing answers, I'd like to share the solution I ended up with.
First of all I created my own version of an ErrorListener like Sam Harwell suggested:
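    import org.antlr.v4.runtime.BaseErrorListener;
    import org.antlr.v4.runtime.RecognitionException;
    import org.antlr.v4.runtime.Recognizer;
    import org.antlr.v4.runtime.misc.ParseCancellationException;

    public class ThrowingErrorListener extends BaseErrorListener {
        public static final ThrowingErrorListener INSTANCE = new ThrowingErrorListener();

        // Turn every reported syntax error into an unchecked exception.
        @Override
        public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol, int line, int charPositionInLine, String msg, RecognitionException e)
                throws ParseCancellationException {
            throw new ParseCancellationException("line " + line + ":" + charPositionInLine + " " + msg);
        }
    }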
Note the use of a ParseCancellationException instead of a RecognitionException: the generated parser catches a RecognitionException and hands it to the DefaultErrorStrategy for recovery, so it would never reach your own code.
Creating a whole new ErrorStrategy like Brad Mace suggested is not necessary since the DefaultErrorStrategy produces pretty good error messages by default.
I then use the custom ErrorListener in my parsing function:
    public static String parse(String text) throws ParseCancellationException {
        // ANTLRInputStream is deprecated since ANTLR 4.7; CharStreams.fromString(text) replaces it there.
        MyLexer lexer = new MyLexer(new ANTLRInputStream(text));
        // Replace the default console listener with the throwing one.
        lexer.removeErrorListeners();
        lexer.addErrorListener(ThrowingErrorListener.INSTANCE);

        CommonTokenStream tokens = new CommonTokenStream(lexer);

        MyParser parser = new MyParser(tokens);
        parser.removeErrorListeners();
        parser.addErrorListener(ThrowingErrorListener.INSTANCE);

        // Parse starting from the expr rule, then walk the tree with the visitor.
        ParserRuleContext tree = parser.expr();
        MyParseRules extractor = new MyParseRules();

        return extractor.visit(tree);
    }
(For more information on what MyParseRules does, see here.)
This will give you the same error messages as would be printed to the console by default, only in the form of proper exceptions.
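If the calling code needs the line and column as data rather than as part of the message text, one option is to throw an exception type that carries them as fields. The following is only a sketch: SyntaxErrorException and its accessors are hypothetical names, and it extends RuntimeException so that the snippet compiles without the ANTLR runtime. In a real project it could just as well extend ParseCancellationException and be thrown from syntaxError() in place of the plain one.

```java
// Hypothetical exception type (not part of ANTLR) that keeps the error
// position as fields while preserving the "line L:C msg" message format.
class SyntaxErrorException extends RuntimeException {
    private final int line;
    private final int charPositionInLine;

    public SyntaxErrorException(int line, int charPositionInLine, String msg) {
        super("line " + line + ":" + charPositionInLine + " " + msg);
        this.line = line;
        this.charPositionInLine = charPositionInLine;
    }

    // Line number as reported to the error listener.
    public int getLine() {
        return line;
    }

    // Column within that line as reported to the error listener.
    public int getCharPositionInLine() {
        return charPositionInLine;
    }
}
```

Inside syntaxError() this would be thrown as new SyntaxErrorException(line, charPositionInLine, msg), and a caller could then read getLine() from the caught exception instead of parsing the message.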
When you use the DefaultErrorStrategy or the BailErrorStrategy, the ParserRuleContext.exception field is set for any parse tree node in the resulting parse tree where an error occurred. The documentation for this field reads (for people that don't want to click an extra link):
The exception which forced this rule to return. If the rule successfully completed, this is null.
Edit: If you use DefaultErrorStrategy, the parse context exception will not be propagated all the way out to the calling code, so you'll be able to examine the exception field directly. If you use BailErrorStrategy, the ParseCancellationException thrown by it will include a RecognitionException if you call getCause().
    // pce is the ParseCancellationException caught around the parse call.
    if (pce.getCause() instanceof RecognitionException) {
        RecognitionException re = (RecognitionException)pce.getCause();
        ParserRuleContext context = (ParserRuleContext)re.getCtx();
    }
Edit 2: Based on your other answer, it appears that you don't actually want an exception, but what you want is a different way to report the errors. In that case, you'll be more interested in the ANTLRErrorListener interface. You want to call parser.removeErrorListeners() to remove the default listener that writes to the console, and then call parser.addErrorListener(listener) for your own special listener. I often use the following listener as a starting point, as it includes the name of the source file with the messages.
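    import org.antlr.v4.runtime.BaseErrorListener;
    import org.antlr.v4.runtime.RecognitionException;
    import org.antlr.v4.runtime.Recognizer;

    public class DescriptiveErrorListener extends BaseErrorListener {
        public static final DescriptiveErrorListener INSTANCE = new DescriptiveErrorListener();

        // Not defined in the original snippet; assumed here to be a simple on/off switch.
        private static final boolean REPORT_SYNTAX_ERRORS = true;

        @Override
        public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                                int line, int charPositionInLine,
                                String msg, RecognitionException e)
        {
            if (!REPORT_SYNTAX_ERRORS) {
                return;
            }

            // Prefix the message with "source:line:column: " when the stream has a name.
            String sourceName = recognizer.getInputStream().getSourceName();
            if (!sourceName.isEmpty()) {
                sourceName = String.format("%s:%d:%d: ", sourceName, line, charPositionInLine);
            }

            System.err.println(sourceName+"line "+line+":"+charPositionInLine+" "+msg);
        }
    }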
With this class available, install it on both the lexer and the parser as follows.
    lexer.removeErrorListeners();
    lexer.addErrorListener(DescriptiveErrorListener.INSTANCE);
    parser.removeErrorListeners();
    parser.addErrorListener(DescriptiveErrorListener.INSTANCE);
A much more complicated example of an error listener that I use to identify ambiguities which render a grammar non-SLL is the SummarizingDiagnosticErrorListener class in TestPerformance.
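For the case described above, where errors should be reported somewhere other than the console, a listener can also simply record the messages so the caller can inspect them after parsing. A minimal sketch, assuming the ANTLR 4 runtime is on the classpath; CollectingErrorListener and its methods are hypothetical names, not part of ANTLR:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;

// Hypothetical listener that stores syntax errors instead of printing them.
class CollectingErrorListener extends BaseErrorListener {
    private final List<String> errors = new ArrayList<>();

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e) {
        // Same "line L:C msg" layout as ANTLR's default console output.
        errors.add("line " + line + ":" + charPositionInLine + " " + msg);
    }

    // Read-only view of everything reported so far, in order.
    public List<String> getErrors() {
        return Collections.unmodifiableList(errors);
    }

    public boolean hasErrors() {
        return !errors.isEmpty();
    }
}
```

Unlike the shared INSTANCE singletons, this listener holds state, so a fresh instance would be created for each parse, registered with addErrorListener() on both the lexer and the parser, and checked with hasErrors() once the start rule returns.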
What I've come up with so far is based on extending DefaultErrorStrategy and overriding its reportXXX methods (though it's entirely possible I'm making things more complicated than necessary):
    import org.antlr.v4.runtime.DefaultErrorStrategy;
    import org.antlr.v4.runtime.InputMismatchException;
    import org.antlr.v4.runtime.Parser;
    import org.antlr.v4.runtime.RecognitionException;
    import org.antlr.v4.runtime.Token;
    import org.antlr.v4.runtime.misc.IntervalSet;

    public class ExceptionErrorStrategy extends DefaultErrorStrategy {

        // Rethrow instead of attempting recovery.
        @Override
        public void recover(Parser recognizer, RecognitionException e) {
            throw e;
        }

        // Note: getTokenNames() is deprecated since ANTLR 4.5 in favor of getVocabulary().
        @Override
        public void reportInputMismatch(Parser recognizer, InputMismatchException e) throws RecognitionException {
            String msg = "mismatched input " + getTokenErrorDisplay(e.getOffendingToken());
            msg += " expecting one of "+e.getExpectedTokens().toString(recognizer.getTokenNames());
            RecognitionException ex = new RecognitionException(msg, recognizer, recognizer.getInputStream(), recognizer.getContext());
            ex.initCause(e);
            throw ex;
        }

        @Override
        public void reportMissingToken(Parser recognizer) {
            beginErrorCondition(recognizer);
            Token t = recognizer.getCurrentToken();
            IntervalSet expecting = getExpectedTokens(recognizer);
            String msg = "missing "+expecting.toString(recognizer.getTokenNames()) + " at " + getTokenErrorDisplay(t);
            throw new RecognitionException(msg, recognizer, recognizer.getInputStream(), recognizer.getContext());
        }
    }
This throws exceptions with useful messages, and the line and position of the problem can be obtained from either the offending token or, if that's not set, from the current token by calling ((Parser) re.getRecognizer()).getCurrentToken() on the RecognitionException.
I'm fairly happy with how this is working, though having six reportX methods to override makes me think there's a better way.