What's the meta-object rule for naming grammar rules
Note also that a FALLBACK token in a grammar performs a similar function to the FALLBACK method in classes. It is invoked with the token name when an unknown token is encountered in a grammar.
Changing your example a bit:
grammar g {
token TOP { <blah> };
token FALLBACK($name) { {note "$name called" } 'defined' }
};
say g.parse('defined')
Produces
blah called
「defined」
blah => 「defined」
This is almost entirely about multiple awkward bugs.
item etc.
See RT#127945 -- Mu methods cannot be used as grammar tokens due to default Actions class. Also token name conflict with internal name ?. Unfortunately this isn't easy to fix.
An explanation of this bug and its impact follows.
Per the Actions mechanism, if a grammar rule matches, the .parse call immediately tries to call a correspondingly named action method.
If you don't explicitly pass an actions class/object to the .parse method then it uses the default, which is Mu. Then, when a rule in your grammar matches, it looks for a Mu method with the same name. If it doesn't find one, all is well. But if it finds one then it calls that method on Mu with the current Match object as the first and only argument. In almost all cases that'll go badly. item is an example of this.
If you do tell the .parse method to use a particular actions class/object, another wrinkle arises:
grammar g { rule all { all } };
class actions { }
g.parse: 'all',
rule => 'all',
actions => actions,
This yields a similar error to item, except this time the all method comes from Any. This is because the actions class's MRO includes Any:
say class actions { }.^mro ; # ((actions) (Any) (Mu))
You can eliminate this wrinkle by declaring your actions classes with is Mu:
grammar g { rule all { all } };
class actions is Mu { }
g.parse: 'all',
rule => 'all',
actions => actions,
This works fine because now the actions only inherit from Mu -- and Mu doesn't have an all method.
It would be great if you could inherit from nothing, but you can't; is Mu is as minimal as you can get.
What can we conclude about this first bug?
Because newer versions of Perl 6 and/or Rakudo may ship with new Mu methods, the safest thing to do to defend against this bug is to always declare an actions class and always declare a method corresponding to every single rule in your grammar. If you do this you don't need to follow any naming rules to avoid this bug.
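As a minimal sketch of that defensive approach, reusing the g grammar and actions class names from the examples above: the actions class declares its own all method, so method lookup finds it before the all inherited from Any. The method body (make ~$/) is only an illustrative choice, not something the bug reports prescribe.

```raku
grammar g { rule all { all } }

class actions {
    # One explicit action method per grammar rule. Because it is declared
    # on the class itself, it is found before the inherited Any.all.
    # Storing the matched text with `make` is just an illustrative body.
    method all($/) { make ~$/ }
}

# .made returns whatever the action method attached with `make`.
say g.parse('all', rule => 'all', actions => actions).made;
```

With a method declared for every rule, it no longer matters whether the actions class inherits from Any or only from Mu.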
TWEAK etc.
I will file an RT bug about this if I can't find an existing one.
Golfed:
grammar g { rule TWEAK {} }
This blows up at compile-time (immediately after parsing the closing curly brace of the grammar declaration). So this is definitely not the same bug as the item bug -- because the latter is due to the run-time Actions mechanism that only kicks in after a rule matches.
This does not blow up:
grammar g { method TWEAK {} }
Perhaps, as part of creating/finalizing a grammar package, some code introspects and/or manipulates any TWEAK "method" found in the new grammar package in a way that works fine if it's an ordinary method but blows up if it's not.
However, other submethods like FALLBACK have no problem at all
TWEAK and BUILD methods or submethods in a class are part of standard object construction. They have a very different role to play than FALLBACK (which is called if a method is missing).
What can we conclude about this second bug?
There's clearly something very specific going on with TWEAK and BUILD and they may well be the only two rule names with the problem they exhibit. So just avoid those two names and you'll hopefully be clear of this bug.
Accidentally using built-in rule names
See RT#125518 -- Grammar 'ident' override behaviour.
You can override built-in rules by just specifying your own version.
As dwarring notes, "It certainly causes confusion if you accidentally declare [a rule] with the same name as a built-in rule."
So the key question is, what's the definitive source for knowing built-in rules and how might one manage things given that they may change over time?
(Yes, very vague, I know. Also, I think Perl 6's built-ins must necessarily extend NQP's and that seems likely to be relevant. Also, there are multiple slangs in each overall language and perhaps that's relevant. I plan to discuss this issue more fully in a later edit.)
Other relevant bugs
See also Moritz' answer.
The rule seems to be "if the grammar engine itself calls a method, you cannot redefine it as a regex/token".
Sadly, there is no documentation about this, and most likely it is very implementation dependent.