Subroutines that take an optional block parameter

You cannot declare a subroutine with the same syntactic behaviour as sort. To check, try

prototype('CORE::sort')

which returns undef.


Yes.

Unfortunately it's a bit of a pain. You need to use the keyword API introduced in Perl 5.14. This means you need to implement it (and the custom parsing for it) in C and link it to Perl with XS.

Fortunately DOY wrote a great wrapper for the Perl keyword API, allowing you to implement keywords in pure Perl. No C, no XS! It's called Parse::Keyword.

Unfortunately this has major bugs dealing with closed over variables.

Fortunately they can be worked around using PadWalker.

Anyway, here's an example:

use v5.14;

BEGIN {
  package My::Print;
  use Exporter::Shiny qw( myprint );
  use Parse::Keyword { myprint => \&_parse_myprint };
  use PadWalker;
  
  # Here's the actual implementation of the myprint function.
  # When the caller includes a block, this will be the first
  # parameter. When they don't, we'll pass an explicit undef
  # in as the first parameter, to make sure it's nice and
  # unambiguous. This helps us distinguish between these two
  # cases:
  #
  #    myprint { BLOCK } @list_of_coderefs;
  #    myprint @list_of_coderefs;
  #
  sub myprint {
    my $block = shift;
    say for defined($block) ? map($block->($_), @_) : @_;
  }
  
  # This is a function to handle custom parsing for
  # myprint.
  #
  sub _parse_myprint {

    # There might be whitespace after the myprint
    # keyword, so read and discard that.
    #
    lex_read_space;
    
    # This variable will be undef if there is no
    # block, but we'll put a coderef in it if there
    # is a block.
    #
    my $block = undef;
    
    # If the next character is an opening brace...
    #
    if (lex_peek eq '{') {
      
      # ... then ask Parse::Keyword to parse a block.
      # (This includes parsing the opening and closing
      # braces.) parse_block will return a coderef,
      # which we will need to fix up (see later).
      #
      $block = _fixup(parse_block);
      
      # The closing brace may be followed by whitespace.
      #
      lex_read_space;
    }
    
    # After the optional block, there will be a list
    # of things. Parse that. parse_listexpr returns
    # a coderef, which when called will return the
    # actual list. Again, this needs a fix up.
    #
    my $listexpr = _fixup(parse_listexpr);
    
    # This is the stuff that we need to return for
    # Parse::Keyword.
    #
    return (
      
      # All of the above stuff happens at compile-time!
      # The following coderef gets called at run-time,
      # and gets called in list context. Whatever stuff
      # it returns will then get passed to the real
      # `myprint` function as @_.
      #
      sub { $block, $listexpr->() },
      
      # This false value is a signal to Parse::Keyword
      # to say that myprint is an expression, not a
      # full statement. If it was a full statement, then
      # it wouldn't need a semicolon at the end. (Just
      # like you don't need a semicolon after a `foreach`
      # block.)
      #
      !!0,
    );
  }
  
  # This is a workaround for a big bug in Parse::Keyword!
  # The coderefs it returns get bound to lexical
  # variables at compile-time. However, we need access
  # to the variables at run-time.
  #
  sub _fixup {
    
    # This is the coderef generated by Parse::Keyword.
    #
    my $coderef = shift;
    
    # Find out what variables it closed over. If it didn't
    # close over any variables, then it's fine as it is,
    # and we don't need to fix it.
    #
    my $closed_over = PadWalker::closed_over($coderef);
    return $coderef unless keys %$closed_over;
    
    # Otherwise we need to return a new coderef that
    # grabs its caller's lexical variables at run-time,
    # pumps them into the original coderef, and then
    # calls the original coderef.
    #
    return sub {
      my $caller_pad = PadWalker::peek_my(2);
      my %vars = map +($_ => $caller_pad->{$_}), keys %$closed_over;
      PadWalker::set_closed_over($coderef, \%vars);
      goto $coderef;
    };
  }
};

use My::Print qw( myprint );

my $start = "[";
my $end   = "]";

myprint "a", "b", "c";

myprint { $start . $_ . $end } "a", "b", "c";

This generates the following output:

a
b
c
[a]
[b]
[c]