How do I get the match data for all occurrences of a Ruby regular expression in a string?

You want

"abc12def34ghijklmno567pqrs".to_enum(:scan, /\d+/).map { Regexp.last_match }

which gives you

[#<MatchData "12">, #<MatchData "34">, #<MatchData "567">] 

The "trick" is to build an enumerator over scan: scan sets Regexp.last_match before each yield, so the map block can collect one MatchData object per occurrence.
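As a small sketch of what the collected MatchData objects give you (the variable names `matches` and `positions` are only illustrative, not part of the original answer), each element carries the match offsets:

```ruby
input = "abc12def34ghijklmno567pqrs"

# to_enum(:scan, ...) wraps String#scan in an Enumerator; on each iteration
# Regexp.last_match holds the MatchData for the current match.
matches = input.to_enum(:scan, /\d+/).map { Regexp.last_match }

# Each element is a full MatchData, so begin/end offsets are available.
positions = matches.map { |m| [m[0], m.begin(0), m.end(0)] }
p positions
# → [["12", 3, 5], ["34", 8, 10], ["567", 19, 22]]
```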


My current solution is to add an each_match method to Regexp:

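class Regexp
  # Yields a MatchData for each non-overlapping match in str.
  def each_match(str)
    start = 0
    while (matchdata = match(str, start))
      yield matchdata
      # Step past zero-width matches so patterns like /\d*/ cannot loop forever.
      start = matchdata.end(0) == matchdata.begin(0) ? matchdata.end(0) + 1 : matchdata.end(0)
    end
  end
end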

Now I can do:

numbers.each_match input do |match|
  puts "Found #{match[0]} at #{match.begin(0)} until #{match.end(0)}"
end

Tell me there is a better way.


I’ll put it here to make the code available via a search:

input = "abc12def34ghijklmno567pqrs"
numbers = /\d+/
input.gsub(numbers) { |m| p $~ }

The result is as requested:

⇒ #<MatchData "12">
⇒ #<MatchData "34">
⇒ #<MatchData "567">

See "input.gsub(numbers) { |m| p $~ } Matching data in Ruby for all occurrences in a string" for more information.
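Because gsub also builds and returns a new string that is thrown away here, one possible variation on the same idea is to pass a block to String#scan, which likewise sets `$~` for each match. This is only a sketch; `matches` is an illustrative name:

```ruby
input = "abc12def34ghijklmno567pqrs"
numbers = /\d+/

matches = []
# String#scan with a block sets $~ for each match, like gsub does,
# but it does not build a replacement string.
input.scan(numbers) { matches << $~ }

matches.each do |m|
  puts "Found #{m[0]} at #{m.begin(0)} until #{m.end(0)}"
end
```

Collecting into an array keeps the MatchData objects around after the iteration, which printing with p alone does not.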

Tags:

Ruby

Regex