Trying to split string into single words or "quoted words", and want to keep the quotes in the resulting array
You may use the following regular expression with split:
str = 'Presentation about "Test Driven Development"'
p str.split(/\s(?=(?:[^"]|"[^"]*")*$)/)
# => ["Presentation", "about", "\"Test Driven Development\""]
It splits on a whitespace character, but only if the text from that point until the end contains an even number of " characters. Be aware that this version will only work if all your strings are properly quoted.
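To see what "properly quoted" means in practice, here is a small sketch. The helper name split_outside_quotes is made up for this example; it just wraps the split call from above. With an unbalanced quote, spaces that still have the stray quote ahead of them are never treated as split points.

```ruby
# Hypothetical helper (not part of the original answer): wraps the split above.
def split_outside_quotes(str)
  # Split on whitespace only when the rest of the string has balanced double quotes.
  str.split(/\s(?=(?:[^"]|"[^"]*")*$)/)
end

# Balanced quotes: the quoted phrase stays together, quotes included.
p split_outside_quotes('Presentation about "Test Driven Development"')
# => ["Presentation", "about", "\"Test Driven Development\""]

# Unbalanced quote: the space before the stray quote is not a split point,
# so "say" stays glued to the following word.
p split_outside_quotes('say "hello world')
# => ["say \"hello", "world"]
```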
An alternative solution uses scan to read the parts of the string (besides spaces):
p str.scan(/(?:\w|"[^"]*")+/)
# => ["Presentation", "about", "\"Test Driven Development\""]
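One thing to watch with the scan version: \w only matches letters, digits and underscore, so punctuation outside quotes is silently dropped. A possible variation, assuming you want to keep any non-space, non-quote character, is sketched below. The helper name scan_tokens is invented for this example.

```ruby
# Hypothetical helper: same idea as the scan above, but with [^\s"]
# (any character that is neither whitespace nor a double quote) in place of \w.
def scan_tokens(str)
  str.scan(/(?:[^\s"]|"[^"]*")+/)
end

# The \w version loses the "++":
p 'C++ and "Test Driven Development"'.scan(/(?:\w|"[^"]*")+/)
# => ["C", "and", "\"Test Driven Development\""]

# The variation keeps it:
p scan_tokens('C++ and "Test Driven Development"')
# => ["C++", "and", "\"Test Driven Development\""]
```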
Just to extend the previous answer from Howard, you can add this method:
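class String
  # Splits on whitespace outside single or double quotes, drops empty
  # tokens, then strips surrounding spaces and quotes from each token.
  def tokenize
    split(/\s(?=(?:[^'"]|'[^']*'|"[^"]*")*$)/).
      select { |s| !s.empty? }.
      map { |s| s.gsub(/(^ +)|( +$)|(^["']+)|(["']+$)/, '') }
  end
end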
And the result:
> 'Presentation about "Test Driven Development" '.tokenize
=> ["Presentation", "about", "Test Driven Development"]
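If you would rather not reopen String, the same logic can live in a plain method. This is only a sketch, and the name tokenize_words is made up here. It also shows a side effect of treating single quotes as delimiters: an apostrophe counts as an unbalanced quote, so the spaces before it are not split.

```ruby
# Hypothetical standalone version of the tokenize method above.
def tokenize_words(str)
  str.split(/\s(?=(?:[^'"]|'[^']*'|"[^"]*")*$)/)
     .reject(&:empty?)
     .map { |s| s.gsub(/(^ +)|( +$)|(^["']+)|(["']+$)/, '') }
end

# Both quote styles are grouped and the surrounding quotes are stripped.
p tokenize_words(%q{say 'hello world' and "good bye"})
# => ["say", "hello world", "and", "good bye"]

# An apostrophe is seen as an unclosed single quote, so the space
# before "don't" is not a split point.
p tokenize_words("I don't know")
# => ["I don't", "know"]
```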