Recursive acronyms
Regex, .NET flavour, 62 bytes
(?i)(?<=^\w(?<c>\w)*)( \k<c>(?<-c>)\w+| (?<w>\w+))*$(?(c)(?!))
You can test it here. If the input is a recursive acronym, this will yield a match, and capturing group w
will contain all function words. If it isn't, then there will be no match.
This does preserve capitalisation of the function words (but matches case-insensitively).
Unfortunately, the tester doesn't display the entire stack of a named capturing group, but if you used it anywhere in .NET, the w
group would contain all function words in order.
Here is a C# snippet to prove that:
var pattern = @"(?i)(?<=^\w(?<c>\w)*)( \k<c>(?<-c>)\w+| (?<w>\w+))*$(?(c)(?!))";
var input = new string[] {
"RPM Package Manager",
"Wine is not an emulator",
"GNU is not Unix",
"Golf is not an acronym",
"X is a valid acronym"
};
var r = new Regex(pattern);
foreach (var str in input)
{
var m = r.Match(str);
Console.WriteLine(m.Success);
for (int i = 0; i < m.Groups["w"].Captures.Count; ++i)
Console.WriteLine(m.Groups["w"].Captures[i].Value);
}
Here is a quick explanation. I'm using .NET's balancing groups to build a stack of the acronym letters in the named group c
, with this snippet
^\w(?<c>\w)*
The trick is that I need the second letter on top of the stack and the last one at the bottom. So I put all of this in a lookbehind that matches the position after the acronym. This helps, because .NET matches lookbehinds from right to left, so it encounters the last letter first.
Once I got that stack, I match the rest of the string word for word. Either the word begins with the letter on top of the acronym stack. In that case I pop that letter from the stack:
\k<c>(?<-c>)\w+
Otherwise, I match the word anyway and push onto the w
stack which will then contain all function words:
(?<w>\w+)
At the end I make sure I reached the end of the string with $
and also make sure that I've used up all letters from the acronym, by checking that the stack is empty:
(?(c)(?!))
Test it on ideone.
Python (158, without regex)
It's not that I don't like regexes. It's that I don't know them.
def f(x):
s=x.lower().split();w=list(s[0][1:]);s=s[1:];o=[]
if not w:return 1,s
[w.pop(0)if i[0]==w[0]else o.append(i)for i in s]
return(0,)if w else(1,o)
Oh, I also had an ungolfed version:
def acronym(string):
scentence = string.lower().split()
word = scentence[0][1:]
scentence = scentence[1:]
over = []
if not word: return 1, scentence
for item in scentence:
if item[0] == word[0]:
word = word[1:]
else:
over.append(item)
if word:
return 0,
return 1,over
GolfScript, 51 50 chars
{32|}%" "/(1>\{.1<2$1<={;1>}{\}if}/{]!}{]`1" "@}if
It probably can be golfed further. Takes input on STDIN. The boolean is 0/1.
Test online
Explanation:
{32|}% # change everything to lower-case
" "/ # splits the string by spaces
(1> # takes the first word out and removes the first letter
\ # moves the list of remaining words in front of the acronym word
{ # for every word:
.1<2$1<= # compares the first letter of the word with
# the next unmatched letter of the acronym
{;1>} # if they are the same, discard the word and the now-matched letter
{\} # otherwise store the word in the stack
if # NB. if all letters have been matched, the comparison comes out as false
}/
{]!} # if there are still unmatched letters, return 0 (`!` non-empty list)
{]`1" "@} # otherwise, return 1, and display the list of function words
if