Detecting a repeating cycle in a sequence of numbers (python)

I may not be properly understanding this, but I think there is a very simple solution with regex.

(.+ .+)( \1)+

Here is an example:

>>> regex = re.compile(r'(.+ .+)( \1)+')
>>> match = regex.search('3 0 5 5 1 5 1 6 8')
>>> match.group(0)    # entire match
'5 1 5 1'
>>> match.group(1)    # repeating portion
'5 1'
>>> match.start()     # start index of repeating portion
6

>>> match = regex.search('2 0 6 3 1 6 3 1 6 3 1')
>>> match.group(1)
'6 3 1'

Here is how it works, (.+ .+) will match at least two numbers (as many as possible) and place the result into capture group 1. ( \1)+ will match a space followed by the contents of capture group 1, at least once.

And an extended explanation for the string '3 0 5 5 1 5 1 6 8':

  • (.+ .+) will originally match the entire string, but will give up characters off the end because ( \1)+ will fail, this backtracking will occur until (.+ .+) cannot match at the beginning of the string at which point the regex engine will move forward in the string and try again
  • This will happen until the capture group starts at the second 5, it will give up characters at the end until '5 1' is captured, at which point the regex is looking for any number of ' 5 1' for ( \1)+, it will of course find this and the match will succeed

Your question is really "do all items from x:x+k match items from y:y+k". That is, does a k-length subset occur twice in the line?

And you want x:x+k non-overlapping with y:y+k. The easy way to do this is to define y as x plus some offset, d. If you assure that k <= d < len(line)-x-k, then you're always looking within the boundaries of the line.

You'll then vary k from 1 to len(line)//2, looking for various length duplicates at a given offset from each other.

The offset from x to y, d, will vary between 1 and len(line)-x-k.

The starting position for x, similarly will vary from 0 to len(line)//2.

So, the "all" part is something like this: all( line[i] == line[i+d] for i in range(x,x+k) ) for various legal values of d, x and k.