Reducing the time to calculate Collatz sequences

The expert on large-scale Collatz computations is Tomás Oliveira e Silva, who has a website with much information. Also worth a look is Eric Roosendaal's website.


For an interval $1..b$, all number congruent 2, 4, 5, or 8 modulo 9 can be ignored because of these productions staring with a smaller number:

$$6z + 1 \rightarrow 18z + 4 \rightarrow 9z + 2$$ $$6z + 3 \rightarrow 18z + 10 \rightarrow 9z + 5$$ $$6z + 5 \rightarrow 18z + 16 \rightarrow 9z + 8$$

$$8z + 3 \rightarrow 24z + 10 \rightarrow 12z + 5 \rightarrow 36z + 16 \rightarrow 18z + 8 \rightarrow 9z + 4$$

Similarly, all numbers congruent 5 modulo 8 can be ignored as these two paths coalesce after three steps:

$$8z + 5 \rightarrow 24z + 16 \rightarrow 12z + 8 \rightarrow 6z + 4$$ $$8z + 4 \rightarrow 4z + 2 \rightarrow 2z + 1 \rightarrow 6z + 4$$

This way we reduce the set of numbers to be checked by a factor of $\frac59\cdot\frac78 = \frac{35}{72}$.

Also the lower half of the interval can be skipped (as even numbers will "fall" into them), again saving factor 2. So a reduction to about one quarter can be gained(*).

But the more important speedup can be obtained by doing multiple steps at once as described in the Wikipedia. With a small table of $2^{17}$ elements, 17 steps at once can be done, leading to a speed up of maybe 10 (the steps are a bit more complicated than a direct computation). AFAIK this is the biggest table fitting into cache.

When searching for the maximizer, the number of yet needed steps to overreach the current maximum can be used for a lookup in a a table derived from this one to determine if the current value needs to be expanded further. This gives an additional factor 2 for the interval $1..10^{10}$.

This fantastic link gives a lot of more complicated optimization tips allowing to skip checking most of remaining numbers.


(*) For a general interval $a..b$, this applies only partially.


I'm not used to this way of looking at the collatz-problem and seem to be unable to really open my mind for this at the moment. But possibly the following observations help anyway.
If we look at the inverse operation, for odd $\small a_k $ define $ \small a_{k+1,A} = (2^A*a_k -1)/3 $ with all A that keep $ a_{k+1,A} $ odd and integer and beginning at $\small a=1$ then this forms a tree which should contain all odd positive integers if the collatz-conjecture is true.

Then we see, that with some $ \small a_k $ also all $ \small 4 a_k + 1, 4(4 a_k+1)+1, ... ,{4^j a_k -1 \over 4-1 }, \ldots $ are in that tree. Similarly, this can be expressed using 64 instead of 4 and so on. See the graphic at http://go.helms-net.de/math/collatz/aboutloop/collatzgraphs.htm where I collected some types of tree-representations and particularly at the one which is expressed in the base-4-numbersystem.

I don't know immediately, whether such a representation helps when the focus is on the consideration of an interval as is in the current question, but maybe I can come back to that problem later again with a fresh idea...

Tags:

Algorithms