Count the number of Hamming distance sequences
Haskell, score 9
import Data.Bits
import Data.List
import qualified Data.IntSet as S
main = mapM_ (print . S.size . S.fromList . hd) [1..9]
hd :: Int -> [Int]
hd n = [foldl' (\s e->s*m+e) 0 [popCount $ xor p (shiftR t i .&. m)|i<-[(0::Int)..n-1]] | let m=2^n-1,p<-[(0::Int)..2^n-1],t<-[(0::Int)..2^(2*n-1)-1]]
Compile with -O3
. It takes 6min35s on my 6 year old laptop hardware to run up to n=9
, so maybe it's under 5min on the reference hardware.
> time ./113785
2
9
48
297
2040
15425
125232
1070553
9530752
real 6m35.763s
user 6m27.690s
sys 0m5.025s
JavaScript, score 10
findHamming = m => {
if (m < 2) return 2;
let popcnt = x => x && (x & 1) + popcnt(x >> 1);
let a = [...Array(1 << m)].map((_,i) => popcnt(i));
let t = 0;
let n = (1 << m) - 1;
for (let c = 0; c <= m; c++) {
for (let g = 0; g <= c; g++) {
let s = new Set;
for (let i = 1 << m; i--; ) {
for (let j = 1 << (m - 1); j--; ) {
if (a[i ^ j + j] != c) continue;
for (let k = 1 << m - 1; k--; ) {
if (a[i ^ k] != g) continue;
let f = j << m | k;
let h = 0;
for (l = m - 1; --l; ) h = h * (m + 1) + a[i ^ f >> l & n];
s.add(h);
}
}
}
t += s.size * (g < c ? 2 : 1);
}
}
return t;
};
let d = Date.now(); for (let m = 1; m < 11; m++) console.log(m, findHamming(m), Date.now() - d);
Explanation: Calculating n=10
is difficult because there are over two billion pairs and over 26 billion potential sequences. In order to speed things up I split the calculation up into 121 bins. Because the sequences are invariant under bitwise complement, I can assume without loss of generality that the middle bit of T
is zero. This means that I can determine the first and last elements of the sequence independently from the the top n-1
and bottom n-1
bits of T
. Each bin corresponds to a different pair of first and last elements; I then loop through all the possible sets of top and bottom bits that correspond to each bin and calculate the remaining elements of the sequence, finally counting the unique sequences for each bin. It then remains to total all 121 bins. Originally taking 45 hours, this now completed in a little under three and a half minutes on my AMD FX-8120. Edit: Thanks to @ChristianSievers for a 50% speedup. Full output:
1 2 0
2 9 1
3 48 1
4 297 2
5 2040 7
6 15425 45
7 125232 391
8 1070553 1844
9 9530752 15364
10 86526701 153699
C++, score 10 11
This is a translation of @Neil's answer into C++, with some simple parallelization. n=9
completes in 0.4 seconds, n=10
in 4.5 seconds, and n=11
in approximately 1 minute on my 2015 Macbook Pro. Also, thanks to @ChristianSievers. Due to his comments on @Neil's answer, I noticed some additional symmetries. From the original 121 buckets (for n=10
), to 66 buckets when accounting for reversal, I've gotten down to just 21 buckets.
#include <iostream>
#include <cstdint>
#include <unordered_set>
#include <thread>
#include <future>
#include <vector>
using namespace std;
constexpr uint32_t popcnt( uint32_t v ) {
uint32_t c = v - ( ( v >> 1 ) & 0x55555555 );
c = ( ( c >> 2 ) & 0x33333333 ) + ( c & 0x33333333 );
c = ( ( c >> 4 ) + c ) & 0x0F0F0F0F;
c = ( ( c >> 8 ) + c ) & 0x00FF00FF;
c = ( ( c >> 16 ) + c ) & 0x0000FFFF;
return c;
}
template<uint32_t N>
struct A {
constexpr A() : arr() {
for( auto i = 0; i != N; ++i ) {
arr[i] = popcnt( i );
}
}
uint8_t arr[N];
};
uint32_t n = ( 1 << M ) - 1;
constexpr auto a = A < 1 << M > ();
uint32_t work( uint32_t c, uint32_t g, uint32_t mult ) {
unordered_set<uint64_t> s;
// Empirically derived "optimal" value
s.reserve( static_cast<uint32_t>( pow( 5, M ) ) );
for( int i = ( 1 << M ) - 1; i >= 0; i-- ) {
for( uint32_t j = 1 << ( M - 1 ); j--; ) {
if( a.arr[i ^ j + j] != c ) {
continue;
}
for( uint32_t k = 1 << ( M - 1 ); k--; ) {
if( a.arr[i ^ k] != g ) {
continue;
}
uint64_t f = j << M | k;
uint64_t h = 0;
for( uint32_t l = M - 1; --l; ) {
h = h * ( M + 1 ) + a.arr[i ^ ( f >> l & n )];
}
s.insert( h );
}
}
}
return s.size() * mult;
}
int main() {
auto t1 = std::chrono::high_resolution_clock::now();
if( M == 1 ) {
auto t2 = std::chrono::high_resolution_clock::now();
auto seconds = chrono::duration_cast<chrono::milliseconds>( t2 - t1 ).count() / 1000.0;
cout << M << ": " << 2 << ", " << seconds << endl;
return 0;
}
uint64_t t = 0;
vector<future<uint32_t>> my_vector;
if( ( M & 1 ) == 0 ) {
for( uint32_t c = 0; c <= M / 2; ++c ) {
for( uint32_t g = c; g <= M / 2; ++g ) {
uint32_t mult = 8;
if( c == M / 2 && g == M / 2 ) {
mult = 1;
} else if( g == c || g == M / 2 ) {
mult = 4;
}
my_vector.push_back( async( work, c, g, mult ) );
}
}
for( auto && f : my_vector ) {
t += f.get();
}
} else {
for( uint32_t c = 0; c <= ( M - 1 ) / 2; ++c ) {
for( uint32_t g = c; g <= M - c; ++g ) {
uint32_t mult = 4;
if( g == c || g + c == M ) {
mult = 2;
}
my_vector.push_back( async( work, c, g, mult ) );
}
}
for( auto && f : my_vector ) {
t += f.get();
}
}
auto t2 = std::chrono::high_resolution_clock::now();
auto seconds = chrono::duration_cast<chrono::milliseconds>( t2 - t1 ).count() / 1000.0;
cout << M << ": " << t << ", " << seconds << endl;
return 0;
}
Use the following script to execute the code:
#!/usr/bin/env bash
for i in {1..10}
do
clang++ -std=c++14 -march=native -mtune=native -Ofast -fno-exceptions -DM=$i hamming3.cpp -o hamming
./hamming
done
The output was as follows: (The format is M: result, seconds
)
1: 2, 0
2: 9, 0
3: 48, 0
4: 297, 0
5: 2040, 0
6: 15425, 0.001
7: 125232, 0.004
8: 1070553, 0.029
9: 9530752, 0.419
10: 86526701, 4.459
11: 800164636, 58.865
n=12
took 42 minutes to calculate on a single thread, and gave a result of 7368225813.