How to select the fastest approach for large numerical data computations?
The main question is that there are many ways to perform the same operation, and I usually don't know which one is the most efficient.
Mathematica's performance is hard to predict, even more so than that of other high-level languages. There is no simple guideline you can follow. There will always be surprises and the behaviour will change from one version to the next.
Some insight into why Transpose is faster here:
On my machine (macOS, Mathematica 12.1), Timing reports the lowest numbers for Part, not for Transpose. However, RepeatedTiming (which is based on AbsoluteTiming) reports a lower number for Transpose.
In[16]:= test1[[All, 1]]; // Timing
Out[16]= {1.32521, Null}
In[17]:= test1[[All, 1]]; // RepeatedTiming
Out[17]= {1.41, Null}
In[18]:= First[Transpose[test1]]; // Timing
Out[18]= {2.08334, Null}
In[19]:= First[Transpose[test1]]; // RepeatedTiming
Out[19]= {0.80, Null}
Typically, this is an indication that some operations are done in parallel. Timing measures the CPU time summed over all cores, while AbsoluteTiming measures wall-clock time.
A quick look at the CPU monitor confirms that Part is indeed single-threaded (I see 100%) while Transpose is multi-threaded (I see ~250%).
This explains the difference.
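The same check can be done without a CPU monitor by comparing the two timers directly. The following is only a sketch: cpuToWallRatio is a hypothetical helper name, and data is a stand-in array rather than the test1 from the question. A ratio close to 1 suggests single-threaded evaluation, while a ratio clearly above 1 suggests that several cores were busy.

```mathematica
(* Hypothetical helper: ratio of CPU time (Timing) to wall time (AbsoluteTiming). *)
(* HoldFirst keeps the expression unevaluated until it is placed inside the timers. *)
SetAttributes[cpuToWallRatio, HoldFirst];
cpuToWallRatio[expr_] := Module[{cpu, wall},
  cpu = First@Timing[expr;];          (* CPU time in seconds *)
  wall = First@AbsoluteTiming[expr;]; (* wall-clock time in seconds *)
  cpu/wall
]

(* Stand-in data, assumed for illustration only *)
data = RandomReal[1, {10^6, 10}];

cpuToWallRatio[data[[All, 1]]]
cpuToWallRatio[First[Transpose[data]]]
```

The expression is evaluated twice, once per timer, so the ratio is noisy for very short computations. The actual numbers depend on the machine and the Mathematica version.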
Another observation: in Mathematica, combining two functions is sometimes faster than using a single one.
Jon McLoone's "10 Tips for Writing Fast Mathematica Code" suggests that using fewer functions speeds up code, but I don't think that holds in every case.
A simple test: using a function inside a Table to generate a list.
In[11]:= a1 = Table[Power[i, 2], {i, 10^7}]; // AbsoluteTiming
Out[11]= {0.238681, Null}
Using Range first, and then passing the result to the function:
In[12]:= a2 = Power[Range[10^7], 2]; // AbsoluteTiming
Out[12]= {0.0703124, Null}
Both results are packed arrays.
In[16]:= Developer`PackedArrayQ /@ {a1, a2}
Out[16]= {True, True}
Maybe Part and Table are the "big" functions, so they need to check something before running the computational code? And Range and Transpose are faster because they do just one simple thing with less overhead?
Conclusions
- Don't use Table[f[i], {i, iMax}]
- Use f[Range[iMax]] instead, when f threads over lists (as Power does)
Here is the performance comparison:
testTable[n_] := AbsoluteTiming[Table[Power[i, 2], {i, 10^n}];]
testRange[n_] := AbsoluteTiming[Power[Range[10^n], 2];]
nList = {4, 5, 6, 7, 8};
t1 = First@testTable[#] & /@ nList;
t2 = First@testRange[#] & /@ nList;
ListLinePlot[{Transpose[{nList, t1}], Transpose[{nList, t2}]},
PlotLegends -> {"Table", "Range"}, Mesh -> All]
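The f[Range[iMax]] form only works when f threads over lists on its own. Power does, because it has the Listable attribute. The sketch below shows one way to check this, and what happens otherwise. Here g is a hypothetical user-defined function, introduced purely for illustration.

```mathematica
(* Power has the Listable attribute, so Power[Range[n], 2] squares every element *)
MemberQ[Attributes[Power], Listable]

(* g is a hypothetical function that only accepts a single integer *)
g[x_Integer] := x^2 + 1

(* g is not Listable: the list does not match x_Integer, so this stays unevaluated *)
g[Range[5]]

(* Map applies g element by element instead *)
Map[g, Range[5]]
```

For a function like g, the choice is between Map (or Table) and rewriting the function so that it operates on whole lists.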