Are there guidelines for avoiding the unpacking of a packed array?
I will try to list some cases I can recall. The unpacking will happen when:
The result, or any intermediate step, is a ragged (irregular) array. For example:
Range /@ Range[4]
To avoid this, you can try to use regular structures, perhaps padding your arrays with zeros as appropriate.
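As a minimal sketch of the padding idea (the variable names `ragged` and `padded` are only illustrative), one can pad the ragged list to a rectangular shape and then pack the result explicitly. The explicit `Developer`ToPackedArray` call is there because `PadRight` is not guaranteed to return a packed array by itself:

```mathematica
(* A ragged list: the sub-lists have lengths 1, 2, 3, 4 *)
ragged = Range /@ Range[4];

(* Pad with zeros to a 4 x 4 matrix, then pack explicitly *)
padded = Developer`ToPackedArray[PadRight[ragged]];

Developer`PackedArrayQ /@ {ragged, padded}
(* {False, True} *)
```

Whether zero padding is acceptable depends on the computation: the padding value must not change the results you care about.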
The result (or any intermediate step) contains numbers of different types, such as a mix of integers and reals, or some of the elements are not of numeric types at all (symbols, expressions).
This usually happens by mistake for 1D lists. For multi-dimensional lists, there are several ways out. One is to convert all numbers to a single type (e.g. Reals to Integers or vice versa), when that is feasible. One such example is here.
Another way out is to store the parts of an array separately. For example, you have two arrays of the same length but different element types, which logically belong together (such as the result of a Tally operation on reals, as illustrated below). While our usual reaction would be to store them in transposed (and thus unpacked) form, one can also store them as {list1, list2}, which will be unpacked, but the parts list1 and list2 inside it will remain packed - just don't transpose it. One example of such treatment is here.
This trick can be generalized even to ragged arrays. In the already cited post, I used it to convert an imported ragged array to a more space-efficient form, with elements being packed arrays, with
packed = Join @@Map[Developer`ToPackedArray, list]
Some of the numbers don't fit into the numerical precision limits (for example, very big integers). This can be insidious, because this may be data-dependent and happen in the middle of a computation, and it may not be clear what is going on.
Here, you can try to predict in advance whether or not this is likely, but other than that, there is little that can be done, short of changing the algorithm.
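A small illustration of the data-dependent case, assuming a system with 64-bit machine integers (the names `small` and `big` are only illustrative). The same expression stays within machine-integer range for small exponents, but must leave it once the values exceed 2^63 - 1:

```mathematica
small = 2^Range[10];   (* all values fit into machine integers *)
big = 2^Range[70];     (* 2^63 and above do not fit *)

Developer`PackedArrayQ /@ {small, big}
(* the second entry is necessarily False *)
```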
The packed array is a part of an expression used with some rule-based code and subject to pattern-matching attempts. This will happen in cases when the match is not established before the pattern-matcher comes to the array. Here is an example:
Cases[f[g[Range[5]]], g[l_List] :> g[l^2], Infinity]
During evaluation of In[14]:= Developer`FromPackedArray::unpack: Unpacking array in call to f.
{g[{1, 4, 9, 16, 25}]}
while this does not unpack:
f[g[Range[5]]] /. g[l_List] :> g[l^2]
f[g[{1, 4, 9, 16, 25}]]
This happened because Cases searches depth-first (and therefore reaches elements before heads, and then must unpack), while ReplaceAll replaces from expressions to sub-expressions. I discussed this extensively here.
This situation is typical for pattern-matching - it will generally unpack. Note also that pattern-matching goes inside held expressions, and will unpack even there:
FreeQ[Hold[Evaluate@Range[10]], _Integer]
The only way I know to generally prevent it is to make sure that the pattern will either match or be rejected before the pattern-matcher comes to a given packed array. Note that there are certain exceptions, e.g. like this:
MatchQ[Range[10], {__Integer}]
In which case, there is no unpacking.
In certain cases, you will not see the unpacking message, but the result returned by a function may be packed or unpacked, depending on its type. Here is an example:
tst = RandomInteger[10,20]
{6,9,9,4,6,4,0,9,7,1,3,2,2,0,7,2,1,0,7,5}
ntst = N@tst; tally = Tally[tst]
{{6,2},{9,3},{4,2},{0,3},{7,3},{1,2},{3,1},{2,3},{5,1}}
ntally = Tally[ntst]
{{6.,2},{9.,3},{4.,2},{0.,3},{7.,3},{1.,2},{3.,1},{2.,3},{5.,1}}
Developer`PackedArrayQ/@{tally,ntally}
{True,False}
You can see that ntally was returned as an unpacked array, because it contains elements of different types, and there was no message to tell us about it, since indeed nothing was unpacked - the result is a new array.
As I mentioned already, one way out here is to separate the frequencies and the elements themselves, and store them separately packed.
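A minimal sketch of that separation, with the hypothetical names `elems` and `counts`: transpose the tally once, pack each of the two resulting lists, and keep them as a pair rather than transposing back.

```mathematica
data = N @ RandomInteger[10, 20];

(* Split the tally into distinct elements (reals) and their counts (integers),
   packing each list on its own *)
{elems, counts} = Developer`ToPackedArray /@ Transpose[Tally[data]];

Developer`PackedArrayQ /@ {elems, counts}
(* {True, True} *)
```

The pair {elems, counts} is itself an unpacked list, but its two parts are packed, which is what matters for memory use and speed.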
As elaborated by @Mr.Wizard, Apply leads to unpacking. This also applies to Apply at level 1 (@@@). The way out here is just not to use Apply - chances are that you can achieve your goal by other means, with packed lists.
Map will unpack on short lists, with lengths smaller than "SystemOptions"->"CompileOptions"->"MapCompileLength". This may come as a surprise, since we are used to the fact that Map does not unpack. For example, this unpacks:
Map[#^2 &, Range[10]]
The way out here would be to change the system options ("MapCompileLength") accordingly, to cover your case, or (perhaps even better) to manually pack the list with Developer`ToPackedArray after Map is finished. This often does not matter much for small lists, but sometimes it does.
Map will also unpack for any function which it cannot auto-compile:
ClearAll[f];
f[x_] := x^2;
Map[f, Range[1000]]
while this does not unpack:
Map[#^2 &, Range[1000]]
The solution here is to avoid using rule-based functions in such cases. Sometimes one can also, perhaps, go with some more exotic options, such as using something similar to a withGlobalFunctions macro from this answer (which expands certain rule-based functions at run-time).
Functions like Array and Table will produce unpacked arrays for functions or expressions which they cannot auto-compile. They will not produce any warnings. For example:
Array[f, {1000}] // Developer`PackedArrayQ
False
A similar situation holds for other functions which have special compile options.
In all these cases, the same advice applies: make your functions/expressions (auto)compilable, and/or change the system settings. Sometimes you can also manually pack the resulting list afterwards, as an alternative.
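The manual-packing alternative can be sketched as follows. The function name `sq` is hypothetical, and the reported threshold may differ between versions, so it is only inspected here and not changed:

```mathematica
(* Inspect the current auto-compilation threshold for Map *)
SystemOptions["CompileOptions" -> "MapCompileLength"]

(* A rule-based function, which Map cannot auto-compile *)
ClearAll[sq];
sq[x_] := x^2;

(* Pack the result by hand once Map is finished *)
result = Developer`ToPackedArray[Map[sq, Range[1000]]];
Developer`PackedArrayQ[result]
(* True *)
```

This does not avoid the unpacking during Map itself; it only restores a packed result for the computations that follow.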
While this reiterates one of the previous points, innocent-looking functions which combine packed arrays of different types will often unpack both:
Transpose[{Range[10], N@Range[10]}]
In cases like this, often (also as mentioned already) you can live with such lists as they are, without transposing them. Then, the sub-lists will remain packed.
When you use Save to save some symbol's definitions and Get to get them back, packed arrays will generally be unpacked during Save. This is not the case with DumpSave, which is highly recommended for that. Also, Compress does not unpack.
Import and Export will often not preserve packed arrays. The situation is particularly grave with Import, since it often takes huge memory (and time) to import some data which could be stored as a packed array but is not recognized as such.
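A quick way to check the Compress route on your own system is a round trip like the one below (the names `arr` and `restored` are only illustrative). If the last line ever returns False for your data, applying Developer`ToPackedArray to the restored expression is a reasonable fallback:

```mathematica
arr = Range[10^5];

(* Round trip through a compressed string *)
restored = Uncompress[Compress[arr]];

restored === arr
Developer`PackedArrayQ[restored]
```

The same check can be applied to the result of Import, where an explicit Developer`ToPackedArray call after importing is often worthwhile.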
There are probably many more cases. I intend to add to this list once I recall some more, and invite others to contribute. One characteristic feature of unpacking is, however, general: whenever a final result or some intermediate expression cannot be represented as a regular array (tensor) of the same basic type (Integer, Real or Complex), most of the time unpacking will happen.
I'll be the first to simply mention that you can use On["Packing"] and then observe any unpacking that occurs in the course of evaluation. I'm not sure there is a more systematic way to approach this, other than to compile a list of functions that do or do not preserve packing.
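In practice it is convenient to switch the messages on only around the code under inspection, so they do not clutter the rest of the session. A minimal sketch:

```mathematica
On["Packing"];      (* enable the unpacking messages *)

a = Range[30];
Total[a];           (* operates on the packed array directly *)
Plus @@ a;          (* Apply has to unpack, so a message is expected here *)

Off["Packing"];     (* restore the default, quiet behavior *)
```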
There are always a few surprises for me, such as PadRight on a ragged array composed of packed vectors not returning a packed array, but I don't think there is any way to guess that beforehand without prior knowledge of the implementation.
I guess one guideline is that functions that operate on a list or array directly often handle packed arrays, while Mapping or Applying usually do not.
For example:
a = Range@30;
Tr[a]
Total[a]
Plus @@ a
Tr and Total do not unpack, but Plus @@ does. Likewise:
a = RandomInteger[99, {5, 5}]
Total[a, {2}]
Plus @@@ a
Tr /@ a
Plus @@ # & /@ a
Total does not unpack; Plus @@@ and Tr /@ unpack one level; the last line fully unpacks.
This is not really a guideline, but it is so basic/common I felt it deserved to be mentioned.
Be careful when using Set with Part:
<< Developer`
packed4 = packed3 = packed2 = packed = ConstantArray[0, 100];
packed[[{1, 2, 3, 4, 5}]] = {6, 7, 8, 9, 3};
packed2[[1 ;; 5]] = {6, 7, 8, 9, 3};
packed3[[1]] = 1;
packed4[[{1, 2, 3, 4, 5}]] = {6, 7, 8, 9, 3} // ToPackedArray;
PackedArrayQ /@ {packed, packed2, packed3, packed4}
{False, False, True, True}