Exchanging numpy arrays between Python and Mathematica?
You can use a raw binary file to speed up the transfer:
Python side:
import numpy as np
array = np.random.rand(100000000)
array.astype('float32').tofile('np.dat')
Mathematica side:
data = BinaryReadList["np.dat", "Real32"]; // AbsoluteTiming
(* {2.56679, Null} *)
data // Dimensions
(* {100000000} *)
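One caveat with this approach: tofile writes only the raw element bytes, so the shape, dtype, and byte order have to be agreed on by both sides. The following is a minimal Python sketch of that round trip, not part of the original answer; the small array, the temporary path, and the explicit little-endian "<f4" dtype are assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np

# Small array so the sketch runs quickly; the answer above uses 10**8 elements.
original = np.arange(12, dtype=np.float64).reshape(3, 4)

# Hypothetical scratch location for the raw file.
path = os.path.join(tempfile.mkdtemp(), "np.dat")

# tofile writes the elements in C order with no header:
# no shape, no dtype, no byte-order marker.
original.astype("<f4").tofile(path)

# 12 elements of 4 bytes each, and nothing else.
file_size = os.path.getsize(path)

# Reading back therefore needs the dtype and shape supplied by hand.
restored = np.fromfile(path, dtype="<f4").reshape(original.shape)
```

On the Mathematica side the same applies: BinaryReadList returns a flat list, so a multidimensional array needs an ArrayReshape with dimensions you carry over separately.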
Here is a template for reading a NumPy binary ".npy" file, created simply with
numpy.save(filename, array)
This file format encodes the array structure as a Python dict literal, stored as a string that we need to parse:
{'descr': '<f8', 'fortran_order': False, 'shape': (3, 4, 5), }
The byte order is also encoded, so the format is portable across hardware.
Almost all of the code below parses the small header; the actual data is read with a single BinaryReadList call, so it should be very fast.
getnpy[file_] :=
 Module[{a, f = OpenRead[file, BinaryFormat -> True], version,
   headerlen, header, dims, type, typ, byto},
  a = If[
    (* magic string: byte 147 (0x93) followed by "NUMPY" *)
    BinaryRead[f, "Byte"] == 147 &&
     BinaryReadList[f, "Character8", 5] == Characters["NUMPY"],
    version = BinaryReadList[f, "Byte", 2];
    (* header length is unsigned little-endian: 2 bytes in format
       version 1.x, 4 bytes in version 2.0 and later *)
    headerlen = BinaryRead[f,
      If[First[version] == 1, "UnsignedInteger16", "UnsignedInteger32"],
      ByteOrdering -> -1];
    header = StringJoin@BinaryReadList[f, "Character8", headerlen];
    (* shape tuple such as (3, 4, 5) or (100,), converted to a list *)
    dims = StringCases[header,
       "'shape':" ~~ Whitespace ~~ "(" ~~
         s : {NumberString, ",", Whitespace} .. ~~ ")" :>
        ToExpression[
         "{" ~~
          If[StringTake[s, -1] == ",", StringDrop[s, -1], s] ~~
          "}"]][[1]];
    (* dtype string such as "<f8" *)
    type = StringCases[header,
       "'descr':" ~~ Whitespace ~~
         Shortest["'" ~~ s : _ ... ~~ "'"] :> s][[1]];
    (* the leading character of the dtype string gives the byte order *)
    byto = Switch[StringTake[type, 1], "<", -1, ">", 1, _, $ByteOrdering];
    If[MemberQ[{"<", ">", "|", "="}, StringTake[type, 1]],
     type = StringDrop[type, 1]];
    typ = Switch[type,
      "f8", "Real64",
      "i8", "Integer64",
      _, Print["unknown type: ", header]; 0];
    If[typ =!= 0,
     ArrayReshape[BinaryReadList[f, typ, ByteOrdering -> byto], dims],
     0],
    Print["not an npy file"]; 0];
  Close[f]; a];
getnpy["test.npy"]
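To produce a test file and see the fields that getnpy walks through, here is a rough Python sketch that writes an array with numpy.save and then decodes the header by hand. It is only an illustration: the temporary path and the variable names are made up, and in practice numpy.load does all of this for you.

```python
import ast
import os
import struct
import tempfile

import numpy as np

# Hypothetical scratch location for the test file.
path = os.path.join(tempfile.mkdtemp(), "test.npy")
np.save(path, np.arange(60, dtype="<f8").reshape(3, 4, 5))

with open(path, "rb") as f:
    magic = f.read(6)            # b"\x93NUMPY"; 0x93 is the 147 tested in getnpy
    major, minor = f.read(2)     # format version, one byte each
    if major == 1:
        # version 1.x: 2-byte little-endian unsigned header length
        (header_len,) = struct.unpack("<H", f.read(2))
    else:
        # version 2.0 and later: 4-byte little-endian unsigned header length
        (header_len,) = struct.unpack("<I", f.read(4))
    # The header is a Python dict literal padded with spaces and a newline.
    header = ast.literal_eval(f.read(header_len).decode("latin1"))
    data_offset = f.tell()       # the raw array bytes start here
    payload = np.frombuffer(f.read(), dtype=header["descr"])

array = payload.reshape(header["shape"])
```

Everything before data_offset is header; the rest is the flat array data that the single BinaryReadList call picks up.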
Note that I only included a couple of the types you might encounter in the Switch. See the documentation for BinaryRead if you need to add others. Also, I did not implement the 'fortran_order' key; the code just assumes the default, False.
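If you do run into a file with 'fortran_order': True, the data on disk is in column-major order and a plain reshape scrambles it. The Python sketch below shows the effect and one way to undo it (reshape with the dimensions reversed, then transpose). It is an illustration under assumptions, not a tested extension of getnpy; the tiny 2 x 3 array and the in-memory buffer are chosen only to keep it short.

```python
import io

import numpy as np

c_array = np.arange(6, dtype="<i8").reshape(2, 3)
f_array = np.asfortranarray(c_array)   # same values, column-major memory layout

# Save to an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
np.save(buf, f_array)
raw = buf.getvalue()

# The header records that the payload is in Fortran order.
has_fortran_flag = b"'fortran_order': True" in raw

# The payload is the last nbytes bytes of the file.
flat = np.frombuffer(raw[-f_array.nbytes:], dtype="<i8")

# A plain row-major reshape, which is what getnpy does, gives the wrong layout.
wrong = flat.reshape(2, 3)

# Reshaping with reversed dimensions and transposing restores the array.
right = flat.reshape((3, 2)).T
```

In Mathematica the analogous step would presumably be something like Transpose[ArrayReshape[data, Reverse[dims]], Reverse[Range[Length[dims]]]], but I have not tried that against real files.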