Does anyone know whether NVIDIA's GPUs are big- or little-endian?
See: https://devtalk.nvidia.com/default/topic/366773/cuda-programming-and-performance/endian-mode-of-the-device/post/2630674/#2630674
All of the supported CUDA platforms use little-endian CPUs, and cudaMemcpy() copies data structures to the device without knowing the data format, so I would assume the GPU is also little-endian. The GPU might support both big- and little-endian execution (some CPUs do), as a hedge against future CUDA platforms being big-endian.
My guess is the answer has to be either "little-endian" or "both".
Per the Hardware Implementation section of the CUDA C Programming Guide, the GPU is little-endian.
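If you want to see the byte order directly rather than infer it, here is a minimal C sketch of the usual probe. It only inspects the host; making the same comparison inside a kernel would be the equivalent device-side test, which is not shown here. The helper names `is_little_endian` and `load_le32` are made up for illustration and are not part of any CUDA API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if this machine stores the least
   significant byte of a 32-bit integer at the lowest address. */
int is_little_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    /* Raw byte copy: the same kind of format-blind copy cudaMemcpy() does. */
    memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x04;
}

/* Hypothetical helper: rebuilds a 32-bit value from four bytes stored
   least-significant-first, independent of the machine's own byte order. */
uint32_t load_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

On a little-endian host, `is_little_endian()` returns 1, and a `uint32_t` filled by `memcpy` from the bytes `04 03 02 01` equals what `load_le32()` returns for those same bytes. That agreement is why a plain byte copy between two little-endian sides needs no conversion step.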