What is the best way to list the files in a very large directory in Python?
If you have a directory that is too big for libc's readdir() to read quickly, you probably want to look at the kernel system call getdents() (http://www.kernel.org/doc/man-pages/online/pages/man2/getdents.2.html). I ran into a similar problem and wrote a long blog post about it:
http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/
Basically, readdir() only reads 32 KB of directory entries at a time, so if you have a lot of files in a directory, readdir() will take a very long time to complete.
For Python 2.x, use the scandir package from PyPI:

    import scandir
    for root, dirs, files in scandir.walk(path):
        ...

For Python 3.5+, os.scandir() is built into the standard library:

    import os
    for entry in os.scandir(path):
        ...

See PEP 471 (https://www.python.org/dev/peps/pep-0471/) and https://pypi.python.org/pypi/scandir
I found this library useful: https://github.com/benhoyt/scandir.