How do I traverse an HDF5 file using h5py?

This is a pretty old thread, but I found a solution to basically replicating the h5ls command in Python:

import h5py

class H5ls:
    def __init__(self):
        # Store an empty list for dataset names
        self.names = []

    def __call__(self, name, h5obj):
        # Datasets have a dtype attribute and groups do not, so we can search on this
        if hasattr(h5obj, 'dtype') and name not in self.names:
            self.names.append(name)
        # No return value: visititems stops early if the callback returns anything but None

if __name__ == "__main__":
    filename = 'myfile.h5'  # placeholder; replace with the path to your file
    h5ls = H5ls()
    with h5py.File(filename, 'r') as df:
        # Visit every object inside the HDF5 file and store dataset names in h5ls.names
        df.visititems(h5ls)

This code iterates through the whole HDF5 file and stores the names of all datasets in h5ls.names. Hope this helps!
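
The comment about not returning anything matters because visititems() stops as soon as the callback returns something other than None, and hands that value back to the caller. Here is a minimal sketch that uses this to find the first dataset of a given dtype kind. The helper name find_first_dataset and its dtype_kind parameter are my own for illustration, not part of h5py.

```python
import h5py

def find_first_dataset(group, dtype_kind):
    """Return the path of the first dataset whose dtype kind matches, or None."""
    def visitor(name, obj):
        if isinstance(obj, h5py.Dataset) and obj.dtype.kind == dtype_kind:
            return name  # any non-None return value stops the traversal
        return None      # None tells visititems to keep going
    # visititems returns whatever the callback returned last (None if never stopped)
    return group.visititems(visitor)
```

Called as find_first_dataset(f, 'f') on an open file f, it should return the relative path of the first floating-point dataset it meets, without visiting the rest of the file.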


Well this is kind of an old thread but I thought I'd contribute anyway. This is what I did in a similar situation. For a data structure set up like this:

[group1]
    [group2]
        dataset1
        dataset2
    [group3]
        dataset3
        dataset4

I used:

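import h5py

datalist = []

def returnname(name):
    # Return the first matching name not collected yet; None keeps visit() going
    if 'dataset' in name and name not in datalist:
        return name
    return None

# 'myfile.h5' is a placeholder path
with h5py.File('myfile.h5', 'r') as f:
    while True:
        # visit() stops at the first non-None return value, so each pass finds one new name
        name = f['group1'].visit(returnname)
        if name is None:
            break
        datalist.append(name)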

I haven't found an h5py equivalent for os.walk.
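
That said, something close to os.walk is short to write on top of Group.items(). This is only a sketch: h5walk is a name I made up, it is not an h5py function, and it does not guard against hard-link cycles, so it assumes a plain tree like the one above.

```python
import h5py

def h5walk(group):
    """Yield (path, subgroup_names, dataset_names) top-down, in the spirit of os.walk."""
    subgroups, datasets = [], []
    for key, item in group.items():
        if isinstance(item, h5py.Group):
            subgroups.append(key)
        elif isinstance(item, h5py.Dataset):
            datasets.append(key)
        # anything else (e.g. a named datatype or a broken link) is skipped
    yield group.name, subgroups, datasets
    for key in subgroups:
        # recurse into each subgroup, like os.walk descending into directories
        yield from h5walk(group[key])
```

With an open file f, looping with "for path, groups, datasets in h5walk(f['group1'])" should give one tuple per group, starting with '/group1' itself.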


visit() and visititems() are your friends here. Cf. http://docs.h5py.org/en/latest/high/group.html#Group.visit. Note that an h5py.File is also an h5py.Group. Example (not tested):

import h5py

def visitor_func(name, node):
    if isinstance(node, h5py.Dataset):
        # node is a dataset
        print('dataset:', name)
    else:
        # node is a group
        print('group:', name)

with h5py.File('myfile.h5', 'r') as f:
    f.visititems(visitor_func)
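
If you would rather collect the results than act inside the callback, a closure works well with visititems(). A rough sketch, where classify_nodes is just an illustrative name of mine and the description strings are an arbitrary format:

```python
import h5py

def classify_nodes(group):
    """Return {relative_path: description} for every group and dataset below `group`."""
    kinds = {}
    def visitor(name, node):
        if isinstance(node, h5py.Dataset):
            kinds[name] = f"dataset {node.shape} {node.dtype}"
        elif isinstance(node, h5py.Group):
            kinds[name] = "group"
        # implicit None return, so the traversal continues through the whole file
    group.visititems(visitor)
    return kinds
```

Since an h5py.File is also a Group, you can pass the open file itself, or any subgroup such as f['group1'] to restrict the listing.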
