how to read tar.gz file in python code example

Example 1: tar.gz files

tar -czvf name-of-archive.tar.gz /path/to/directory-or-file
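
For comparison, a similar archive can be created from Python with the standard-library tarfile module. This is only a minimal sketch: the function name make_tar_gz is hypothetical, and unlike the shell command above it stores the entry under its base name rather than its full path.

```python
import os
import tarfile

def make_tar_gz(source, archive_path):
    """Pack a file or directory into a gzip-compressed tar archive."""
    # "w:gz" opens the archive for writing with gzip compression
    with tarfile.open(archive_path, "w:gz") as tar:
        # normpath drops a trailing slash so basename is never empty
        name = os.path.basename(os.path.normpath(source))
        # Directories are added recursively by default
        tar.add(source, arcname=name)
    return archive_path

# Hypothetical usage, mirroring the shell command:
# make_tar_gz("/path/to/directory-or-file", "name-of-archive.tar.gz")
```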

Example 2: python extract tar file

import tarfile

# Simple function to extract an archive
# tar_file: path to the tar archive (.tar, .tar.gz, .tar.xz, ...)
# path: directory where the contents will be extracted
def extract(tar_file, path):
    # Check first: tarfile.open() raises tarfile.ReadError on a non-tar file
    if tarfile.is_tarfile(tar_file):
        # The default mode "r" detects the compression (gz, bz2, xz) automatically
        with tarfile.open(tar_file) as opened_tar:
            opened_tar.extractall(path)
    else:
        print("The file you entered is not a tar file")

extract('/kaggle/input/gnr-638/train.tar.xz', '/kaggle/working/gnr-638')
extract('/kaggle/input/gnr-638/test_set.tar.xz', '/kaggle/working/gnr-638')
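
extractall() trusts the member paths stored in the archive, so for archives from untrusted sources it may be worth using an extraction filter where the running Python supports one. This is a hedged sketch: the name extract_safely is hypothetical, and it assumes the filter argument is only present when tarfile.data_filter exists (Python 3.12 and some earlier security releases).

```python
import tarfile

def extract_safely(tar_file, path):
    """Extract an archive, using the "data" filter when this Python has it."""
    with tarfile.open(tar_file) as tar:
        if hasattr(tarfile, "data_filter"):
            # Rejects absolute paths, links escaping the destination, etc.
            tar.extractall(path, filter="data")
        else:
            # Older Python: no filter support, behaves like plain extractall()
            tar.extractall(path)
```

On interpreters without filter support this falls back to the unfiltered behaviour, so it should only be treated as a partial safeguard.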

Example 3: read tar.gz file python

The docs tell us that extractfile() returns None if the member is not a regular file or a link.

One possible solution is to skip over the None results:

import tarfile

with tarfile.open("filename.tar.gz", "r:gz") as tar:
    for member in tar.getmembers():
        f = tar.extractfile(member)
        # Directories and other special members yield None
        if f is not None:
            content = f.read()  # file contents as bytes
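
The loop above keeps only the content of the last member read. One way to keep everything is to collect the contents in a dictionary keyed by member name. This is a minimal sketch, and the function name read_files is hypothetical:

```python
import tarfile

def read_files(archive_path):
    """Return {member name: bytes} for each member extractfile() can open."""
    contents = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            f = tar.extractfile(member)
            if f is None:
                # Not a regular file or link (e.g. a directory): skip it
                continue
            with f:
                contents[member.name] = f.read()
    return contents
```

The values are bytes. For text files they can be decoded afterwards, for example with .decode("utf-8") if the files are known to be UTF-8.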