Google Cloud Storage + Python : Any way to list obj in certain folder in GCS?
This worked for me:
client = storage.Client()
BUCKET_NAME = 'DEMO_BUCKET'
bucket = client.get_bucket(BUCKET_NAME)
blobs = bucket.list_blobs()
for blob in blobs:
print(blob.name)
The list_blobs() method will return an iterator used to find blobs in the bucket. Now you can iterate over blobs and access every object in the bucket. In this example I just print out the name of the object.
This documentation helped me alot:
https://googleapis.github.io/google-cloud-python/latest/storage/blobs.html
https://googleapis.github.io/google-cloud-python/latest/_modules/google/cloud/storage/client.html#Client.bucket
I hope I could help!
Update: the below is true for the older "Google API Client Libraries" for Python, but if you're not using that client, prefer the newer "Google Cloud Client Library" for Python ( https://googleapis.dev/python/storage/latest/index.html ). For the newer library, the equivalent to the below code is:
from google.cloud import storage
client = storage.Client()
for blob in client.list_blobs('bucketname', prefix='abc/myfolder'):
print(str(blob))
Answer for older client follows.
You may find it easier to work with the JSON API, which has a full-featured Python client. It has a function for listing objects that takes a prefix parameter, which you could use to check for a certain directory and its children in this manner:
from apiclient import discovery
# Auth goes here if necessary. Create authorized http object...
client = discovery.build('storage', 'v1') # add http=whatever param if auth
request = client.objects().list(
bucket="mybucket",
prefix="abc/myfolder")
while request is not None:
response = request.execute()
print json.dumps(response, indent=2)
request = request.list_next(request, response)
Fuller documentation of the list call is here: https://developers.google.com/storage/docs/json_api/v1/objects/list
And the Google Python API client is documented here: https://code.google.com/p/google-api-python-client/