Stream file from Google Cloud Storage
Just to clarify, do you need an OutputStream
or an InputStream
? One way to look at this is that the data stored in Google Cloud Storage object as a file and you having an InputStream to read that file. If that works, read on.
There is no existing method in Storage API which provides an InputStream
or an OutputStream
. But the there are 2 APIs in the Cloud Storage client library which expose a ReadChannel
object which is extended from ReadableByteChannel
(from java NIO API).
ReadChannel reader(String bucket, String blob, BlobSourceOption... options);
ReadChannel reader(BlobId blob, BlobSourceOption... options);
A simple example using this (taken from StorageSnippets.java):
/**
* Example of reading a blob's content through a reader.
*/
// [TARGET reader(String, String, BlobSourceOption...)]
// [VARIABLE "my_unique_bucket"]
// [VARIABLE "my_blob_name"]
public void readerFromStrings(String bucketName, String blobName) throws IOException {
// [START readerFromStrings]
try (ReadChannel reader = storage.reader(bucketName, blobName)) {
ByteBuffer bytes = ByteBuffer.allocate(64 * 1024);
while (reader.read(bytes) > 0) {
bytes.flip();
// do something with bytes
bytes.clear();
}
}
// [END readerFromStrings]
}
You can also use the newInputStream()
method to wrap an InputStream
over the ReadableByteChannel
.
public static InputStream newInputStream(ReadableByteChannel ch)
Even if you need an OutputStream
, you should be able to copy data from the InputStream
or better from the ReadChannel
object into the OutputStream
.
Complete example
Run this example as: PROGRAM_NAME <BUCKET_NAME> <BLOB_PATH>
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import com.google.cloud.ReadChannel;
import com.google.cloud.storage.Bucket;
import com.google.cloud.storage.BucketInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
/**
* An example which reads the contents of the specified object/blob from GCS
* and prints the contents to STDOUT.
*
* Run it as PROGRAM_NAME <BUCKET_NAME> <BLOB_PATH>
*/
public class ReadObjectSample {
private static final int BUFFER_SIZE = 64 * 1024;
public static void main(String[] args) throws IOException {
// Instantiates a Storage client
Storage storage = StorageOptions.getDefaultInstance().getService();
// The name for the GCS bucket
String bucketName = args[0];
// The path of the blob (i.e. GCS object) within the GCS bucket.
String blobPath = args[1];
printBlob(storage, bucketName, blobPath);
}
// Reads from the specified blob present in the GCS bucket and prints the contents to STDOUT.
private static void printBlob(Storage storage, String bucketName, String blobPath) throws IOException {
try (ReadChannel reader = storage.reader(bucketName, blobPath)) {
WritableByteChannel outChannel = Channels.newChannel(System.out);
ByteBuffer bytes = ByteBuffer.allocate(BUFFER_SIZE);
while (reader.read(bytes) > 0) {
bytes.flip();
outChannel.write(bytes);
bytes.clear();
}
}
}
}
Currently the cleanest option I could find looks like this:
Blob blob = bucket.get("some-file");
ReadChannel reader = blob.reader();
InputStream inputStream = Channels.newInputStream(reader);
The Channels is from java.nio. Furthermore you can then use commons io to easily read to InputStream into an OutputStream:
IOUtils.copy(inputStream, outputStream);