Direct download from Google Drive using Google Drive API
Update December 8th, 2015 According to Google Support using the
googledrive.com/host/ID
method will be turned off on Aug 31st, 2016.
I just ran into this issue.
The trick is to treat your Google Drive folder like a web host.
Update April 1st, 2015
Google Drive has changed and there's a simple way to direct link to your drive. I left my previous answers below for reference but to here's an updated answer.
- Create a Public folder in Google Drive.
- Share this drive publicly.
- Get your Folder UUID from the address bar when you're in that folder
- Put that UUID in this URL
https://googledrive.com/host/<folder UUID>/
- Add the file name to where your file is located.
https://googledrive.com/host/<folder UUID>/<file name>
Which is intended functionality by Google
new Google Drive Link.
All you have to do is simple get the host URL for a publicly shared drive folder. To do this, you can upload a plain HTML file and preview it in Google Drive to find your host URL.
Here are the steps:
- Create a folder in Google Drive.
- Share this drive publicly.
- Upload a simple HTML file. Add any additional files (subfolders ok)
- Open and "preview" the HTML file in Google Drive
- Get the URL address for this folder
- Create a direct link URL from your URL folder base
- This URL should allow direct downloads of your large files.
[edit]
I forgot to add. If you use subfolders to organize your files, you simple use the folder name as you would expect in a URL hierarchy.
https://googledrive.com/host/<your public folders id string>/images/my-image.png
What I was looking to do
I created a custom Debian image with Virtual Box for Vagrant. I wanted to share this ".box" file with colleagues so they could put the direct link into their Vagrantfile.
In the end, I needed a direct link to the actual file.
Google Drive problem
If you set the file permissions to be publicly available and create/generate a direct access link by using something like the gdocs2direct tool or just crafting the link yourself:
https://docs.google.com/uc?export=download&id=<your file id>
You will get a cookie based verification code and prompt "Google could not scan this file" prompt, which won't work for things such as wget or Vagrantfile configs.
The code that it generates is a simple code that appends GET query variable ...&confirm=###
to the string, but it's per user specific, so it's not like you can copy/paste that query variable for others.
But if you use the above "Web page hosting" method, you can get around that prompt.
I hope that helps!
If you face the "This file cannot be checked for viruses" intermezzo page, the download is not that easy.
You essentially need to first download the normal download link, which however redirects you to the "Download anyway" page. You need to store cookies from this first request, find out the link pointed to by the "Download anyway" button, and then use this link to download the file, but reusing the cookies you got from the first request.
Here's a bash variant of the download process using CURL:
curl -c /tmp/cookies "https://drive.google.com/uc?export=download&id=DOCUMENT_ID" > /tmp/intermezzo.html
curl -L -b /tmp/cookies "https://drive.google.com$(cat /tmp/intermezzo.html | grep -Po 'uc-download-link" [^>]* href="\K[^"]*' | sed 's/\&/\&/g')" > FINAL_DOWNLOADED_FILENAME
Notes:
- this procedure will probably stop working after some Google changes
- the grep command uses Perl syntax (
-P
) and the\K
"operator" which essentially means "do not include anything preceding\K
to the matched result. I don't know which version of grep introduced these options, but ancient or non-Ubuntu versions probably don't have it - a Java solution would be more or less the same, just take a HTTPS library which can handle cookies, and some nice text-parsing library
#Case 1: download file with small size.
- You can use url with format https://drive.google.com/uc?export=download&id=FILE_ID and then inputstream of file can be obtained directly.
#Case 2: download file with large size.
- You stuck a wall of a virus scan alert page returned. By parsing html dom element, I tried to get link with confirm code under button "Download anyway" but it didn't work. Its may required cookie or session info. enter image description here
SOLUTION:
Finally I found solution for two above cases. Just need to put
httpConnection.setDoOutput(true)
in connection step to get a Json.)]}' { "disposition":"SCAN_CLEAN", "downloadUrl":"http:www...", "fileName":"exam_list_json.txt", "scanResult":"OK", "sizeBytes":2392}
Then, you can use any Json parser to read downloadUrl, fileName and sizeBytes.
You can refer follow snippet, hope it help.
private InputStream gConnect(String remoteFile) throws IOException{ URL url = new URL(remoteFile); URLConnection connection = url.openConnection(); if(connection instanceof HttpURLConnection){ HttpURLConnection httpConnection = (HttpURLConnection) connection; connection.setAllowUserInteraction(false); httpConnection.setInstanceFollowRedirects(true); httpConnection.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows 2000)"); httpConnection.setDoOutput(true); httpConnection.setRequestMethod("GET"); httpConnection.connect(); int reqCode = httpConnection.getResponseCode(); if(reqCode == HttpURLConnection.HTTP_OK){ InputStream is = httpConnection.getInputStream(); Map<String, List<String>> map = httpConnection.getHeaderFields(); List<String> values = map.get("content-type"); if(values != null && !values.isEmpty()){ String type = values.get(0); if(type.contains("text/html")){ String cookie = httpConnection.getHeaderField("Set-Cookie"); String temp = Constants.getPath(mContext, Constants.PATH_TEMP) + "/temp.html"; if(saveGHtmlFile(is, temp)){ String href = getRealUrl(temp); if(href != null){ return parseUrl(href, cookie); } } } else if(type.contains("application/json")){ String temp = Constants.getPath(mContext, Constants.PATH_TEMP) + "/temp.txt"; if(saveGJsonFile(is, temp)){ FileDataSet data = JsonReaderHelper.readFileDataset(new File(temp)); if(data.getPath() != null){ return parseUrl(data.getPath()); } } } } return is; } } return null; }
And
public static FileDataSet readFileDataset(File file) throws IOException{
FileInputStream is = new FileInputStream(file);
JsonReader reader = new JsonReader(new InputStreamReader(is, "UTF-8"));
reader.beginObject();
FileDataSet rs = new FileDataSet();
while(reader.hasNext()){
String name = reader.nextName();
if(name.equals("downloadUrl")){
rs.setPath(reader.nextString());
} else if(name.equals("fileName")){
rs.setName(reader.nextString());
} else if(name.equals("sizeBytes")){
rs.setSize(reader.nextLong());
} else {
reader.skipValue();
}
}
reader.endObject();
return rs;
}
I know this is an old question but I could not find a solution to this problem after some research, so I am sharing what worked for me.
I have written this C# code for one of my projects. It can bypass the scan virus warning programmatically. The code can probably be converted to Java.
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.IO;
using System.Net;
using System.Text;
public class FileDownloader : IDisposable
{
private const string GOOGLE_DRIVE_DOMAIN = "drive.google.com";
private const string GOOGLE_DRIVE_DOMAIN2 = "https://drive.google.com";
// In the worst case, it is necessary to send 3 download requests to the Drive address
// 1. an NID cookie is returned instead of a download_warning cookie
// 2. download_warning cookie returned
// 3. the actual file is downloaded
private const int GOOGLE_DRIVE_MAX_DOWNLOAD_ATTEMPT = 3;
public delegate void DownloadProgressChangedEventHandler( object sender, DownloadProgress progress );
// Custom download progress reporting (needed for Google Drive)
public class DownloadProgress
{
public long BytesReceived, TotalBytesToReceive;
public object UserState;
public int ProgressPercentage
{
get
{
if( TotalBytesToReceive > 0L )
return (int) ( ( (double) BytesReceived / TotalBytesToReceive ) * 100 );
return 0;
}
}
}
// Web client that preserves cookies (needed for Google Drive)
private class CookieAwareWebClient : WebClient
{
private class CookieContainer
{
private readonly Dictionary<string, string> cookies = new Dictionary<string, string>();
public string this[Uri address]
{
get
{
string cookie;
if( cookies.TryGetValue( address.Host, out cookie ) )
return cookie;
return null;
}
set
{
cookies[address.Host] = value;
}
}
}
private readonly CookieContainer cookies = new CookieContainer();
public DownloadProgress ContentRangeTarget;
protected override WebRequest GetWebRequest( Uri address )
{
WebRequest request = base.GetWebRequest( address );
if( request is HttpWebRequest )
{
string cookie = cookies[address];
if( cookie != null )
( (HttpWebRequest) request ).Headers.Set( "cookie", cookie );
if( ContentRangeTarget != null )
( (HttpWebRequest) request ).AddRange( 0 );
}
return request;
}
protected override WebResponse GetWebResponse( WebRequest request, IAsyncResult result )
{
return ProcessResponse( base.GetWebResponse( request, result ) );
}
protected override WebResponse GetWebResponse( WebRequest request )
{
return ProcessResponse( base.GetWebResponse( request ) );
}
private WebResponse ProcessResponse( WebResponse response )
{
string[] cookies = response.Headers.GetValues( "Set-Cookie" );
if( cookies != null && cookies.Length > 0 )
{
int length = 0;
for( int i = 0; i < cookies.Length; i++ )
length += cookies[i].Length;
StringBuilder cookie = new StringBuilder( length );
for( int i = 0; i < cookies.Length; i++ )
cookie.Append( cookies[i] );
this.cookies[response.ResponseUri] = cookie.ToString();
}
if( ContentRangeTarget != null )
{
string[] rangeLengthHeader = response.Headers.GetValues( "Content-Range" );
if( rangeLengthHeader != null && rangeLengthHeader.Length > 0 )
{
int splitIndex = rangeLengthHeader[0].LastIndexOf( '/' );
if( splitIndex >= 0 && splitIndex < rangeLengthHeader[0].Length - 1 )
{
long length;
if( long.TryParse( rangeLengthHeader[0].Substring( splitIndex + 1 ), out length ) )
ContentRangeTarget.TotalBytesToReceive = length;
}
}
}
return response;
}
}
private readonly CookieAwareWebClient webClient;
private readonly DownloadProgress downloadProgress;
private Uri downloadAddress;
private string downloadPath;
private bool asyncDownload;
private object userToken;
private bool downloadingDriveFile;
private int driveDownloadAttempt;
public event DownloadProgressChangedEventHandler DownloadProgressChanged;
public event AsyncCompletedEventHandler DownloadFileCompleted;
public FileDownloader()
{
webClient = new CookieAwareWebClient();
webClient.DownloadProgressChanged += DownloadProgressChangedCallback;
webClient.DownloadFileCompleted += DownloadFileCompletedCallback;
downloadProgress = new DownloadProgress();
}
public void DownloadFile( string address, string fileName )
{
DownloadFile( address, fileName, false, null );
}
public void DownloadFileAsync( string address, string fileName, object userToken = null )
{
DownloadFile( address, fileName, true, userToken );
}
private void DownloadFile( string address, string fileName, bool asyncDownload, object userToken )
{
downloadingDriveFile = address.StartsWith( GOOGLE_DRIVE_DOMAIN ) || address.StartsWith( GOOGLE_DRIVE_DOMAIN2 );
if( downloadingDriveFile )
{
address = GetGoogleDriveDownloadAddress( address );
driveDownloadAttempt = 1;
webClient.ContentRangeTarget = downloadProgress;
}
else
webClient.ContentRangeTarget = null;
downloadAddress = new Uri( address );
downloadPath = fileName;
downloadProgress.TotalBytesToReceive = -1L;
downloadProgress.UserState = userToken;
this.asyncDownload = asyncDownload;
this.userToken = userToken;
DownloadFileInternal();
}
private void DownloadFileInternal()
{
if( !asyncDownload )
{
webClient.DownloadFile( downloadAddress, downloadPath );
// This callback isn't triggered for synchronous downloads, manually trigger it
DownloadFileCompletedCallback( webClient, new AsyncCompletedEventArgs( null, false, null ) );
}
else if( userToken == null )
webClient.DownloadFileAsync( downloadAddress, downloadPath );
else
webClient.DownloadFileAsync( downloadAddress, downloadPath, userToken );
}
private void DownloadProgressChangedCallback( object sender, DownloadProgressChangedEventArgs e )
{
if( DownloadProgressChanged != null )
{
downloadProgress.BytesReceived = e.BytesReceived;
if( e.TotalBytesToReceive > 0L )
downloadProgress.TotalBytesToReceive = e.TotalBytesToReceive;
DownloadProgressChanged( this, downloadProgress );
}
}
private void DownloadFileCompletedCallback( object sender, AsyncCompletedEventArgs e )
{
if( !downloadingDriveFile )
{
if( DownloadFileCompleted != null )
DownloadFileCompleted( this, e );
}
else
{
if( driveDownloadAttempt < GOOGLE_DRIVE_MAX_DOWNLOAD_ATTEMPT && !ProcessDriveDownload() )
{
// Try downloading the Drive file again
driveDownloadAttempt++;
DownloadFileInternal();
}
else if( DownloadFileCompleted != null )
DownloadFileCompleted( this, e );
}
}
// Downloading large files from Google Drive prompts a warning screen and requires manual confirmation
// Consider that case and try to confirm the download automatically if warning prompt occurs
// Returns true, if no more download requests are necessary
private bool ProcessDriveDownload()
{
FileInfo downloadedFile = new FileInfo( downloadPath );
if( downloadedFile == null )
return true;
// Confirmation page is around 50KB, shouldn't be larger than 60KB
if( downloadedFile.Length > 60000L )
return true;
// Downloaded file might be the confirmation page, check it
string content;
using( var reader = downloadedFile.OpenText() )
{
// Confirmation page starts with <!DOCTYPE html>, which can be preceeded by a newline
char[] header = new char[20];
int readCount = reader.ReadBlock( header, 0, 20 );
if( readCount < 20 || !( new string( header ).Contains( "<!DOCTYPE html>" ) ) )
return true;
content = reader.ReadToEnd();
}
int linkIndex = content.LastIndexOf( "href=\"/uc?" );
if( linkIndex < 0 )
return true;
linkIndex += 6;
int linkEnd = content.IndexOf( '"', linkIndex );
if( linkEnd < 0 )
return true;
downloadAddress = new Uri( "https://drive.google.com" + content.Substring( linkIndex, linkEnd - linkIndex ).Replace( "&", "&" ) );
return false;
}
// Handles the following formats (links can be preceeded by https://):
// - drive.google.com/open?id=FILEID
// - drive.google.com/file/d/FILEID/view?usp=sharing
// - drive.google.com/uc?id=FILEID&export=download
private string GetGoogleDriveDownloadAddress( string address )
{
int index = address.IndexOf( "id=" );
int closingIndex;
if( index > 0 )
{
index += 3;
closingIndex = address.IndexOf( '&', index );
if( closingIndex < 0 )
closingIndex = address.Length;
}
else
{
index = address.IndexOf( "file/d/" );
if( index < 0 ) // address is not in any of the supported forms
return string.Empty;
index += 7;
closingIndex = address.IndexOf( '/', index );
if( closingIndex < 0 )
{
closingIndex = address.IndexOf( '?', index );
if( closingIndex < 0 )
closingIndex = address.Length;
}
}
return string.Concat( "https://drive.google.com/uc?id=", address.Substring( index, closingIndex - index ), "&export=download" );
}
public void Dispose()
{
webClient.Dispose();
}
}
And here's how you can use it:
// NOTE: FileDownloader is IDisposable!
FileDownloader fileDownloader = new FileDownloader();
// This callback is triggered for DownloadFileAsync only
fileDownloader.DownloadProgressChanged += ( sender, e ) => Console.WriteLine( "Progress changed " + e.BytesReceived + " " + e.TotalBytesToReceive );
// This callback is triggered for both DownloadFile and DownloadFileAsync
fileDownloader.DownloadFileCompleted += ( sender, e ) => Console.WriteLine( "Download completed" );
fileDownloader.DownloadFileAsync( "https://INSERT_DOWNLOAD_LINK_HERE", @"C:\downloadedFile.txt" );