wget/curl large file from Google Drive

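I'm trying to download a file from Google Drive in a script, and I'm having some trouble doing so. The files I'm trying to download are here.

I've searched online extensively, and I finally managed to get one of them to download. I got the UIDs of the files, and the smaller one (1.6MB) downloads fine, but the larger file (3.7GB) always redirects to a page that asks whether I want to proceed with the download without a virus scan. Could someone help me get past that screen?

Here's how I got the first file working: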

curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYeDU0VDRFWG9IVUE" > phlat-1.0.tar.gz

When I run the same command on the other file,

 curl -L "https://docs.google.com/uc?export=download&id=0Bz-w5tutuZIYY3h5YlMzTjhnbGM" > index4phlat.tar.gz

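I get the following output: [screenshot of the curl output]

I notice that on the third-to-last line, in the link, there is a &confirm=JwkK. This is a random 4-character string, but it suggests there is a way to add a confirmation to my URL. One of the links I visited suggested &confirm=no_antivirus, but that does not work.

I hope someone here can help with this!

Thanks in advance.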

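Have a look at this question: Direct download from Google Drive using Google Drive API

Basically you have to create a public directory and access your files by relative reference, with something like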

 wget https://googledrive.com/host/LARGEPUBLICFOLDERID/index4phlat.tar.gz

WARNING: This functionality is deprecated. See the warning in the comments below.

Alternatively, you can use this script: https://gitlab.com/Nanolx/patchimage/blob/master/tools/gdown.pl

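I wrote a Python snippet that downloads a file from Google Drive, given a shareable link. It works, as of August 2017.

The snippet does not use gdrive, nor the Google Drive API. It uses the requests module.

When downloading large files from Google Drive, a single GET request is not sufficient. A second one is needed, and this one has an extra URL parameter called confirm, whose value should equal the value of a certain cookie.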

    import requests

    def download_file_from_google_drive(id, destination):
        def get_confirm_token(response):
            # The confirm token is the value of the download_warning* cookie
            for key, value in response.cookies.items():
                if key.startswith('download_warning'):
                    return value

            return None

        def save_response_content(response, destination):
            CHUNK_SIZE = 32768

            with open(destination, "wb") as f:
                for chunk in response.iter_content(CHUNK_SIZE):
                    if chunk: # filter out keep-alive new chunks
                        f.write(chunk)

        URL = "https://docs.google.com/uc?export=download"

        session = requests.Session()

        response = session.get(URL, params = { 'id' : id }, stream = True)
        token = get_confirm_token(response)

        if token:
            params = { 'id' : id, 'confirm' : token }
            response = session.get(URL, params = params, stream = True)

        save_response_content(response, destination)


    if __name__ == "__main__":
        import sys
        if len(sys.argv) != 3:
            print("Usage: python google_drive.py drive_file_id destination_file_path")
        else:
            # TAKE ID FROM SHAREABLE LINK
            file_id = sys.argv[1]
            # DESTINATION FILE ON YOUR DISK
            destination = sys.argv[2]
            download_file_from_google_drive(file_id, destination)

You can use the open-source Linux/Unix command-line tool gdrive.

To install it:

Download the binary. Choose the one that fits your architecture, for example gdrive-linux-x64.

Copy it to your path.

 sudo cp gdrive-linux-x64 /usr/local/bin/gdrive; sudo chmod a+x /usr/local/bin/gdrive;

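To use it:

Determine the Google Drive file ID. To do so, right-click the desired file on the Google Drive website and choose "Get link…". It will return something like https://drive.google.com/open?id=0B7_OwkDsUIgFWXA1B2FPQfV5S8H. Take the string after ?id= and copy it to your clipboard. That is the file's ID.
Download the file. Of course, use your own file's ID in the following command.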
```
 gdrive download 0B7_OwkDsUIgFWXA1B2FPQfV5S8H 
```

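On first use, the tool needs to obtain access permissions to the Google Drive API. To do that, it shows you a link that you have to visit in a browser, and you then get a verification code to copy and paste back into the tool. The download then starts automatically. There is no progress indicator, but you can watch the progress in a file manager or a second terminal.

Source: A comment by Tobi on another answer.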

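The default behavior of Google Drive is to scan files for viruses. If the file is too big, it prompts the user and notifies them that the file could not be scanned.

At the moment, the only workaround I have found is to share the file with the web and create a web resource.

Quote from the Google Drive help page:

With Drive, you can make web resources (like HTML, CSS, and JavaScript files) viewable as a website.

To host a webpage with Drive: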

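1. Open Drive at drive.google.com and select a file.
2. Click the Share button at the top of the page.
3. Click Advanced in the bottom right corner of the sharing box.
4. Click Change....
5. Choose On - Public on the web and click Save.
6. Before closing the sharing box, copy the document ID from the URL in the field below "Link to share". The document ID is a string of uppercase and lowercase letters and numbers between slashes in the URL.
7. Share the URL that looks like "www.googledrive.com/host/[doc id]", where [doc id] is replaced by the document ID you copied in step 6.

Anyone can now view your webpage.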

Found here: https://support.google.com/drive/answer/2881970?hl=zh_CN

So, for example, when you share a file publicly on Google Drive, the share link looks like this:

 https://drive.google.com/file/d/0B5IRsLTwEO6CVXFURmpQZ1Jxc0U/view?usp=sharing

Then you copy the file ID and create a googledrive.com link that looks like this:

 https://www.googledrive.com/host/0B5IRsLTwEO6CVXFURmpQZ1Jxc0U

    ggID='put_googleID_here'
    ggURL='https://drive.google.com/uc?export=download'
    filename="$(curl -sc /tmp/gcokie "${ggURL}&id=${ggID}" | grep -o '="uc-name.*</span>' | sed 's/.*">//;s/<.a> .*//')"
    getcode="$(awk '/_warning_/ {print $NF}' /tmp/gcokie)"
    curl -Lb /tmp/gcokie "${ggURL}&confirm=${getcode}&id=${ggID}" -o "${filename}"

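How does it work?
Get the cookie file and the HTML code with curl.
Pipe the HTML to grep and sed and search for the file name.
Get the confirm code from the cookie file with awk.
Finally, download the file with cookies enabled, the confirm code, and the filename.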

 curl -Lb /tmp/gcokie "https://drive.google.com/uc?export=download&confirm=Uq6r&id=0B5IRsLTwEO6CVXFURmpQZ1Jxc0U" -o "SomeBigFile.zip"
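
The awk step above can be sketched in Python. This is only an illustration, assuming the cookie jar is the tab-separated text file that curl writes with -c; the function name and the sample cookie line are made up for the example. Unlike awk, which prints the last field of every matching line, this sketch returns only the first match.

```python
def confirm_code_from_cookie_jar(text):
    """Mimic awk '/_warning_/ {print $NF}' for the first matching line."""
    for line in text.splitlines():
        if "_warning_" in line:
            fields = line.split()  # awk also splits on runs of whitespace
            if fields:
                return fields[-1]  # $NF is the last field
    return None


# Made-up cookie jar content, for illustration only.
sample = (
    "# Netscape HTTP Cookie File\n"
    ".drive.google.com\tTRUE\t/uc\tFALSE\t0\t"
    "download_warning_123_FILEID\tUq6r\n"
)
print(confirm_code_from_cookie_jar(sample))
```

The returned value is what goes into the confirm= parameter of the second request.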

If you don't need the filename variable, curl can guess it:
-L Follow redirects
-O Remote name
-J Remote header name

    curl -sc /tmp/gcokie "${ggURL}&id=${ggID}" >/dev/null
    getcode="$(awk '/_warning_/ {print $NF}' /tmp/gcokie)"
    curl -LOJb /tmp/gcokie "${ggURL}&confirm=${getcode}&id=${ggID}"

To extract the Google file ID from a URL, you can use:

    echo "gURL" | egrep -o '(\w|-){26,}'
    # match 26 or more word characters

OR

    echo "gURL" | sed 's/[^A-Za-z0-9_-]/\n/g' | sed -rn '/.{26}/p'
    # replace non-word characters with new line,
    # print only lines with 26 or more word characters
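
The same extraction can be sketched in Python with the re module. The function name find_file_ids is hypothetical, and the threshold of 26 characters is simply carried over from the commands above; it is a heuristic, not a documented property of Drive IDs.

```python
import re


def find_file_ids(url):
    """Return every run of 26 or more word characters or hyphens."""
    return re.findall(r"[\w-]{26,}", url)


print(find_file_ids(
    "https://drive.google.com/file/d/0B5IRsLTwEO6CVXFURmpQZ1Jxc0U/view?usp=sharing"))
```

Shorter URL parts such as "drive" or "sharing" fall below the length threshold, so only the ID-like segment is returned.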

Here's a quick way to do this.

Make sure the link is shared, and it will look something like this:

https://drive.google.com/open?id=FILEID&authuser=0

Then, copy that FILEID and use it like this

 wget --no-check-certificate 'https://docs.google.com/uc?export=download&id=FILEID' -O FILENAME

As of December 2016, none of the answers proposes what works for me (source):

 curl -L https://drive.google.com/uc?id={FileID}

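provided that the Google Drive file has been shared with anyone who has the link, and {FileID} is the string after ?id= in the shared URL.

Although I did not check with huge files, I believe this might be useful to know.

There is an open-source, multi-platform client written in Go: drive. It is quite nice and full-featured, and it is also in active development.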

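    $ drive help pull
    Name
            pull - pulls remote changes from Google Drive
    Description
            Downloads content from the remote drive or modifies
            local content to match that on your Google Drive

    Note: You can skip checksum verification by passing in flag `-ignore-checksum`

    * For usage flags: `drive pull -h`

I was unable to get Nanoix's perl script to work, or the other curl examples I had seen, so I started looking into the API myself in Python. This worked fine for small files, but large files choked once they exceeded the available RAM, so I found some other nice chunking code that uses the API's ability to download partially. Gist here: https://gist.github.com/csik/c4c90987224150e4a0b2

Note the bit about downloading the client_secret JSON file from the API interface to your local directory.

Source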

    $ cat gdrive_dl.py
    from pydrive.auth import GoogleAuth
    from pydrive.drive import GoogleDrive

    """API calls to download a very large google drive file.  The drive API only allows downloading to ram
       (unlike, say, the Requests library's streaming option) so the file has to be partially downloaded
       and chunked.  Authentication requires a google api key, and a local download of client_secrets.json
       Thanks to Radek for the key functions: http://stackoverflow.com/questions/27617258/memoryerror-how-to-download-large-file-via-google-drive-sdk-using-python
    """

    def partial(total_byte_len, part_size_limit):
        # Split the file into inclusive [first, last] byte ranges
        s = []
        for p in range(0, total_byte_len, part_size_limit):
            last = min(total_byte_len - 1, p + part_size_limit - 1)
            s.append([p, last])
        return s

    def GD_download_file(service, file_id):
        drive_file = service.files().get(fileId=file_id).execute()
        download_url = drive_file.get('downloadUrl')
        total_size = int(drive_file.get('fileSize'))
        s = partial(total_size, 100000000) # I'm downloading BIG files, so 100M chunk size is fine for me
        title = drive_file.get('title')
        originalFilename = drive_file.get('originalFilename')
        filename = './' + originalFilename
        if download_url:
            with open(filename, 'wb') as file:
                print("Bytes downloaded: ")
                for bytes in s:
                    headers = {"Range" : 'bytes=%s-%s' % (bytes[0], bytes[1])}
                    resp, content = service._http.request(download_url, headers=headers)
                    if resp.status == 206 :
                        file.write(content)
                        file.flush()
                    else:
                        print('An error occurred: %s' % resp)
                        return None
                    print(str(bytes[1]) + "...")
            return title, filename
        else:
            return None

    gauth = GoogleAuth()
    gauth.CommandLineAuth() #requires cut and paste from a browser
    FILE_ID = 'SOMEID' #FileID is the simple file hash, like 0B1NzlxZ5RpdKS0NOS0x0Ym9kR0U
    drive = GoogleDrive(gauth)
    service = gauth.service
    #file = drive.CreateFile({'id':FILE_ID})    # Use this to get file metadata
    GD_download_file(service, FILE_ID)
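
To see what the chunking does, here is a small standalone sketch of the partial helper from the script above, run on toy numbers (a 10-byte file in 4-byte parts, chosen only for illustration). It needs no Google credentials; it just prints the Range header values the download loop would send.

```python
def partial(total_byte_len, part_size_limit):
    """Split [0, total_byte_len) into inclusive [first, last] byte ranges."""
    ranges = []
    for start in range(0, total_byte_len, part_size_limit):
        last = min(total_byte_len - 1, start + part_size_limit - 1)
        ranges.append([start, last])
    return ranges


for first, last in partial(10, 4):
    # Each pair becomes one HTTP Range header value.
    print("bytes=%s-%s" % (first, last))
```

This prints bytes=0-3, bytes=4-7 and bytes=8-9: the ranges are inclusive at both ends, and the last one is clipped to the final byte of the file.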

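Here is a little bash script I wrote that does the job today. It works on large files and can also resume partially fetched files. It takes two arguments: the first is the file_id and the second is the name of the output file. The main improvements over previous answers here are that it works on large files and needs only commonly available tools: bash, curl, tr, grep, du, cut and mv.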

    #!/usr/bin/env bash
    fileid="$1"
    destination="$2"

    # try to download the file
    curl -c /tmp/cookie -L -o /tmp/probe.bin "https://drive.google.com/uc?export=download&id=${fileid}"
    probeSize=`du -b /tmp/probe.bin | cut -f1`

    # did we get a virus message?
    # this will be the first line we get when trying to retrieve a large file
    bigFileSig='<!DOCTYPE html><html><head><title>Google Drive - Virus scan warning</title><meta http-equiv="content-type" content="text/html; charset=utf-8"/>'
    sigSize=${#bigFileSig}

    if (( probeSize <= sigSize )); then
      virusMessage=false
    else
      firstBytes=$(head -c $sigSize /tmp/probe.bin)
      if [ "$firstBytes" = "$bigFileSig" ]; then
        virusMessage=true
      else
        virusMessage=false
      fi
    fi

    if [ "$virusMessage" = true ] ; then
      confirm=$(tr ';' '\n' </tmp/probe.bin | grep confirm)
      confirm=${confirm:8:4}
      curl -C - -b /tmp/cookie -L -o "$destination" "https://drive.google.com/uc?export=download&id=${fileid}&confirm=${confirm}"
    else
      mv /tmp/probe.bin "$destination"
    fi

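Below is the workaround I used to download files from Google Drive to my Google Cloud Linux shell.

Share the file publicly, with "Edit" permissions, using advanced sharing.
You will get a sharing link that contains an ID. See the link: drive.google.com/file/d/[ID]/view?usp=sharing
Copy that ID and paste it into the following link:

googledrive.com/host/[ID]

The above link will be our download link.
Use wget to download the file:

wget https://googledrive.com/host/[ID]

This command downloads the file under the name [ID], with no extension but with the same file size, into the location where you ran the wget command.
In my case I downloaded a zipped folder, so I renamed that awkward file using:

mv [ID] 1.zip

Then using

unzip 1.zip

we will get the files.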

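The easy way:

(if you just need it for a one-off download)

Go to the Google Drive webpage that has the download link
Open your browser console and go to the "Network" tab
Click the download link
Wait for the file to start downloading, and find the corresponding request (it should be the last one in the list), then you can cancel the download
Right-click the request and click "Copy as cURL" (or similar)

You should end up with something like: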

 curl 'https://doc-0s-80-docs.googleusercontent.com/docs/securesc/aa51s66fhf9273i....................blah blah blah...............gEIqZ3KAQ==' --compressed

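Paste it into your console, add > my-file-name.extension to the end (otherwise it will write the file into your console), then press Enter :)

I had the same problem with Google Drive.

Here's how I solved the problem using Links 2.

Open a browser on your PC and navigate to your file in Google Drive. Give your file a public link.
Copy the public link to your clipboard (e.g. right-click, Copy link address)
Open a terminal. If you're downloading to another PC/server/machine, you should SSH to it at this point
Install Links 2 (Debian/Ubuntu method; use your distro's or OS's equivalent)

sudo apt-get install links2
Paste the link into your terminal and open it with Links like so:

links2 "paste url here"
Navigate to the download link within Links using your arrow keys and press Enter
Choose a filename and it will download your file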

This works as of November 2017: https://gist.github.com/ppetraki/258ea8240041e19ab258a736781f06db

    #!/bin/bash

    SOURCE="$1"
    if [ "${SOURCE}" == "" ]; then
        echo "Must specify a source url"
        exit 1
    fi

    DEST="$2"
    if [ "${DEST}" == "" ]; then
        echo "Must specify a destination filename"
        exit 1
    fi

    FILEID=$(echo $SOURCE | rev | cut -d= -f1 | rev)
    COOKIES=$(mktemp)

    CODE=$(wget --save-cookies $COOKIES --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=${FILEID}" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/Code: \1\n/p')

    # cleanup the code, format is 'Code: XXXX'
    CODE=$(echo $CODE | rev | cut -d: -f1 | rev | xargs)

    wget --load-cookies $COOKIES "https://docs.google.com/uc?export=download&confirm=${CODE}&id=${FILEID}" -O $DEST

    rm -f $COOKIES
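
The two text-processing steps in the script above (taking the file ID after the last "=" and pulling the confirm code out of the warning page) can be sketched in Python as follows. The function names and the sample HTML fragment are invented for illustration. Because of the greedy `.*`, the sed expression keeps the last confirm= on a line; the sketch keeps the last one in the whole text, which should be equivalent when the page arrives as a single line.

```python
import re


def file_id_from_source(source):
    """Equivalent of `rev | cut -d= -f1 | rev`: text after the last '='."""
    return source.rsplit("=", 1)[-1]


def confirm_code_from_html(html):
    """Return the last confirm=... value in the page, or None if absent."""
    codes = re.findall(r"confirm=([0-9A-Za-z_]+)", html)
    return codes[-1] if codes else None


# Made-up fragment of a warning page, for illustration only.
page = '<a href="/uc?export=download&amp;confirm=JwkK&amp;id=FILEID">Download anyway</a>'
print(file_id_from_source("https://drive.google.com/open?id=FILEID"))
print(confirm_code_from_html(page))
```

If no confirm= appears in the page, as with small files that download directly, the second function returns None and no second request is needed.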

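The easiest way is to put whatever you want to download in a folder. Share that folder, then grab the folder ID from the URL bar.

Then go to https://googledrive.com/host/[ID] (replace ID with your folder ID). You should see a list of all the files in that folder; click the one you want to download. The download should then show up on your downloads page (Ctrl+J in Chrome). Copy the download link from there and use wget "download link".

Enjoy :)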
