Downloading a file from the web in Python 3

I am creating a program that downloads a .jar (Java) file from a web server by reading the URL specified in the .jad file of the same game/application. I am using Python 3.2.1.

I have managed to extract the URL of the JAR file from the JAD file (every JAD file contains the URL of its JAR file), but, as you may imagine, the extracted value is of type str.

Here is the relevant function:

    def downloadFile(URL=None):
        import httplib2
        h = httplib2.Http(".cache")
        resp, content = h.request(URL, "GET")
        return content

    downloadFile(URL_from_file)

However, I always get an error saying that the type in the function above has to be bytes, not string. I have tried using URL.encode('utf-8') as well as bytes(URL, encoding='utf-8'), but I always get the same or a similar error.

So basically my question is: how do I download a file from a server when the URL is stored as a string?

If you want to get the contents of a web page into a variable, just read the response of urllib.request.urlopen:

    import urllib.request
    ...
    url = 'http://example.com/'
    response = urllib.request.urlopen(url)
    data = response.read()       # a `bytes` object
    text = data.decode('utf-8')  # a `str`; this step can't be used if data is binary

The easiest way to download and save a file is to use the urllib.request.urlretrieve function:

    import urllib.request
    ...
    # Download the file from `url` and save it locally under `file_name`:
    urllib.request.urlretrieve(url, file_name)
    import urllib.request
    ...
    # Download the file from `url`, save it in a temporary directory and get the
    # path to it (e.g. '/tmp/tmpb48zma.txt') in the `file_name` variable:
    file_name, headers = urllib.request.urlretrieve(url)

But keep in mind that urlretrieve is considered legacy and might become deprecated (not sure why, though).

So the most correct way to do this would be to use the urllib.request.urlopen function, which returns a file-like object representing the HTTP response, and copy it to a real file using shutil.copyfileobj.

    import urllib.request
    import shutil
    ...
    # Download the file from `url` and save it locally under `file_name`:
    with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
        shutil.copyfileobj(response, out_file)

If this seems too complicated, you may want to go simpler and store the whole download in a bytes object, then write it to a file. But this works well only for small files.

    import urllib.request
    ...
    # Download the file from `url` and save it locally under `file_name`:
    with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
        data = response.read()  # a `bytes` object
        out_file.write(data)

It is possible to extract .gz (and maybe other formats) compressed data on the fly, but such an operation probably requires the HTTP server to support random access to the file.

    import urllib.request
    import gzip
    ...
    # Read the first 64 bytes of the file inside the .gz archive located at `url`
    url = 'http://example.com/something.gz'
    with urllib.request.urlopen(url) as response:
        with gzip.GzipFile(fileobj=response) as uncompressed:
            file_header = uncompressed.read(64)  # a `bytes` object
    # Or do anything shown above using `uncompressed` instead of `response`.
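Coming back to the question's httplib2 snippet: the URL itself can stay a plain str in Python 3, so encoding it is unnecessary. The "must be bytes, not string" error typically appears at the next step, when the returned content (which is bytes) is written to a file opened in text mode. A minimal sketch of the fix (save_binary is my own helper name, not from the question):

```python
def save_binary(content: bytes, file_name: str) -> None:
    # `content` is bytes, so the file must be opened in binary mode ('wb');
    # opening it with plain 'w' raises
    # "TypeError: write() argument must be str, not bytes"
    with open(file_name, 'wb') as out_file:
        out_file.write(content)
```

With that, the result of downloadFile(URL_from_file) can be saved as-is, without touching the URL at all.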

I use the requests package whenever I want anything related to HTTP requests, because its API is very easy to get started with:

First, install requests:

 $ pip install requests 

Then the code:

    from requests import get  # to make GET requests

    def download(url, file_name):
        # open in binary mode
        with open(file_name, "wb") as file:
            # get request
            response = get(url)
            # write to file
            file.write(response.content)

I hope I understood the question correctly, which is: how do I download a file from a server when the URL is stored as a string?

I download files and save them locally using the code below:

    import requests

    url = 'http://example.com/static/img/python-logo.png'  # must be an absolute URL
    fileName = r'D:\Python\dwnldPythonLogo.png'  # raw string avoids backslash escapes
    req = requests.get(url)
    with open(fileName, 'wb') as file:
        for chunk in req.iter_content(100000):
            file.write(chunk)
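One caveat about the snippet above: iter_content only saves memory if the request is made with stream=True; otherwise requests downloads the whole body up front and merely chunks it afterwards. A sketch of a streaming variant (the download_stream name is mine, not from the answer):

```python
import requests

def download_stream(url, file_name, chunk_size=8192):
    # stream=True defers downloading the body until iter_content is called,
    # so the whole file never has to fit in memory at once
    with requests.get(url, stream=True) as response:
        response.raise_for_status()  # fail loudly on HTTP errors
        with open(file_name, "wb") as out_file:
            for chunk in response.iter_content(chunk_size):
                out_file.write(chunk)
```

This is the pattern the requests documentation recommends for large downloads.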
    from urllib import request

    def get(url):
        with request.urlopen(url) as r:
            return r.read()

    def download(url, file=None):
        if not file:
            file = url.split('/')[-1]
        with open(file, 'wb') as f:
            f.write(get(url))