Is there an easy way to request a URL in Python without following redirects?

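Looking at the urllib2 source, it seems the easiest way would be to subclass HTTPRedirectHandler and then use build_opener to override the default HTTPRedirectHandler, but that seems like a lot of (relatively complicated) work for something that should be pretty simple.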

Here is the Requests way:

    import requests

    r = requests.get('http://github.com', allow_redirects=False)
    print(r.status_code, r.headers['Location'])

Dive Into Python has a good chapter on handling redirects with urllib2. Another solution is httplib.

    >>> import httplib
    >>> conn = httplib.HTTPConnection("www.bogosoft.com")
    >>> conn.request("GET", "")
    >>> r1 = conn.getresponse()
    >>> print r1.status, r1.reason
    301 Moved Permanently
    >>> print r1.getheader('Location')
    http://www.bogosoft.com/new/location
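For readers on Python 3, where httplib was renamed http.client, the same session might be wrapped up roughly as follows. This is only a sketch: the function name `first_response` and the 10-second timeout are illustrative choices, not part of the original answer.

```python
import http.client


def first_response(host, path="/", port=80):
    """Send one GET and return (status, reason, Location) for the first response.

    http.client never follows redirects by itself, so a 3xx reply comes
    back to the caller unchanged. The function name is hypothetical.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection can be closed cleanly
        return resp.status, resp.reason, resp.getheader("Location")
    finally:
        conn.close()
```

Calling `first_response("www.bogosoft.com")` would correspond to the interactive session above; the Location value is `None` when the server sends no such header.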

Here is a urllib2 handler that does not follow redirects:

    import urllib
    import urllib2

    class NoRedirectHandler(urllib2.HTTPRedirectHandler):
        def http_error_302(self, req, fp, code, msg, headers):
            # Wrap the raw response and return it instead of following Location.
            infourl = urllib.addinfourl(fp, headers, req.get_full_url())
            infourl.status = code
            infourl.code = code
            return infourl

        # Treat the other redirect codes the same way.
        http_error_300 = http_error_302
        http_error_301 = http_error_302
        http_error_303 = http_error_302
        http_error_307 = http_error_302

    opener = urllib2.build_opener(NoRedirectHandler())
    urllib2.install_opener(opener)
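On Python 3, urllib2 became urllib.request, and a comparable handler can be sketched by handing the original response straight back. The helper `open_without_redirect`, the timeout, and the extra 308 mapping are assumptions added for illustration; the opener is also kept local here rather than installed globally.

```python
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    def http_error_302(self, req, fp, code, msg, headers):
        # fp is the response object for the 3xx reply; returning it makes
        # the opener hand it to the caller instead of requesting Location.
        return fp

    # Same treatment for the other redirect codes (308 is an addition
    # that the handler above does not cover).
    http_error_300 = http_error_302
    http_error_301 = http_error_302
    http_error_303 = http_error_302
    http_error_307 = http_error_302
    http_error_308 = http_error_302


def open_without_redirect(url, timeout=10):
    """Open url once and return the first response, redirect or not."""
    opener = urllib.request.build_opener(NoRedirectHandler())
    return opener.open(url, timeout=timeout)
```

The returned object exposes `status` and `headers`, so `open_without_redirect(url).headers.get("Location")` gives the redirect target when there is one.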

I suppose this would help:

    from httplib2 import Http

    def get_html(uri, num_redirections=0):  # set to 0 so redirects are not followed
        conn = Http()
        return conn.request(uri, redirections=num_redirections)

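The redirections keyword in httplib2's request method is a red herring. If it receives a redirect status code, it does not return the first response; it raises a RedirectLimit exception instead. To get the initial response back, you need to set follow_redirects to False on the Http object: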

    import httplib2

    h = httplib2.Http()
    h.follow_redirects = False
    (response, body) = h.request("http://example.com")

I second olt's pointer to Dive Into Python. Here is an implementation using urllib2 redirect handlers. More work than it should be? Maybe, shrug.

    import sys
    import urllib2

    class RedirectHandler(urllib2.HTTPRedirectHandler):
        def http_error_301(self, req, fp, code, msg, headers):
            result = urllib2.HTTPRedirectHandler.http_error_301(
                self, req, fp, code, msg, headers)
            result.status = code
            raise Exception("Permanent Redirect: %s" % 301)

        def http_error_302(self, req, fp, code, msg, headers):
            result = urllib2.HTTPRedirectHandler.http_error_302(
                self, req, fp, code, msg, headers)
            result.status = code
            raise Exception("Temporary Redirect: %s" % 302)

    def main(script_name, url):
        opener = urllib2.build_opener(RedirectHandler)
        urllib2.install_opener(opener)
        print urllib2.urlopen(url).read()

    if __name__ == "__main__":
        main(*sys.argv)

The shortest way, however, is:

    class NoRedirect(urllib2.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, hdrs, newurl):
            pass

    noredir_opener = urllib2.build_opener(NoRedirect())
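One thing worth knowing about this approach: when redirect_request returns None, the opener falls through to its default error handling, so the 3xx reply surfaces as an HTTPError rather than as a normal return value. A Python 3 sketch of catching it might look like this; the `probe` helper and the timeout are hypothetical names chosen for illustration.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None tells the handler not to build a follow-up request.
        return None


def probe(url, timeout=10):
    """Return (status, location) for the first response to url."""
    opener = urllib.request.build_opener(NoRedirect())
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, None
    except urllib.error.HTTPError as err:
        # The unfollowed redirect arrives here; the error object carries
        # the status code and the response headers.
        return err.code, err.headers.get("Location")
```

Note that `probe` also reports ordinary 4xx and 5xx replies through the same `except` branch, with `None` as the location unless the server happened to send one.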