HttpClient 4: how to capture the last redirect URL

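I have rather simple HttpClient 4 code that calls HttpGet to get HTML output. The HTML comes back with script and image locations all set to local paths (e.g. <img src="foo.jpg"/>), so I need the calling URL to turn these into absolute ones (<img src="http://foo.com/foo.jpg"/>). Now the problem: during the call there may be one or two 302 redirects, so the original URL no longer reflects the location of the HTML.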
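To illustrate why the final URL matters here, the following is a minimal sketch that uses only java.net.URI from the JDK. The host names and paths are made up for the example, and RelativeLinkDemo and toAbsolute are hypothetical names. It shows the same relative src resolving to different absolute URLs depending on which page URL is taken as the base.

```java
import java.net.URI;

final class RelativeLinkDemo {
    // Resolves a link found in the HTML against the URL the HTML was served from.
    static String toAbsolute(String pageUrl, String link) {
        return URI.create(pageUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String originalUrl = "http://foo.com/start";            // hypothetical URL passed to HttpGet
        String finalUrl = "http://www.foo.com/site/index.html"; // hypothetical URL after a 302

        System.out.println(toAbsolute(originalUrl, "foo.jpg")); // http://foo.com/foo.jpg
        System.out.println(toAbsolute(finalUrl, "foo.jpg"));    // http://www.foo.com/site/foo.jpg
    }
}
```

Only the second result points at the place the image actually lives, which is why the post-redirect URL is needed as the base.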

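How do I get the latest URL of the returned content, given all the redirects that may (or may not) have happened?

I looked at HttpGet#getAllHeaders() and HttpResponse#getAllHeaders() and couldn't find anything.

Edit: HttpGet#getURI() returns the original calling address.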

That would be the current URL, which you can get by calling

  HttpGet#getURI(); 

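EDIT: You didn't mention how you are doing the redirect. That works for us because we handle the 302 ourselves.

It sounds like you are using DefaultRedirectHandler. We used to do that. Getting the current URL is a bit tricky: you need to use your own context. Here are the relevant code snippets: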

    HttpGet httpget = new HttpGet(url);
    HttpContext context = new BasicHttpContext();
    HttpResponse response = httpClient.execute(httpget, context);
    if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
        throw new IOException(response.getStatusLine().toString());
    HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(
            ExecutionContext.HTTP_REQUEST);
    HttpHost currentHost = (HttpHost) context.getAttribute(
            ExecutionContext.HTTP_TARGET_HOST);
    String currentUrl = (currentReq.getURI().isAbsolute())
            ? currentReq.getURI().toString()
            : (currentHost.toURI() + currentReq.getURI());

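The default redirect handling didn't work for us, so we changed it, but I forget what the problem was.

In HttpClient 4, if you are using LaxRedirectStrategy or any subclass of DefaultRedirectStrategy, this is the recommended way (see the DefaultRedirectStrategy source code):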

    HttpContext context = new BasicHttpContext();
    HttpResult<T> result = client.execute(request, handler, context);
    URI finalUrl = request.getURI();
    RedirectLocations locations = (RedirectLocations) context.getAttribute(
            DefaultRedirectStrategy.REDIRECT_LOCATIONS);
    if (locations != null) {
        finalUrl = locations.getAll().get(locations.getAll().size() - 1);
    }

Since HttpClient 4.3.x, the above code can be simplified to:

    HttpClientContext context = HttpClientContext.create();
    HttpResult<T> result = client.execute(request, handler, context);
    URI finalUrl = request.getURI();
    List<URI> locations = context.getRedirectLocations();
    if (locations != null) {
        finalUrl = locations.get(locations.size() - 1);
    }

Alternatively, to list every redirect along the way:

    HttpGet httpGet = new HttpGet("<put your URL here>");
    HttpClient httpClient = HttpClients.createDefault();
    HttpClientContext context = HttpClientContext.create();
    httpClient.execute(httpGet, context);
    List<URI> redirectURIs = context.getRedirectLocations();
    if (redirectURIs != null && !redirectURIs.isEmpty()) {
        for (URI redirectURI : redirectURIs) {
            System.out.println("Redirect URI: " + redirectURI);
        }
        URI finalURI = redirectURIs.get(redirectURIs.size() - 1);
    }
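The "take the last redirect location, otherwise keep the original URI" step can be pulled into a small helper. The sketch below is plain JDK code with made-up URLs; FinalUriPicker and finalUri are hypothetical names, and the list argument stands in for whatever context.getRedirectLocations() returned. Whether that list can ever be empty rather than null may depend on the HttpClient version, so the sketch guards against both.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

final class FinalUriPicker {
    // Returns the last redirect target, or the original URI when no redirect was recorded.
    static URI finalUri(URI original, List<URI> redirectLocations) {
        if (redirectLocations == null || redirectLocations.isEmpty()) {
            return original;
        }
        return redirectLocations.get(redirectLocations.size() - 1);
    }

    public static void main(String[] args) {
        URI original = URI.create("http://example.com/a");
        List<URI> hops = Arrays.asList(
                URI.create("http://example.com/b"),
                URI.create("http://example.com/c"));

        System.out.println(finalUri(original, hops)); // http://example.com/c
        System.out.println(finalUri(original, null)); // http://example.com/a
    }
}
```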

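Building on ZZ Coder's solution, IMHO an improved way is to use a ResponseInterceptor to simply track the last redirect location. That way you don't lose information such as whatever follows the hashtag (the # fragment). Without the response interceptor you lose the hashtag. For example: http://j.mp/OxbI23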

    private static HttpClient createHttpClient() throws NoSuchAlgorithmException, KeyManagementException {
        SSLContext sslContext = SSLContext.getInstance("SSL");
        // TrustAllTrustManager is not part of the JDK or HttpClient; it is a custom TrustManager (not shown)
        TrustManager[] trustAllCerts = new TrustManager[] { new TrustAllTrustManager() };
        sslContext.init(null, trustAllCerts, new java.security.SecureRandom());

        SSLSocketFactory sslSocketFactory = new SSLSocketFactory(sslContext);
        SchemeRegistry schemeRegistry = new SchemeRegistry();
        schemeRegistry.register(new Scheme("https", 443, sslSocketFactory));
        schemeRegistry.register(new Scheme("http", 80, new PlainSocketFactory()));

        HttpParams params = new BasicHttpParams();
        ClientConnectionManager cm = new org.apache.http.impl.conn.SingleClientConnManager(schemeRegistry);

        // some pages require a user agent
        AbstractHttpClient httpClient = new DefaultHttpClient(cm, params);
        HttpProtocolParams.setUserAgent(httpClient.getParams(),
                "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0.1");

        httpClient.setRedirectStrategy(new DefaultRedirectStrategy());

        // remember the raw Location header of every redirect response
        httpClient.addResponseInterceptor(new HttpResponseInterceptor() {
            @Override
            public void process(HttpResponse response, HttpContext context) throws HttpException, IOException {
                if (response.containsHeader("Location")) {
                    Header[] locations = response.getHeaders("Location");
                    if (locations.length > 0)
                        context.setAttribute(LAST_REDIRECT_URL, locations[0].getValue());
                }
            }
        });
        return httpClient;
    }

    private String getUrlAfterRedirects(HttpContext context) {
        String lastRedirectUrl = (String) context.getAttribute(LAST_REDIRECT_URL);
        if (lastRedirectUrl != null)
            return lastRedirectUrl;
        else {
            // no redirect was seen: fall back to the request stored in the context
            HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(ExecutionContext.HTTP_REQUEST);
            HttpHost currentHost = (HttpHost) context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
            String currentUrl = (currentReq.getURI().isAbsolute())
                    ? currentReq.getURI().toString()
                    : (currentHost.toURI() + currentReq.getURI());
            return currentUrl;
        }
    }

    public static final String LAST_REDIRECT_URL = "last_redirect_url";

Use it just like ZZ Coder's solution:

    HttpResponse response = httpClient.execute(httpGet, context);
    String url = getUrlAfterRedirects(context);

I think an easier way to find the last URL is to use DefaultRedirectHandler.

    package ru.test.test;

    import java.net.URI;

    import org.apache.http.HttpResponse;
    import org.apache.http.ProtocolException;
    import org.apache.http.impl.client.DefaultRedirectHandler;
    import org.apache.http.protocol.HttpContext;

    public class MyRedirectHandler extends DefaultRedirectHandler {

        public URI lastRedirectedUri;

        @Override
        public boolean isRedirectRequested(HttpResponse response, HttpContext context) {
            return super.isRedirectRequested(response, context);
        }

        @Override
        public URI getLocationURI(HttpResponse response, HttpContext context) throws ProtocolException {
            lastRedirectedUri = super.getLocationURI(response, context);
            return lastRedirectedUri;
        }
    }

Code that uses this handler:

    DefaultHttpClient httpclient = new DefaultHttpClient();
    MyRedirectHandler handler = new MyRedirectHandler();
    httpclient.setRedirectHandler(handler);

    HttpGet get = new HttpGet(url);
    HttpResponse response = httpclient.execute(get);
    HttpEntity entity = response.getEntity();

    // start from the requested URL and overwrite it if a redirect was followed
    String lastUrl = url;
    if (handler.lastRedirectedUri != null) {
        lastUrl = handler.lastRedirectedUri.toString();
    }

I found this in the HttpComponents Client documentation:

    CloseableHttpClient httpclient = HttpClients.createDefault();
    HttpClientContext context = HttpClientContext.create();
    HttpGet httpget = new HttpGet("http://localhost:8080/");
    CloseableHttpResponse response = httpclient.execute(httpget, context);
    try {
        HttpHost target = context.getTargetHost();
        List<URI> redirectLocations = context.getRedirectLocations();
        URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations);
        System.out.println("Final HTTP location: " + location.toASCIIString());
        // Expected to be an absolute URI
    } finally {
        response.close();
    }

As of version 2.3, Android still doesn't support following redirects (HTTP code 302). I just read the Location header and download again:

    if (statusCode != HttpStatus.SC_OK) {
        Header[] headers = response.getHeaders("Location");
        if (headers != null && headers.length != 0) {
            String newUrl = headers[headers.length - 1].getValue();
            // call again the same downloading method with new URL
            return downloadBitmap(newUrl);
        } else {
            return null;
        }
    }

There is no protection against circular redirects here, so be careful. More in the blog post "Follow 302 redirects with AndroidHttpClient".
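One way to add such protection, sketched here with plain JDK classes, is to cap the number of hops and remember every URL already visited. Everything in this example is an assumption made for illustration: RedirectGuard, follow, the limit of 5, and the locationOf function, which stands in for a real HTTP call that returns the Location target or null when the response is not a redirect.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class RedirectGuard {
    static final int MAX_HOPS = 5; // arbitrary limit chosen for this sketch

    // locationOf returns the redirect target for a URI, or null when there is no redirect.
    static URI follow(URI start, Function<URI, URI> locationOf) {
        Set<URI> seen = new HashSet<>();
        URI current = start;
        for (int hop = 0; hop <= MAX_HOPS; hop++) {
            if (!seen.add(current)) {
                throw new IllegalStateException("Circular redirect at " + current);
            }
            URI next = locationOf.apply(current);
            if (next == null) {
                return current; // no further redirect: this is the final URL
            }
            current = current.resolve(next); // the Location value may be relative
        }
        throw new IllegalStateException("More than " + MAX_HOPS + " redirects from " + start);
    }

    public static void main(String[] args) {
        // Stand-in for real HTTP calls: /a redirects to /b, /b redirects to /c, /c does not redirect.
        Map<URI, URI> fake = new HashMap<>();
        fake.put(URI.create("http://example.com/a"), URI.create("http://example.com/b"));
        fake.put(URI.create("http://example.com/b"), URI.create("/c"));

        System.out.println(follow(URI.create("http://example.com/a"), fake::get)); // http://example.com/c
    }
}
```

In the recursive download method above, the same idea could be applied by passing a hop counter or a visited set down with each call.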

This is how I managed to get the redirect URL:

    Header[] arr = httpResponse.getHeaders("Location");
    for (Header head : arr) {
        String whatever = head.getValue();
    }

Or, if you are sure there is only one redirect location, do this:

    httpResponse.getFirstHeader("Location").getValue();
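Two things are worth guarding when reading the header by hand: in HttpClient 4, getFirstHeader returns null when the response has no Location header, and servers sometimes send a relative Location value. The sketch below handles both with java.net.URI only. LocationResolver and redirectTarget are hypothetical names, and the locationValue parameter stands in for the header value, or null if the header is absent.

```java
import java.net.URI;

final class LocationResolver {
    // locationValue is the raw Location header value, or null when the header is absent.
    static URI redirectTarget(URI requestUri, String locationValue) {
        if (locationValue == null) {
            return requestUri; // not a redirect: the request URI is already the final one
        }
        // resolve() leaves absolute values untouched and completes relative ones
        return requestUri.resolve(locationValue.trim());
    }

    public static void main(String[] args) {
        URI request = URI.create("http://example.com/dir/page");

        System.out.println(redirectTarget(request, "/login"));                 // http://example.com/login
        System.out.println(redirectTarget(request, "http://other.example/x")); // http://other.example/x
        System.out.println(redirectTarget(request, null));                     // http://example.com/dir/page
    }
}
```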