
How can I make HTTP::Proxy work with HTTPS URLs?


In the following code sample, I start a proxy server using HTTP::Proxy and attempt to use it to request an HTTPS URL, but the proxy server either doesn't actually make the request, or doesn't return the response. However, if I change the URL to plain HTTP (not secure), the request succeeds. I've installed both IO::Socket::SSL and LWP::Protocol::https (yay secret deps!), but am still unable to get HTTPS requests to go through the proxy. How can I get HTTP::Proxy to work with HTTPS URLs?

Here's my code:

#!/usr/bin/env perl

use strict;
use warnings;
use Data::Printer;
use HTTP::Proxy ':log';
use Mojo::UserAgent ();

my $URL = 'https://www.yahoo.com';
my $PROXY_PORT = 8667;

my $pid = fork();

if ($pid) { # I am the parent
    print "Press ^c to kill proxy server...\n";
    my $proxy = HTTP::Proxy->new( port => $PROXY_PORT );
    $proxy->logmask(ALL);
    $proxy->via(q{});
    $proxy->x_forwarded_for(0);
    $proxy->start;

    waitpid $pid, 0;
}
elsif (defined $pid && $pid == 0) { # I am the child (fork returns undef on failure)
    sleep 3; # Allow the proxy server to start

    my $ua = Mojo::UserAgent->new;
    $ua->proxy
        ->http("http://127.0.0.1:$PROXY_PORT")
        ->https("http://127.0.0.1:$PROXY_PORT");

    my $tx = $ua->get($URL);

    if ($tx->error) {
        p $tx->error;
    }
    else {
        print "Success!\n";
    }
}
else {
    die 'Unknown result after forking';
}

Saving the above script as testcase-so.pl and running it:

$ MOJO_CLIENT_DEBUG=1 ./testcase-so.pl
Press ^c to kill proxy server...
-- Blocking request (https://www.yahoo.com)
-- Connect c66a92739c09c76fa24029e8079808c7 (https://www.yahoo.com:443)
-- Client >>> Server (https://www.yahoo.com)
CONNECT www.yahoo.com:443 HTTP/1.1\x0d
User-Agent: Mojolicious (Perl)\x0d
Content-Length: 0\x0d
Host: www.yahoo.com\x0d
Accept-Encoding: gzip\x0d
\x0d

-- Client >>> Server (https://www.yahoo.com)

[Tue Oct  9 12:02:54 2018] (12348) PROCESS: Forked child process 12352
[Tue Oct  9 12:02:54 2018] (12352) SOCKET: New connection from 127.0.0.1:45312
[Tue Oct  9 12:02:54 2018] (12352) REQUEST: CONNECT www.yahoo.com:443
[Tue Oct  9 12:02:54 2018] (12352) REQUEST: Accept-Encoding: gzip
[Tue Oct  9 12:02:54 2018] (12352) REQUEST: Host: www.yahoo.com
[Tue Oct  9 12:02:54 2018] (12352) REQUEST: User-Agent: Mojolicious (Perl)
[Tue Oct  9 12:02:54 2018] (12352) REQUEST: Content-Length: 0
[Tue Oct  9 12:02:54 2018] (12352) RESPONSE: 200 OK
[Tue Oct  9 12:02:54 2018] (12352) RESPONSE: Date: Tue, 09 Oct 2018 12:02:54 GMT
[Tue Oct  9 12:02:54 2018] (12352) RESPONSE: Transfer-Encoding: chunked
[Tue Oct  9 12:02:54 2018] (12352) RESPONSE: Server: HTTP::Proxy/0.304
-- Client <<< Server (https://www.yahoo.com)
HTTP/1.1 200 OK\x0d
Date: Tue, 09 Oct 2018 12:02:54 GMT\x0d
Transfer-Encoding: chunked\x0d
Server: HTTP::Proxy/0.304\x0d
\x0d

[Tue Oct  9 12:03:14 2018] (12352) CONNECT: Connection closed by the client
[Tue Oct  9 12:03:14 2018] (12352) PROCESS: Served 1 requests
[Tue Oct  9 12:03:14 2018] (12352) CONNECT: End of CONNECT proxyfication
\ {
    message   "Proxy connection failed"
}
[Tue Oct  9 12:03:15 2018] (12348) PROCESS: Reaped child process 12349
[Tue Oct  9 12:03:15 2018] (12348) PROCESS: 1 remaining kids: 12352
[Tue Oct  9 12:03:15 2018] (12348) PROCESS: Reaped child process 12352
[Tue Oct  9 12:03:15 2018] (12348) PROCESS: 0 remaining kids:
^C[Tue Oct  9 12:04:04 2018] (12348) STATUS: Processed 2 connection(s)
$

And with $URL switched to plain http instead of https:

$ MOJO_CLIENT_DEBUG=1 ./testcase-so.pl
Press ^c to kill proxy server...
-- Blocking request (http://www.yahoo.com)
-- Connect f792ee97a0362ab493575d8116e69e59 (http://127.0.0.1:8667)
-- Client >>> Server (http://www.yahoo.com)
GET http://www.yahoo.com HTTP/1.1\x0d
Accept-Encoding: gzip\x0d
Content-Length: 0\x0d
Host: www.yahoo.com\x0d
User-Agent: Mojolicious (Perl)\x0d
\x0d

[Tue Oct  9 12:09:38 2018] (12656) PROCESS: Forked child process 12659
-- Client >>> Server (http://www.yahoo.com)

[Tue Oct  9 12:09:38 2018] (12659) SOCKET: New connection from 127.0.0.1:58288
[Tue Oct  9 12:09:38 2018] (12659) REQUEST: GET http://www.yahoo.com
[Tue Oct  9 12:09:38 2018] (12659) REQUEST: Accept-Encoding: gzip
[Tue Oct  9 12:09:38 2018] (12659) REQUEST: Host: www.yahoo.com
[Tue Oct  9 12:09:38 2018] (12659) REQUEST: User-Agent: Mojolicious (Perl)
[Tue Oct  9 12:09:38 2018] (12659) REQUEST: Content-Length: 0
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: 301 Moved Permanently
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Cache-Control: no-store, no-cache
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Date: Tue, 09 Oct 2018 14:10:01 GMT
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Transfer-Encoding: chunked
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Via: http/1.1 media-router-fp1006.prod.media.bf1.yahoo.com (ApacheTrafficServer [c s f ])
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Location: https://www.yahoo.com/
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Server: ATS
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Content-Language: en
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Content-Length: 8
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Content-Type: text/html
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: Content-Security-Policy: sandbox allow-forms allow-same-origin allow-scripts allow-popups allow-popups-to-escape-sandbox allow-presentation; report-uri https://csp.yahoo.com/beacon/csp?src=ats&site=frontpage&region=US&lang=en-US&device=desktop&yrid=&partner=;
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: X-Frame-Options: SAMEORIGIN
[Tue Oct  9 12:09:38 2018] (12659) RESPONSE: X-XSS-Protection: 1; report="https://csp.yahoo.com/beacon/csp?src=fp-hpkp-www"
-- Client <<< Server (http://www.yahoo.com)
HTTP/1.1 301 Moved Permanently\x0d
Cache-Control: no-store, no-cache\x0d
Date: Tue, 09 Oct 2018 14:10:01 GMT\x0d
Transfer-Encoding: chunked\x0d
Via: http/1.1 media-router-fp1006.prod.media.bf1.yahoo.com (ApacheTrafficServer [c s f ])\x0d
Location: https://www.yahoo.com/\x0d
Server: ATS\x0d
Content-Language: en\x0d
Content-Length: 8\x0d
Content-Type: text/html\x0d
Content-Security-Policy: sandbox allow-forms allow-same-origin allow-scripts allow-popups allow-popups-to-escape-sandbox allow-presentation; report-uri https://csp.yahoo.com/beacon/csp?src=ats&site=frontpage&region=US&lang=en-US&device=desktop&yrid=&partner=;\x0d
X-Frame-Options: SAMEORIGIN\x0d
X-XSS-Protection: 1; report="https://csp.yahoo.com/beacon/csp?src=fp-hpkp-www"\x0d
\x0d

-- Client <<< Server (http://www.yahoo.com)
8\x0d
redirect\x0d
0\x0d
\x0d

Success!
[Tue Oct  9 12:09:38 2018] (12659) SOCKET: Getting request failed: Client closed
[Tue Oct  9 12:09:39 2018] (12656) PROCESS: Reaped child process 12657
[Tue Oct  9 12:09:39 2018] (12656) PROCESS: 1 remaining kids: 12659
[Tue Oct  9 12:09:39 2018] (12656) PROCESS: Reaped child process 12659
[Tue Oct  9 12:09:39 2018] (12656) PROCESS: 0 remaining kids:
^C[Tue Oct  9 12:09:45 2018] (12656) STATUS: Processed 2 connection(s)
$

Solution

  • HTTP::Proxy has a bug: it sends an invalid response to a CONNECT request:

    -- Client <<< Server (https://www.yahoo.com)
    HTTP/1.1 200 OK\x0d
    Date: Tue, 09 Oct 2018 12:02:54 GMT\x0d
    Transfer-Encoding: chunked\x0d
    Server: HTTP::Proxy/0.304\x0d
    \x0d
    

    A successful (2xx) response to a CONNECT request cannot have a body, so it must not include an HTTP header that announces one, as Transfer-Encoding: chunked does. The bug affects every client that sends its CONNECT request as HTTP/1.1. If the CONNECT is sent as HTTP/1.0 instead, the problem vanishes: Transfer-Encoding: chunked is not defined in HTTP/1.0, so HTTP::Proxy does not send it.

    The same problem occurs when using curl with HTTP::Proxy, so it is not specific to Mojo::UserAgent. I've written a patch for HTTP::Proxy that makes it respond properly. See this pull request for the details and for the (small) diff you need to apply.
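
To make the rule above concrete, here is a minimal sketch that inspects the raw head of a response to a CONNECT request and reports any header that announces a body. The helper name connect_framing_headers is hypothetical; it is not part of HTTP::Proxy, Mojolicious, or the patch mentioned above. It only assumes that a 2xx response to CONNECT should carry neither Transfer-Encoding nor Content-Length.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from HTTP::Proxy or Mojolicious): given the raw
# head of a response to a CONNECT request, return the names of the header
# fields that announce a body. Only 2xx responses are checked, because only
# those turn the connection into a tunnel.
sub connect_framing_headers {
    my ($raw_head) = @_;

    # First line is the status line, the rest are header lines.
    # Accept CRLF as well as bare LF line endings.
    my ($status_line, @lines) = split /\r?\n/, $raw_head;
    return unless defined $status_line;

    my ($code) = $status_line =~ m{^HTTP/\d\.\d\s+(\d{3})\b};
    return unless defined $code && $code =~ /^2/;

    my %forbidden = map { $_ => 1 } qw(transfer-encoding content-length);

    my @offending;
    for my $line (@lines) {
        last if $line eq '';    # a blank line ends the head
        my ($name) = $line =~ /^([^:\s]+)\s*:/ or next;
        push @offending, $name if $forbidden{ lc($name) };
    }
    return @offending;
}

# The response head that HTTP::Proxy/0.304 sent in the log above.
my $head = join "\r\n",
    'HTTP/1.1 200 OK',
    'Date: Tue, 09 Oct 2018 12:02:54 GMT',
    'Transfer-Encoding: chunked',
    'Server: HTTP::Proxy/0.304',
    '', '';

my @bad = connect_framing_headers($head);
print @bad
    ? "Invalid CONNECT response, unexpected header(s): @bad\n"
    : "CONNECT response looks fine\n";
```

Run against the head from the failing log, this reports Transfer-Encoding as the unexpected header. A head without that line, which is what the answer says HTTP::Proxy sends for an HTTP/1.0 CONNECT, passes the check.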