Tags: perl, mojolicious, mojo-useragent

Mojo::UserAgent - Inspect the Content-Encoding header before decoding


I'm attempting to use Mojo::UserAgent to verify the gzip compression (Content-Encoding) of an application.

Unfortunately, it appears that this UA silently decodes the content and removes the Content-Encoding header afterwards.

The following is my minimal example:

#!/usr/bin/env perl

use strict;
use warnings;

use Test::More tests => 3;

use Mojo::UserAgent;     # Version 8.26

my $ua = Mojo::UserAgent->new();

# As documented: https://docs.mojolicious.org/Mojolicious/Guides/Cookbook#Decorating-follow-up-requests
$ua->once(
    start => sub {
        my ( $ua, $tx ) = @_;
        $tx->req->headers->header( 'Accept-Encoding' => 'gzip' );
    }
);

my $tx = $ua->get('https://www.mojolicious.org');

is( $tx->req->headers->header('Accept-Encoding'), 'gzip', qq{Request Accept-Encoding is "gzip"} );

ok( $tx->res->is_success, "Response is success" );

# The following assertion fails.
# My theory is that Mojo::UserAgent is silently decoding the content, and changing
# the Content-Encoding and Content-Length to reflect the new values.  However, how
# do we inspect what the original response headers were?
is( $tx->res->headers->header('Content-Encoding'), 'gzip', qq{Response Content-Encoding is "gzip"} );

Results

$ perl mojo_useragent_content_encoding.pl
1..3
ok 1 - Request Accept-Encoding is "gzip"
ok 2 - Response is success
not ok 3 - Response Content-Encoding is "gzip"
#   Failed test 'Response Content-Encoding is "gzip"'
#   at mojo_useragent_content_encoding.pl line 30.
#          got: undef
#     expected: 'gzip'
# Looks like you failed 1 test of 3.

I was able to confirm that the payload is being gzipped by analyzing the Apache logs. Additionally, this curl call confirms that the example website uses gzip encoding for its responses:

$ curl -i -H "Accept-Encoding: gzip" https://www.mojolicious.org
HTTP/1.1 200 OK
Date: Mon, 18 Jan 2021 21:28:14 GMT
Content-Type: text/html;charset=UTF-8
...
Content-Encoding: gzip
...

I am able to use LWP::UserAgent to confirm the proper Content-Encoding of the response.

However, I'm unable to determine how to inspect the Mojo::UserAgent response to view the real headers before any theoretical post processing was performed.


Solution

  • You can set $ua->transactor->compressed(0); in your code, or MOJO_GZIP=0 in your environment, to bypass automatic decompression.
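
    A minimal sketch of that first option, under some assumptions: the embedded Mojolicious::Lite route below is a hypothetical stand-in for the application under test (it is not part of the original question), and the payload is made up. It only illustrates that with compressed(0) the header and the compressed body should both reach the caller untouched.

    ```perl
    use Mojolicious::Lite;    # also enables strict and warnings
    use IO::Compress::Gzip qw(gzip $GzipError);
    use Mojo::UserAgent;

    # Hypothetical stand-in for the application under test: it always answers
    # with a gzip-encoded body, so the sketch does not depend on a remote site.
    my $plain = 'Hello Mojo! ' x 100;
    gzip( \$plain => \my $gz ) or die "gzip failed: $GzipError";

    get '/' => sub {
        my $c = shift;
        $c->res->headers->content_encoding('gzip');
        $c->render( data => $gz );
    };
    app->log->level('fatal');    # keep request logging out of the output

    my $ua = Mojo::UserAgent->new;
    $ua->server->app(app);             # relative URLs go to the app above
    $ua->transactor->compressed(0);    # no automatic decompression

    # With compression handling switched off, Accept-Encoding is sent explicitly.
    my $tx = $ua->get( '/' => { 'Accept-Encoding' => 'gzip' } );

    my $encoding = $tx->res->headers->content_encoding;
    my $is_gzip  = $tx->res->body =~ /\A\x1f\x8b/ ? 1 : 0;    # gzip magic bytes

    print 'Content-Encoding: ', $encoding // 'undef', "\n";
    print 'Body still compressed: ', ( $is_gzip ? 'yes' : 'no' ), "\n";
    ```

    Against a real site, the same two lines ($ua->transactor->compressed(0) plus an explicit Accept-Encoding header) would replace the embedded route; the trade-off is that the body then has to be decompressed by hand if it is needed.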

    If you want to keep automatic decompression and examine the headers before the decompression stage is reached (which also removes the Content-Encoding header), you can register a callback on the content's body event. This event is emitted after the headers are parsed but before the body is processed.

    use strict;
    use warnings;
    use 5.30.0;
    use Test::More tests => 3;
    use Data::Dumper;
    use Mojo::UserAgent;     # Version 8.26
    
    my $ua = Mojo::UserAgent->new();
    
    # As documented: https://docs.mojolicious.org/Mojolicious/Guides/Cookbook#Decorating-follow-up-requests
    $ua->once(
              start => sub {
                  my ( $ua, $tx ) = @_;
                  $tx->req->headers->header( 'Accept-Encoding' => 'gzip' );
    
                  my $res = $tx->res;
                  say 'register event listener';
                  $res->content->on(body=>sub{test_res_encoding($tx)});
              }
          );
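    # Optional: this disables automatic decompression altogether. The body
    # callback registered above sees Content-Encoding even without this line.
    $ua->transactor->compressed(0);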
    
    my $tx = $ua->get('https://www.mojolicious.org');
    
    is( $tx->req->headers->header('Accept-Encoding'), 'gzip', qq{Request Accept-Encoding is "gzip"} );
    
    ok( $tx->res->is_success, "Response is success" );
    
    #say Dumper $tx->res->headers;
    # Called from the body event, i.e. after the response headers have been
    # parsed but before the body is decompressed, so the original
    # Content-Encoding header is still present at this point.
    sub test_res_encoding{
        my $tx = shift;
        is( $tx->res->headers->header('Content-Encoding'),
            'gzip',
            qq{Response Content-Encoding is "gzip"} );
    }
    

    Setting MOJO_EVENTEMITTER_DEBUG=1 in your environment makes Mojo::EventEmitter print each emitted event to STDERR, which helps to see what is going on.
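
    To see the ordering described above without any network traffic, here is a rough sketch that feeds a hand-built gzip response straight into Mojo::Message::Response. The payload and the raw response are made up for illustration, and auto_decompress is switched on explicitly as an assumption, so the sketch does not rely on whatever the user agent configures for its own responses.

    ```perl
    use strict;
    use warnings;
    use IO::Compress::Gzip qw(gzip $GzipError);
    use Mojo::Message::Response;

    # Hypothetical payload, compressed the way a server would send it
    my $plain = 'Hello Mojo! ' x 50;
    gzip( \$plain => \my $gz ) or die "gzip failed: $GzipError";

    # Raw HTTP response, handed to the parser instead of coming off a socket
    my $raw = join "\x0d\x0a",
        'HTTP/1.1 200 OK',
        'Content-Type: text/plain',
        'Content-Encoding: gzip',
        'Content-Length: ' . length($gz),
        '',
        $gz;

    my $res = Mojo::Message::Response->new;
    $res->content->auto_decompress(1);    # made explicit for this standalone parse

    # The body event fires once the headers are complete, before any body
    # chunk has been decompressed.
    my $during;
    $res->content->on(
        body => sub {
            my $content = shift;
            $during = $content->headers->content_encoding;
        }
    );

    $res->parse($raw);

    my $after = $res->headers->content_encoding;
    print 'During body event: ', $during // 'undef', "\n";
    print 'After parsing:     ', $after  // 'undef', "\n";
    print 'Body decoded:      ', ( $res->body eq $plain ? 'yes' : 'no' ), "\n";
    ```

    The header value captured inside the callback should be gzip, while the header read after parsing should be gone, which matches the undef seen in the failing test from the question.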