I'm trying to save a Google Drive shared file using its actual filename received from the server (content-disposition):
I tried to analyze the header:
use strict;
use warnings;
use Data::Dumper;
use LWP::UserAgent qw( );
my $str = 'https://drive.google.com/file/d/0B6vqTWO9kmdmdzk5ejhDSXgzMDg/view?usp=sharing';
$str =~ /file\/d\/(\w+)/;
my $url = 'https://drive.google.com/uc?export=download&id='.$1;
my $ua = LWP::UserAgent->new();
my $response = $ua->head($url)->{'_headers'};
print Dumper( $response );
I got this:
$VAR1 = bless( {
'client-ssl-cert-subject' => '/C=US/ST=California/L=Mountain View/O=Google Inc/CN=*.googleusercontent.com',
'connection' => 'close',
'date' => 'Mon, 19 Oct 2015 14:45:45 GMT',
'content-type' => 'text/html; charset=UTF-8',
'x-guploader-uploadid' => 'AEnB2UrQJIoJUIIhWnKz9HAlW_2XKApLe_0IDMZjS0gGQOMdRaF68Od2xsxssp7mBdQP9kNrjvDueWUP5pSa1eHbprSjbPvfbA',
'alternate-protocol' => '443:quic,p=1',
'expires' => 'Mon, 19 Oct 2015 14:45:45 GMT',
'::std_case' => {
'client-ssl-socket-class' => 'Client-SSL-Socket-Class',
'client-ssl-cert-subject' => 'Client-SSL-Cert-Subject',
'client-ssl-cipher' => 'Client-SSL-Cipher',
'client-peer' => 'Client-Peer',
'x-guploader-uploadid' => 'X-GUploader-UploadID',
'alternate-protocol' => 'Alternate-Protocol',
'alt-svc' => 'Alt-Svc',
'client-ssl-cert-issuer' => 'Client-SSL-Cert-Issuer',
'client-date' => 'Client-Date',
'client-response-num' => 'Client-Response-Num'
},
'client-ssl-cert-issuer' => '/C=US/O=Google Inc/CN=Google Internet Authority G2',
'server' => 'UploadServer',
'client-date' => 'Mon, 19 Oct 2015 14:45:55 GMT',
'client-ssl-socket-class' => 'IO::Socket::SSL',
'client-ssl-cipher' => 'ECDHE-ECDSA-AES128-GCM-SHA256',
'client-peer' => '2a00:1450:400c:c08::84:443',
'alt-svc' => 'quic=":443"; p="1"; ma=604800',
'cache-control' => 'private, max-age=0',
'client-response-num' => 1
}, 'HTTP::Headers' );
I expected to find the content-disposition header within the above. On the other hand, wget correctly gives the filename:
#wget --content-disposition "https://drive.google.com/uc?export=download&id=0B6vqTWO9kmdmdzk5ejhDSXgzMDg"
--2015-10-19 20:21:37-- https://drive.google.com/uc?export=download&id=0B6vqTWO9kmdmdzk5ejhDSXgzMDg
Resolving drive.google.com (drive.google.com)... 2a00:1450:400c:c04::64, 74.125.206.139, 74.125.206.101, ...
Connecting to drive.google.com (drive.google.com)|2a00:1450:400c:c04::64|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://doc-0s-2s-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/unidk5uvfpl9kut1rl3hb5lqcvis8vdq/1445263200000/06380472059566149580/*/0B6vqTWO9kmdmdzk5ejhDSXgzMDg?e=download [following]
Warning: wildcards not supported in HTTP.
--2015-10-19 20:21:37-- https://doc-0s-2s-docs.googleusercontent.com/docs/securesc/ha0ro937gcuc7l7deffksulhg5h7mbp1/unidk5uvfpl9kut1rl3hb5lqcvis8vdq/1445263200000/06380472059566149580/*/0B6vqTWO9kmdmdzk5ejhDSXgzMDg?e=download
Resolving doc-0s-2s-docs.googleusercontent.com (doc-0s-2s-docs.googleusercontent.com)... 2a00:1450:400c:c08::84, 74.125.140.132
Connecting to doc-0s-2s-docs.googleusercontent.com (doc-0s-2s-docs.googleusercontent.com)|2a00:1450:400c:c08::84|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16 [text/plain]
Saving to: ‘testdoc.txttestdoc.txt’
testdoc.txttestdoc.txt 100%[====================================================================================================================>] 16 --.-KB/s in 0s
2015-10-19 20:21:38 (550 KB/s) - ‘testdoc.txttestdoc.txt’ saved [16/16]
How can I get the correct filename from the server using perl?
The difference is that you're not doing a HEAD
with wget. If you look at the status of the response in your Perl code you're getting
503 Service Unavailable
which could refer to many things, but in this case means that HEAD
isn't supported. Change that for a GET
and all is well
use strict;
use warnings;
use feature 'say';
use LWP::UserAgent;
my $str = 'https://drive.google.com/file/d/0B6vqTWO9kmdmdzk5ejhDSXgzMDg/view?usp=sharing';
die unless $str =~ m{file/d/(\w+)};
my $url = "https://drive.google.com/uc?export=download&id=$1";
my $ua = LWP::UserAgent->new;
my $resp = $ua->get($url);
say $resp->status_line;
say $resp->header('Content-Disposition');
200 OK
attachment;filename="testdoc.txt";filename*=UTF-8''testdoc.txt