I'm trying to write a YouTube downloader script in Perl by filling in the form on a YouTube downloader site with a YouTube link, submitting it, and accepting the file to download.
I have tried many different approaches but couldn't fill in the form; I think it's because the form is inside <div> tags.
How can I do that?
Thank you.
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use WWW::Mechanize;
my $url = 'http://www.youtube-mp3.org';
my $m = WWW::Mechanize->new();
my $textbox = 'youtube-url';
my $youtubelink = "http\:\/\/www.youtube.com\/watch\?v\=lDK9QqIzhwk";
my $cgi = CGI->new();
my $form = $cgi->Vars;
$m->get($url);
$m -> form_number ('1');
$m -> field($youtubelink, $form->{$textbox});
$m -> submit();
$m->submit();
print $m -> content();
$m->follow_link(text_regex => qr/Download/i);
my $response = $m->res();
my $filename = $response->filename;
if (! open ( FOUT, ">$filename" ) ) {
die("Could not create file: $!" );
}
print( FOUT $m->response->content() );
close( FOUT );
In the line
$m -> field($youtubelink, $form->{$textbox});
you are passing the YouTube link as the first argument, but the field method expects the field's name there, not its value; the value goes second. (Besides that, the input on that page doesn't have a name attribute.)
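As a minimal sketch of the expected argument order, the snippet below fills a field by name using HTML::Form, the module WWW::Mechanize relies on for its forms. The HTML is a made-up stand-in (the real page's input has no name attribute), and the /convert action and example.invalid base URL are placeholders chosen only for illustration.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical form: unlike the real page, this input has a name attribute.
my $html = <<'HTML';
<form action="/convert" method="get">
  <input type="text" name="youtube-url" value="">
  <input type="submit" value="Convert">
</form>
HTML

# In scalar context parse() returns the first form in the document.
my $form = HTML::Form->parse($html, 'http://example.invalid/');

my $youtubelink = 'http://www.youtube.com/watch?v=lDK9QqIzhwk';

# Field name first, new value second (the same order $m->field uses).
$form->value('youtube-url', $youtubelink);

print $form->value('youtube-url'), "\n";
```

With WWW::Mechanize the equivalent call would be $m->field('youtube-url', $youtubelink), which only works when the form really has an input with that name.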
I also think you cannot just fill in the form fields and submit it. If you look at the page source, you will see that it runs some JavaScript code before submitting the form.
Looking at the network traffic when the submit button is clicked shows a sequence of two GET requests.
The first is this
http://www.youtube-mp3.org/a/pushItem/?item=http%3A//www.youtube.com/watch%3Fv%3DKMU0tzLwhbE&el=na&bf=false&r=1377433712868
returning
KMU0tzLwhbE
and then a second call (notice that video_id is the value returned by the first GET)
http://www.youtube-mp3.org/a/itemInfo/?video_id=KMU0tzLwhbE&ac=www&t=grp&r=1377433713366
which returns
info = { "title" : "Developers", "image" : "http://i.ytimg.com/vi/KMU0tzLwhbE/default.jpg", "length" : "3", "status" : "serving", "progress_speed" : "", "progress" : "", "ads" : "", "pf" : "", "h" : "c21be9b00d0ec12ea980bf828d09440c" };
So my suggestion is to build the first URL, request it with GET, extract the video id from the response, and then GET the second URL built from that id.
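A rough sketch of that two-step flow, based only on the two requests captured above. The helper names (escape_item, push_item_url, item_info_url, parse_info, fetch_info) are made up for illustration, and treating the r parameter as a millisecond timestamp is a guess from its value; the site may expect something else or may have changed since.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

my $base = 'http://www.youtube-mp3.org';

# Percent-encode everything except unreserved characters and "/",
# which reproduces the encoding seen in the captured pushItem URL.
sub escape_item {
    my ($text) = @_;
    $text =~ s{([^A-Za-z0-9\-_.~/])}{sprintf '%%%02X', ord $1}ge;
    return $text;
}

# Assumption: "r" is a millisecond timestamp used as a cache buster.
sub push_item_url {
    my ($video_url, $r) = @_;
    return "$base/a/pushItem/?item=" . escape_item($video_url)
         . "&el=na&bf=false&r=$r";
}

sub item_info_url {
    my ($video_id, $r) = @_;
    return "$base/a/itemInfo/?video_id=$video_id&ac=www&t=grp&r=$r";
}

# The second response looks like "info = { ... };", so strip the
# assignment prefix and the trailing semicolon, then decode the JSON.
sub parse_info {
    my ($body) = @_;
    $body =~ s/^\s*info\s*=\s*//;
    $body =~ s/;\s*$//;
    return decode_json($body);
}

# Hypothetical driver: performs both GET requests over the network.
sub fetch_info {
    my ($video_url) = @_;
    require WWW::Mechanize;    # loaded only when a request is made
    my $m = WWW::Mechanize->new();

    $m->get(push_item_url($video_url, int(time() * 1000)));
    my $video_id = $m->content;
    $video_id =~ s/\s+//g;     # drop any surrounding whitespace

    $m->get(item_info_url($video_id, int(time() * 1000)));
    return parse_info($m->content);
}
```

Calling fetch_info('http://www.youtube.com/watch?v=KMU0tzLwhbE') would return a hash reference with keys such as title and status, assuming the site still answers the way it did in the capture above.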