I'm able to grab the first image fine, but then the content seems to be looping inside itself. Not sure what I'm doing wrong.
#!/usr/bin/perl
use LWP::Simple;
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
for(my $id=1;$id<55;$id++)
{
    my $response = $ua->get("http://www.gamereplays.org/community/index.php?act=medals&CODE=showmedal&MDSID=" . $id );
    my $content = $response->content;
    for(my $id2=1;$id2<10;$id2++)
    {
        $content =~ /<img src="http:\/\/www\.gamereplays.org\/community\/style_medals\/(.*)$id2\.gif" alt=""\/>/;
        $url = "http://www.gamereplays.org/community/style_medals/" . $1 . $id2 . ".gif";
        print "--\n\r";
        print "ID: ".$id."\n\r";
        print "ID2: ".$id2."\n\r";
        print "URL: ".$url."\n\r";
        print "1: ".$1."\n\r";
        print "--\n\r";
        getstore($url, $1 . $id2 . ".gif");
    }
}
As others have stated, this is really a job for HTML::Parser. You should also add 'use strict;'. Keep 'use LWP::Simple;', though: getstore comes from that module.
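The "looping inside itself" symptom most likely comes from two details of the original regex: `(.*)` is greedy, so it can swallow several `<img>` tags at once, and a failed match does not reset `$1`, which keeps the capture from the last successful match. Here is a minimal offline sketch of both effects. The `$html` string is made up for illustration and is not copied from the real page.

```perl
use strict;
use warnings;

# Made-up sample (assumption): two medal images on a single line.
my $html = '<img src="http://www.gamereplays.org/community/style_medals/comp_1.gif" alt=""/>'
         . '<img src="http://www.gamereplays.org/community/style_medals/staff_2.gif" alt=""/>';

my $prefix = 'http://www\.gamereplays\.org/community/style_medals/';

# Greedy (.*) starts at the first URL and runs as far as it can before
# backtracking to "2.gif", so the capture includes the whole first tag.
my ($greedy) = $html =~ m{${prefix}(.*)2\.gif};

# A character class that cannot cross a double quote stays inside one tag.
my ($bounded) = $html =~ m{${prefix}([^"]*)2\.gif};

# A failed match leaves $1 untouched.
$html =~ m{${prefix}([^"]*)1\.gif};   # succeeds and sets $1
my $before = $1;
$html =~ m{${prefix}([^"]*)9\.gif};   # fails, there is no "9.gif"
my $after = $1;                       # still the capture from the match above

print "greedy:  $greedy\n";
print "bounded: $bounded\n";
print "stale:   $after\n";
```

The bounded capture is `staff_`, while the greedy one also contains `comp_1.gif` and the markup between the two tags. Checking the result of each match (for example `if ($content =~ ...)`) before using `$1` avoids the stale value.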
You could change your regex to the following:
$content =~ m{http://www\.gamereplays\.org/community/style_medals/([\w\_]+)$id2\.gif}s;
But you still won't get style_medals/comp_graphics_10.gif, because $id2 only runs from 1 to 9. That may be what you want. I think something like the following would work better. My apologies for the style changes, but I can't resist modifying for PBP (Perl Best Practices).
#!/usr/bin/perl
use strict;
use Carp;
use LWP::Simple qw(getstore);   # getstore() is called below
use LWP::UserAgent;

my $ua = LWP::UserAgent->new();

# Fetch pages 1 to 54. Are we sure we won't have page 55 or later?
# Perhaps consider running until a 404 is found.
for (my $id = 1; $id < 55; $id++) {
    # Get the page data
    my $response = $ua->get( 'http://www.gamereplays.org/community/index.php?act=medals&CODE=showmedal&MDSID='.$id );

    # Check for failure and abort
    if (!$response->is_success) {
        croak 'Request failed! '.$response->status_line();
    }
    my $content = $response->content();

    # Run this loop each time we find the url; the substitution removes
    # the matched tag so the next pass finds the next image
    CONTENT_LOOP:
    while ($content =~ s{<img src="(http://www\.gamereplays\.org/community/style_medals/([^\"]+))" }{}ms) {
        my $url  = $1; # The entire url, no need to recreate the domain
        my $file = $2; # Just the file name portion
        my ($id2) = $file =~ m{ _(\d+)\.gif \Z}xms; # extract id2 for debug
        next CONTENT_LOOP if !defined $id2; # Handle SOTW.gif file(s)

        # Display stats about each id found
        print "--\n";
        print "ID: $id\n";
        print "ID2: $id2\n";
        print "URL: $url\n";
        print "1: $file\n";
        print "--\n";

        # You might want to consider involving the $id in the filename as
        # you could have the same filename on multiple pages
        getstore( $url, $file);
    }
}
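If you would rather not modify $content while scanning it, and you want the page id in the local filename as the comment above suggests, a rough sketch could look like this. The helper name `medal_files`, the `<id>_<file>` naming scheme, and the sample page fragment are all assumptions for illustration. It runs offline and only prints what it would fetch, so the getstore call still has to be added.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [url, local_name] pairs for one page.
sub medal_files {
    my ($id, $content) = @_;
    my @found;
    # With /g in a while loop, each pass resumes after the previous match,
    # so $content itself is left untouched.
    while ($content =~ m{<img\s+src="(http://www\.gamereplays\.org/community/style_medals/([^"/]+_\d+\.gif))"}g) {
        my ($url, $file) = ($1, $2);
        # Prefix the page id so equal filenames on different pages do not collide.
        push @found, [ $url, "${id}_$file" ];
    }
    return @found;
}

# Made-up page fragment (assumption), including a file without a numeric suffix.
my $page = join "\n",
    '<img src="http://www.gamereplays.org/community/style_medals/comp_graphics_10.gif" alt=""/>',
    '<img src="http://www.gamereplays.org/community/style_medals/SOTW.gif" alt=""/>',
    '<img src="http://www.gamereplays.org/community/style_medals/staff_2.gif" alt=""/>';

for my $pair (medal_files(7, $page)) {
    my ($url, $name) = @$pair;
    print "$name <- $url\n";
}
```

Files such as SOTW.gif are skipped because the pattern requires an underscore and digits before `.gif`. In the real script, each pair would be passed to `getstore($url, $name)`.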