I have a web scraping script written in Perl. At first it works perfectly, but after about 3,500 GET requests the server returns a 403 error (Forbidden; my IP is not banned). When I run the same scraper written in Python I hit the same problem: it works, but after about 3,500 requests I get a 403. It starts working again after 24 hours. I don't know what the problem is or how I can fix it.
I read about libwww-perl being blocked:
https://cloudkul.com/blog/block-libwww-perl-attack-in-apache-web-server/
Use the agent method provided by LWP::UserAgent to change the "user agent identification string".
It should solve blocking based on client identification string.
It will not solve blocking based on abusive behavior.
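As a rough illustration of both points, here is a minimal sketch. The product token `MyScraper/1.0`, the helper names `build_ua` and `fetch_all`, and the delay value are all hypothetical and not from the question. The sketch sets a custom agent string and pauses between requests. Whether a slower crawl avoids the 403 depends entirely on the server's rules, so treat it as something to try rather than a guaranteed fix.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that identifies itself with a custom string
# instead of the default "libwww-perl/#.###".
sub build_ua {
    my ($agent_string) = @_;
    my $ua = LWP::UserAgent->new(timeout => 30);
    $ua->agent($agent_string);
    return $ua;
}

# Fetch each URL in turn, sleeping after each request so the crawl
# does not arrive as one burst of automated traffic.
# Returns a hash reference mapping each fetched URL to its status code.
sub fetch_all {
    my ($ua, $delay_seconds, @urls) = @_;
    my %status;
    for my $url (@urls) {
        my $response = $ua->get($url);
        $status{$url} = $response->code;
        last if $response->code == 403;   # stop instead of hammering a server that refuses us
        sleep $delay_seconds;
    }
    return \%status;
}

# 'MyScraper/1.0' is a hypothetical product token; choose your own.
my $ua = build_ua('MyScraper/1.0');
```

A call such as `fetch_all($ua, 2, @urls)` would then wait two seconds after each request and stop at the first 403.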
perldoc LWP::UserAgent
agent
    my $agent = $ua->agent;
    $ua->agent('Checkbot/0.4 ');    # append the default to the end
    $ua->agent('Mozilla/5.0');
    $ua->agent("");                 # don't identify
Get/set the product token that is used to identify the user agent on the network. The agent value is sent as the User-Agent header in the requests.

The default is a string of the form libwww-perl/#.###, where #.### is substituted with the version number of this library.

If the provided string ends with space, the default libwww-perl/#.### string is appended to it.

The user agent string should be one or more simple product identifiers with an optional version number separated by the / character.