unix download automation httrack

httrack follow redirects


I am trying to mirror web pages recursively, starting from a URL supplied by the user (with a depth limit, of course). Wget didn't pick up links from CSS/JS files, so I decided to use httrack.

I try to mirror a site like this:

# httrack http://onet.pl -r6 --ext-depth=6 -O ./a "+*"

This website responds with a 301 redirect to http://www.onet.pl:80, and httrack just downloads an index.html page containing:

<a HREF="onet.pl/index.html" >Page has moved</a>

and nothing more! When I run:

# httrack http://www.onet.pl -r6 --ext-depth=6 -O ./a "+*"

it does what I want.

Is there any way to make httrack follow redirects? Currently I just prepend "www." to the URL passed to httrack, but that's not a real solution (it doesn't cover all use cases). Are there any better website mirroring tools for Linux?


Solution

  • On the main httrack forum, one of the developers said that this is not possible.

    The proper solution is to use another web mirroring tool.
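
If switching tools is not an option, one possible workaround is to resolve the redirect yourself and pass the final URL to httrack. The following is only a sketch, not something taken from the httrack forum: it assumes curl is installed, and resolve_url and mirror_site are hypothetical helper names. It only handles a redirect on the starting URL, not redirects met later during the crawl.

```shell
#!/bin/sh
# Sketch: resolve HTTP redirects first, then hand the final URL to httrack.
# Assumption: curl is available. resolve_url and mirror_site are
# hypothetical helper names, not part of httrack.

# Print the URL that remains after following all redirects.
resolve_url() {
    # -s silent, -I send HEAD requests, -L follow redirects,
    # -o /dev/null discard the headers, -w print the final effective URL.
    curl -sIL -o /dev/null -w '%{url_effective}' "$1"
}

# Mirror a site into directory $2 with the same httrack options
# as in the question, starting from the resolved URL.
mirror_site() {
    final_url=$(resolve_url "$1") || return 1
    [ -n "$final_url" ] || return 1
    httrack "$final_url" -r6 --ext-depth=6 -O "$2" "+*"
}

# Usage: mirror_site http://onet.pl ./a
```

Some servers answer HEAD requests differently from GET requests; in that case dropping -I from the curl call (so the page body is fetched and discarded) may give a more reliable result.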