Creating directories with wget

I need to download files from several pages using wget -r -l 1 -nd -H --accept-regex '[0-9]{4}/[0-9]{3}.pdf' -i list.txt. The file list.txt lists all the pages I need to download from, one per line, like

and so on.

I'm trying to create different folders for each source, so that all the files downloaded from are in a folder named 001, all those from are in a folder 002, and so on.

How could I do that?


  • You can use -P to tell GNU wget where to store downloads, e.g.

    wget -P examplepage -np -r -l 1

    will store what it downloads inside the examplepage directory. The directory is created if it does not exist yet.

    I'm trying to create different folders for each source, so that all the files downloaded from are in a folder named 001, all those from are in a folder 002, and so on.

    I do not know whether this is possible with a single wget call. You can use a loop to process the file line by line. For example, say the content of urls.txt is

    and I want the 1st to go into a directory named 001, the 2nd into a directory named 002, and the 3rd into a directory named 003. I could do that with

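    # Read urls.txt line by line; IFS= and -r keep whitespace and backslashes intact
    while IFS= read -r line; do
        # Strip everything up to and including "page=" to get the directory name
        dirname=$(echo "$line" | sed 's/.*page=//')
        wget -P "$dirname" "$line"
    done < urls.txt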

    Explanation: the while loop processes urls.txt line by line, and sed builds the directory name by removing everything up to and including page= from the URL.
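
If the URLs do not carry a usable page= number, one alternative is to name the directories by line position instead. This is only a minimal sketch, assuming a POSIX shell: the function name download_numbered and the WGET override variable are hypothetical (they are not part of wget), and the wget options are simply the ones from the question.

```shell
#!/bin/sh
# Sketch: download each URL from a list into its own zero-padded directory
# (001, 002, ...), numbered by the URL's position in the file.
# download_numbered and WGET are hypothetical names used for illustration.

download_numbered() {
    list=$1
    n=0
    # "|| [ -n "$url" ]" also handles a last line without a trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines so they do not use up a directory number.
        [ -n "$url" ] || continue
        n=$((n + 1))
        dir=$(printf '%03d' "$n")
        # -P sets the directory prefix; the other options come from the question.
        "${WGET:-wget}" -P "$dir" -r -l 1 -nd -H \
            --accept-regex '[0-9]{4}/[0-9]{3}.pdf' "$url"
    done < "$list"
}
```

Calling download_numbered list.txt should then put everything fetched from the first URL into 001, from the second into 002, and so on. Setting WGET=echo beforehand prints the commands instead of running them, which is a cheap way to check the directory numbering.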