How to download an HTTP directory with all files and sub-directories as they appear in the online file/folder listing?


There is an online HTTP directory that I have access to. I tried to download all of its sub-directories and files with wget. The problem is that when wget reaches a sub-directory, it downloads only the index.html file that lists the files in that directory, not the files themselves.

Is there a way to download the sub-directories and files without a depth limit, as if the directory were just a folder that I want to copy to my computer?


Solution

    wget -r -np -nH --cut-dirs=3 -R index.html http://hostname/aaa/bbb/ccc/ddd/
    

    Explanation:

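    • The command downloads all files and sub-directories under the ddd directory.
    • -r : recurse into sub-directories. wget stops at a depth of 5 by default; add -l inf to remove the limit.
    • -np : do not ascend to parent directories, like ccc/…
    • -nH : do not create a local folder named after the host (hostname).
    • --cut-dirs=3 : drop the first 3 path components (aaa, bbb, ccc) so that the files are saved under ddd.
    • -R index.html : reject files named index.html, so the directory listing pages are not kept.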
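
    The value passed to --cut-dirs depends on how deep the target directory sits in the URL, so it can be derived rather than counted by hand. The sketch below is not part of the original answer: cut_dirs_for is a hypothetical helper name, and the script only prints a wget command instead of running it. It also assumes two optional tweaks: -l inf to lift the default depth limit, and the pattern 'index.html*' to also reject the sorted listing variants that some servers generate (for example index.html?C=N;O=D).

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the original answer):
# print how many leading path components --cut-dirs must drop so that
# only the last directory of the URL is kept locally.
cut_dirs_for() {
    _path=${1#*://}     # strip the scheme:        hostname/aaa/bbb/ccc/ddd/
    _path=${_path#*/}   # strip the host:          aaa/bbb/ccc/ddd/
    _path=${_path%/}    # strip a trailing slash:  aaa/bbb/ccc/ddd
    # N path components are separated by N-1 slashes, and N-1 is exactly
    # the number of parent directories to cut.
    echo "$(( $(printf '%s' "$_path" | tr -cd '/' | wc -c) ))"
}

# Placeholder URL taken from the answer; pass a real one as the first argument.
target=${1:-http://hostname/aaa/bbb/ccc/ddd/}
depth=$(cut_dirs_for "$target")

# Print the command for review rather than executing it.
echo "wget -r -l inf -np -nH --cut-dirs=$depth -R 'index.html*' $target"
```

    Printing the command first makes it easy to check the computed --cut-dirs value before any download starts; once it looks right, the printed line can be pasted into the shell.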

    Reference: http://bmwieczorek.wordpress.com/2008/10/01/wget-recursively-download-all-files-from-certain-directory-listed-by-apache/