filesize directory-structure rclone

How can I find the size of an entire online open directory?


I would like to know the total size of an open directory online. How can I do this?

An example use-case might be to find out how large a repository mirror would be.

How can I find the total size of open directories such as http://old-releases.ubuntu.com/, http://download.kiwix.org/, or http://apollo.sese.asu.edu/data/?


Solution

  • This is a fairly quick and simple way to find the size of any open directory online.

    TL;DR

    Install rclone, then run rclone size against the URL of the directory you want to measure.

    Install rclone (prebuilt binaries are also available from the rclone downloads page)

    curl https://rclone.org/install.sh | sudo bash
    

    Get the directory size (replace the URL with any open directory, but keep the trailing :http:)

    rclone size --http-url http://old-releases.ubuntu.com/ :http:
    

    Explanation

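    rclone's http backend, optionally combined with rclone mount, does the trick.

    This leaves you free to measure the size in several ways. Run rclone size http: directly, or mount the remote with rclone mount http: directory/, cd directory/, and then use du -sh, du -hd1, ncdu, or (NOT recommended) ls -shR. Here http: stands for an HTTP remote; the --http-url <URL> :http: form from the TL;DR can be used in its place.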

    Be Kind

    To avoid hammering the server, adjust the limits (--tpslimit, --bwlimit, --checkers) and optionally add or remove --fast-list in this command:

    rclone size http: -v --tpslimit 5 --bwlimit 500K --checkers 5 --fast-list
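
If you run throttled checks often, the limits can be wrapped in a small shell function. This is only a sketch: polite_size and the RCLONE variable are hypothetical names made up for this example, not part of rclone, and the defaults simply mirror the values in the command above (without --fast-list).

```shell
# Hypothetical helper, not part of rclone: wraps "rclone size" with
# conservative limits so a single URL argument is enough.
# Usage: polite_size URL [TPSLIMIT] [BWLIMIT] [CHECKERS]
polite_size() {
    url="$1"
    tps="${2:-5}"        # max HTTP transactions per second
    bw="${3:-500K}"      # bandwidth limit
    checkers="${4:-5}"   # number of parallel checkers

    # RCLONE is an assumed override; set RCLONE=echo to preview
    # the arguments without contacting the server.
    "${RCLONE:-rclone}" size --http-url "$url" :http: -v \
        --tpslimit "$tps" --bwlimit "$bw" --checkers "$checkers"
}

# Example (commented out so sourcing this file makes no requests):
# polite_size http://download.kiwix.org/
# polite_size http://download.kiwix.org/ 10 1M 8
```

Running it with RCLONE=echo prints the argument list instead of executing rclone, which is a cheap way to check the limits before pointing it at a real server.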

    Adjust the values up or down according to your needs and what you think the server can handle. For example, I ran rclone size against a server that I thought could handle a higher load, and within a couple of minutes got the results below.

    Examples

    $ rclone size --http-url http://apollo.sese.asu.edu/data/ :http: --checkers 100
    
    Total objects: 195669
    Total size: 123.619 TBytes (135920738673216 Bytes)
    
    $ rclone size --http-url http://download.kiwix.org :http: --checkers 100
    Total objects: 24624
    Total size: 5.711 TBytes (6279328743124 Bytes)
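
The figures above are consistent with "TBytes" meaning 1024^4 bytes rather than 10^12. A quick sanity check, as a sketch (bytes_to_tib is a hypothetical helper name, and awk is assumed to be available):

```shell
# Hypothetical helper: convert a raw byte count to units of 1024^4 bytes,
# rounded to three decimals, to compare against the "Total size" line.
bytes_to_tib() {
    # 1099511627776 = 1024^4
    awk -v b="$1" 'BEGIN { printf "%.3f\n", b / 1099511627776 }'
}

bytes_to_tib 135920738673216   # prints 123.619
bytes_to_tib 6279328743124     # prints 5.711
```

Both values match the totals reported for the two servers, so when planning disk space for a mirror, budget from the raw byte count rather than reading the headline figure as decimal terabytes.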