Tags: bash, curl, web-crawler, wget, lynx

Creating a static copy of a web page on UNIX command line or shell script


I need to create a static copy of a web page (including all media resources such as CSS, images and JS) from a shell script. It should be possible to open this copy offline in any browser.

Some browsers have similar functionality (Save As... Web Page, complete) which creates a folder from a page and rewrites external resources as relative static resources in this folder.

What's a way to accomplish and automate this on the Linux command line for a given URL?


Solution

  • You can use wget like this:

    wget --recursive --convert-links --domains=example.org http://www.example.org
    

    This command will recursively download every page reachable by hyperlinks from the page at www.example.org, without following links outside the example.org domain.

    Check the wget manual page for more options to control recursion.
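
    If the goal is a static copy of a single page rather than a whole site, wget's --page-requisites option is closer to the browser's "Save As... Web Page, complete" behaviour. Below is a minimal sketch of a wrapper script; the script name, argument handling and usage message are illustrative assumptions, not part of the original answer.

    #!/bin/sh
    # Minimal sketch: snapshot a single page plus the resources it needs
    # (CSS, images, JS) so it can be opened offline. The directory layout
    # is whatever wget produces (one directory per host).
    #
    #   --page-requisites   fetch the CSS, images and JS needed to render the page
    #   --convert-links     rewrite links in the saved files to the local copies
    #   --adjust-extension  add .html/.css extensions where the URL lacks them
    #   --span-hosts        allow requisites served from other hosts (e.g. CDNs)

    url="${1:?usage: $0 URL}"

    wget --page-requisites --convert-links --adjust-extension --span-hosts "$url"

    For example, saving this as snapshot.sh (a hypothetical name) and running sh snapshot.sh http://www.example.org downloads the page and its requisites into a www.example.org/ directory that can be opened offline.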