Creating a static copy of a MoinMoin site


I have a MoinMoin site which I've inherited from a previous system administrator. I'd like to shut it down but keep a static copy of the content as an archive, ideally with the same URLs. At the moment I'm trying to accomplish this using wget with the following parameters:

--mirror
--convert-links
--page-requisites
--no-parent
-w 1
-e robots=off
--user-agent="Mozilla/5.0"
-4
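
Put together, the invocation looks roughly like the sketch below. The wiki URL is a placeholder, the user-agent option is written in its long form (`--user-agent`), and the `run` wrapper with its `DRY_RUN` switch is only an illustrative addition so that the command can be printed and checked before it is pointed at the live site.

```shell
#!/bin/sh
# Placeholder address (an assumption); substitute the real wiki URL.
WIKI_URL="${WIKI_URL:-https://wiki.example.org/}"

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Mirror the site given as the first argument using the options listed above.
mirror_wiki() {
    run wget \
        --mirror \
        --convert-links \
        --page-requisites \
        --no-parent \
        -w 1 \
        -e robots=off \
        --user-agent="Mozilla/5.0" \
        -4 \
        "$1"
}

mirror_wiki "$WIKI_URL"
```

Running it with `DRY_RUN=0` performs the actual mirror. As described below, this still leaves the attachments out.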

This seems to work for getting the HTML and CSS, but it fails to download any of the attachments. Is there an argument I can add to wget which will get round this problem?

Alternatively, is there a way I can tell MoinMoin to link directly to files in the HTML it produces? If I could do that then I think wget would "just work" and download all the attachments. I'm not bothered about the attachment URLs changing as they won't have been linked to directly in other places (e.g. email archives).

The site is running MoinMoin 1.9.x.

My version of wget:

$ wget --version
GNU Wget 1.16.1 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie -psl +ssl/openssl

Solution

  • The solution in the end was to use MoinMoin's export dump functionality:

    https://moinmo.in/FeatureRequests/MoinExportDump

    It doesn't preserve the file paths in the way that wget does, but it has the major advantage of including all the pages and their attachments.
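
    As a rough sketch of what that looks like on a 1.9.x install: the `moin` command-line script takes general options, then `export dump`, then the dump options. The config directory, wiki URL and target directory below are all placeholders, and the option names are as I remember them from the 1.9 script, so check them against the help text of your own installation before relying on them. The `run` wrapper is only there so the command can be printed first.

```shell
#!/bin/sh
# All three values are placeholders (assumptions), not taken from the question.
CONFIG_DIR="${CONFIG_DIR:-/etc/moin}"
WIKI_URL="${WIKI_URL:-https://wiki.example.org/}"
TARGET_DIR="${TARGET_DIR:-/srv/wiki-archive}"

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Dump the wiki as static HTML; arguments: config dir, wiki URL, output dir.
dump_wiki() {
    run moin --config-dir="$1" --wiki-url="$2" export dump --target-dir="$3"
}

dump_wiki "$CONFIG_DIR" "$WIKI_URL" "$TARGET_DIR"
```

    Set `DRY_RUN=0` to run the dump for real. It needs to be run on the wiki server, as a user who can read the wiki's data directory.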