wget

wget: how to exclude "domain.com/?p=" pages


When using wget -m to archive a website, what is the option to exclude all pages whose address has /?p= right after the domain?

For example, exclude these pages:

www.domain.com/?p=1
www.domain.com/?p=2
www.domain.com/?p=3

I know there's the -X option to exclude a list of folders, but these are not really folders. There's also the -R (reject) option, but that only applies to file name suffixes, which these are not either.


Solution

  • I suggest trying the --reject-regex urlregex option. From the wget manual:

    Specify a regular expression to(...)reject the complete URL.
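As a rough sketch of how this might look for the URLs in the question: the pattern `/\?p=` is my own guess at a suitable regex (the `?` is escaped because it is a regex metacharacter), and `should_skip` is a hypothetical helper that only checks the pattern locally with `grep -E`. It assumes wget treats the pattern as a POSIX extended regex that may match anywhere in the URL, which is my understanding of the default behaviour. The wget line itself is left commented out so nothing is downloaded by accident.

```shell
#!/bin/sh
# Assumed pattern: any URL containing "/?p=" (not taken from the wget docs).
reject_pattern='/\?p='

# Hypothetical helper: succeeds if the URL matches the reject pattern.
# Uses grep -E as a stand-in for wget's own regex matching.
should_skip() {
  printf '%s\n' "$1" | grep -Eq "$reject_pattern"
}

# Quick local check against the kind of URLs from the question.
for url in 'http://www.domain.com/?p=1' 'http://www.domain.com/about/'; do
  if should_skip "$url"; then
    echo "reject: $url"
  else
    echo "keep:   $url"
  fi
done

# The actual mirror command (uncomment to run against the real site):
# wget -m --reject-regex "$reject_pattern" http://www.domain.com/
```

Single-quoting the pattern keeps the shell from touching the backslash or the question mark before wget sees them.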