python windows subprocess httrack

Using subprocess to run HTTrack from Python in Windows


I'm writing a web scraping Python script, and one thing I'd like it to do is take a snapshot of certain pages (all of the HTML, style sheets, and images needed to view that page properly offline). HTTrack seems like a good way to do that, and I thought I could call it from within the script using

subprocess.call(["httrack", "http://www.example.com", "-O", "\tmp\example"])

But attempting to do this results in "FileNotFoundError: [WinError 2] The system cannot find the file specified". I've also tried giving it the full file path,

subprocess.call(["C:\Program Files\WinHTTrack\httrack.exe", "http://www.example.com", "-O", "\tmp\Example"])

but I get the error "SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape"

I think the problem is that I don't understand subprocess correctly, since I can get HTTrack working from the Windows command prompt. Can anyone help me understand the correct way to use subprocess?


Solution

  • Resolved thanks to eryksun's comment. It wasn't a problem with the subprocess syntax at all; I just wasn't escaping the backslashes in my Windows paths. Putting r in front of those strings to make them raw strings fixed my code.
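
For anyone landing here later, this is roughly what the fix looks like. It is a minimal sketch, not a tested HTTrack recipe: the install path and the output folder are the ones from the question, and build_httrack_command is a hypothetical helper added only for illustration. In a normal string literal Python treats sequences such as \t as escapes (and \U as the start of an 8-digit Unicode escape, which is what the unicodeescape error complains about), while a raw string keeps every backslash as written.

```python
import os
import subprocess

# Assumed install location, copied from the question; adjust for your machine.
HTTRACK_EXE = r"C:\Program Files\WinHTTrack\httrack.exe"


def build_httrack_command(url, output_dir, exe=HTTRACK_EXE):
    """Return the argument list for one HTTrack run (hypothetical helper)."""
    return [exe, url, "-O", output_dir]


# In a normal string "\t" is a single tab character;
# in a raw string it stays a backslash followed by "t".
assert "\tmp" != r"\tmp"
assert len("\tmp") == 3 and len(r"\tmp") == 4

# Doubling the backslashes is equivalent to using a raw string.
assert "\\tmp\\example" == r"\tmp\example"

command = build_httrack_command("http://www.example.com", r"\tmp\example")

if __name__ == "__main__" and os.path.isfile(HTTRACK_EXE):
    # Only runs when the executable really exists at the assumed path.
    subprocess.call(command)
```

Passing the full path to the executable also sidesteps the original FileNotFoundError, which is what subprocess raises on Windows when it cannot find the program named in the first list element.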