linux curl wget content-disposition

wget breaking with content-disposition


I am trying to download the KML file that is sent with Content-Disposition: attachment from the following URL:

http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia

I am using wget and curl with the following commands:

wget --content-disposition http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia

and

curl -O -J -L http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia

However, instead of saving the file being transmitted, it saves only the HTML content, and at the end of the transmission it gets stuck. The terminal output is:

$wget --content-disposition http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia
[1] 32260
[2] 32261
[3] 32262
work@Aspire-V3-471:~$ --2016-05-13 19:37:54--  http://waterwatch.usgs.gov/index.php?m=real
Resolving waterwatch.usgs.gov (waterwatch.usgs.gov)... 2001:49c8:0:126c::56, 137.227.242.56
Connecting to waterwatch.usgs.gov (waterwatch.usgs.gov)|2001:49c8:0:126c::56|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.php?m=real.5’

    [  <=>                                                                                                                                                  ] 41.637       174KB/s   in 0,2s   

2016-05-13 19:37:55 (174 KB/s) - ‘index.php?m=real.5’ saved [41637]

Then it gets stuck and I have to press Ctrl+C. The response headers I get are:

HTTP/1.1 200 OK
Date: Sat, 14 May 2016 00:19:21 GMT
Content-Disposition: attachment; filename="real_ia.kml"
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: application/vnd.google-earth.kml+xml
X-Frame-Options: SAMEORIGIN

Given these headers, I would expect the file 'real_ia.kml' to be downloaded. The curl command gives a similar result.

Why does it get stuck and download only the HTML content?


Solution

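  • The shell interprets each & character as the control operator that runs the preceding command in the background. Your command line is therefore split at every &, and wget requests only http://waterwatch.usgs.gov/index.php?m=real (as your output shows), which returns the HTML page instead of the KML file. You should escape or quote the & characters: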

    curl -O -J -L 'http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia'
    

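    The command above uses single quotes, which make the shell treat every character inside them literally. The same quoting works for the wget command.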

    The following lines from your output mean that three commands were started in the background:

    [1] 32260
    [2] 32261
    [3] 32262
    

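    The numbers on the left (in brackets) are job numbers. You can bring a job to the foreground by typing fg N, where N is the number of the job. The numbers on the right are process IDs. The apparent hang is most likely just the shell waiting for input: the prompt was printed before the background wget wrote its output, so pressing Enter should show a fresh prompt.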
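
To see what the command actually receives in each case, here is a minimal sketch that does not touch the network. The function show_arg is a hypothetical stand-in for wget or curl (it is not part of either tool) and simply prints its first argument. The sketch assumes a POSIX shell such as sh or bash run non-interactively, and that no local file happens to match the ? in the unquoted URL.

```shell
#!/bin/sh
# show_arg is a hypothetical stand-in for wget or curl: it prints the first
# argument it receives, so nothing is downloaded.
show_arg() { printf '%s\n' "$1"; }

# Single quotes: the whole URL reaches the command as one argument.
quoted=$(show_arg 'http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia')

# Backslashes: an alternative that escapes each special character individually.
escaped=$(show_arg http://waterwatch.usgs.gov/index.php\?m=real\&w=kml\&r=us\&regions=ia)

# No quoting: the shell splits the line at every &, so show_arg runs in the
# background with a truncated URL, and w=kml, r=us and regions=ia are treated
# as separate variable assignments. "wait" lets the background jobs finish.
unquoted=$(show_arg http://waterwatch.usgs.gov/index.php?m=real&w=kml&r=us&regions=ia
wait)

printf 'quoted:   %s\n' "$quoted"
printf 'escaped:  %s\n' "$escaped"
printf 'unquoted: %s\n' "$unquoted"
```

The quoted and escaped forms should print the same full URL, while the unquoted form should stop at m=real, matching the URL shown in the wget output in the question. Single quotes are usually the easier choice because nothing inside them needs individual escaping.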