linux http url command

Is it possible to read only the first N bytes from an HTTP server using a Linux command?


Given the URL http://www.example.com, can we read just the first N bytes of the page?

  • using wget, we can download the whole page.

  • using curl, there is the -r option: -r 0-499 requests the first 500 bytes. This seems to solve the problem, but the curl man page warns:

    You should also be aware that many HTTP/1.1 servers do not have this feature enabled, so that when you attempt to get a range, you'll instead get the whole document.

  • using urllib in Python. There is a similar question here, but is Konstantin's comment on it really true?

    Last time I tried this technique it failed because it was actually impossible to read from the HTTP server only specified amount of data, i.e. you implicitly read all HTTP response and only then read first N bytes out of it. So at the end you ended up downloading the whole 1Gb malicious response.

So, how can we read the first N bytes from the HTTP server in practice?


Solution

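  • curl -s <url> | head -c 500
    

    or

    curl -s <url> | dd bs=1 count=500
    

    should do: once head or dd has read 500 bytes it exits, the pipe closes, and curl stops downloading (it may already have fetched somewhat more than 500 bytes into its buffers).

    There are also simpler utilities with perhaps broader availability, such as

        netcat host 80 <<"HERE" | dd bs=1 count=500 of=output.fragment
    GET /urlpath/query?string=more&bloddy=stuff
    
    HERE
    

    Note the bs=1: without it, dd count=N counts 512-byte blocks rather than bytes. The request above is an HTTP/0.9-style request with no version or Host header; many current servers need GET /urlpath/... HTTP/1.0 plus a Host: line, in which case the output starts with the response headers.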

    Or

    GET /urlpath/query?string=more&bloddy=stuff
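
For the urllib case raised in the question, here is a minimal Python sketch. It assumes the standard-library urllib.request; the helper name read_first_bytes and its timeout parameter are made up for illustration and are not part of the original answer. It sends a Range header and, in case the server ignores it, still reads no more than n bytes of the body before closing the connection, so the program never pulls the whole response (the OS may have buffered somewhat more than n bytes by the time the socket is closed).

```python
import urllib.request


def read_first_bytes(url, n, timeout=10.0):
    """Return at most the first n bytes of the response body for url.

    Hypothetical helper for illustration only.
    """
    if n <= 0:
        return b""
    # Ask for bytes 0..n-1. A server without range support ignores this
    # header and answers 200 with the full body instead of 206.
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{n - 1}"})
    chunks = []
    remaining = n
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Read until we have n bytes or the body ends; never ask for more
        # than what is still missing.
        while remaining > 0:
            chunk = resp.read(remaining)
            if not chunk:
                break
            chunks.append(chunk)
            remaining -= len(chunk)
    # Leaving the with-block closes the connection, so the rest of the
    # body is not downloaded by this process.
    return b"".join(chunks)
```

Usage would look like read_first_bytes("http://www.example.com", 500). The explicit cap on the read is what protects against a server that ignores the Range header; the header itself is only a request to send less.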