
Unable to reproduce curl POST request with C# and HttpClient


So I have this website.

After inspecting the network traffic for the download button, I got the curl POST request below:

curl "https://flood-map-for-planning.service.gov.uk/pdf" -X POST -H "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0" -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8" -H "Accept-Language: en-US,en;q=0.5" -H "Accept-Encoding: gzip, deflate, br" -H "Content-Type: application/x-www-form-urlencoded" -H "Origin: https://flood-map-for-planning.service.gov.uk" -H "Connection: keep-alive" -H "Referer: https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR" -H "Upgrade-Insecure-Requests: 1" -H "Sec-Fetch-Dest: document" -H "Sec-Fetch-Mode: navigate" -H "Sec-Fetch-Site: same-origin" -H "Sec-Fetch-User: ?1" -H "TE: trailers" --data-raw "id=1660136366038&polygon=&center="%"5B429240"%"2C431613"%"5D&reference=&scale=2500"

I went over to this website to convert the curl command to C#.

This is what I got:

using (var httpClient = new HttpClient())
{
    using (var request = new HttpRequestMessage(new HttpMethod("POST"), "https://flood-map-for-planning.service.gov.uk/pdf"))
    {
        request.Headers.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0");
        request.Headers.TryAddWithoutValidation("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8");
        request.Headers.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.5");
        request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate, br");
        request.Headers.TryAddWithoutValidation("Origin", "https://flood-map-for-planning.service.gov.uk");
        request.Headers.TryAddWithoutValidation("Connection", "keep-alive");
        request.Headers.TryAddWithoutValidation("Referer", "https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR");
        request.Headers.TryAddWithoutValidation("Upgrade-Insecure-Requests", "1");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-Dest", "document");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-Mode", "navigate");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-Site", "same-origin");
        request.Headers.TryAddWithoutValidation("Sec-Fetch-User", "?1");
        request.Headers.TryAddWithoutValidation("TE", "trailers"); 

        request.Content = new StringContent("id=1660136366038&polygon=&center=");
        request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/x-www-form-urlencoded"); 

        var response = await httpClient.SendAsync(request);
    }
}

I changed it to:

var httpClient = new HttpClient();
var request =
       new HttpRequestMessage(new HttpMethod("POST"), "https://flood-map-for-planning.service.gov.uk/pdf");
request.Headers.TryAddWithoutValidation("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:104.0) Gecko/20100101 Firefox/104.0");
request.Headers.TryAddWithoutValidation("Accept", "application/pdf"); 
request.Headers.TryAddWithoutValidation("Accept-Language", "en-US,en;q=0.5");
request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate, br");
request.Headers.TryAddWithoutValidation("Origin", "https://flood-map-for-planning.service.gov.uk");
request.Headers.TryAddWithoutValidation("Connection", "keep-alive");
request.Headers.TryAddWithoutValidation("Referer", "https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR");
request.Headers.TryAddWithoutValidation("Upgrade-Insecure-Requests", "1");
request.Headers.TryAddWithoutValidation("Sec-Fetch-Dest", "document");
request.Headers.TryAddWithoutValidation("Sec-Fetch-Mode", "navigate");
request.Headers.TryAddWithoutValidation("Sec-Fetch-Site", "same-origin");
request.Headers.TryAddWithoutValidation("Sec-Fetch-User", "?1");
request.Headers.TryAddWithoutValidation("TE", "trailers");

request.Content = new StringContent("center=&scale=2500");

var response =  httpClient.Send(request);
response.Content.Headers.Add("Content-Disposition", "inline;filename=\"Testpdf.pdf\"");
response.Content.Headers.Add("Content-Name", "Testpdf.PDF");
response.Content.Headers.Add("Content-Type", "application/pdf;charset=UTF-8");

if (response.IsSuccessStatusCode)
{

    using (FileStream fs = new FileStream("somepdf.pdf", FileMode.CreateNew))
    {
        using (StreamWriter writer = new StreamWriter(fs))
        {
            var contentStream =  response.Content.ReadAsStream(); // get the actual content stream
            writer.Write(contentStream);
        }
    }
}

This is where the issue is.

My goal is to download the PDF locally.

I usually get a file that is only 1 KB or 6 KB.

The curl command with an output parameter works without any issue. I'm just not sure what the C# HTTP POST request above is missing.

As you can see, I've added the FileStream and StreamWriter usages.

I've also tried to play with the response in order to turn it into an application/pdf response.

Any ideas what I am doing wrong?

=======================================================

EDIT

Thanks to @thehennyy,

here is the working solution:

var unixTimestamp = (long)DateTime.UtcNow.Subtract(DateTime.UnixEpoch).TotalSeconds;

HttpClientHandler handler = new HttpClientHandler()
{
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};

using (var httpClient = new HttpClient(handler))
{
    using (var request =
           new HttpRequestMessage(new HttpMethod("POST"), "https://flood-map-for-planning.service.gov.uk/pdf"))
    {
        request.Headers.TryAddWithoutValidation("Referer",
            "https://flood-map-for-planning.service.gov.uk/flood-zone-results?easting=429240&northing=431613&location=LS118TR");

        request.Content =
            new StringContent($"id={unixTimestamp}&polygon=&center=[429240,431613]&reference=&scale=2500");
        request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/x-www-form-urlencoded");

        var response = await httpClient.SendAsync(request);

        if (response.IsSuccessStatusCode)
        {
            using (FileStream fs = new FileStream("somepdf.pdf", FileMode.Create))
            {
                var contentStream = await response.Content.ReadAsStreamAsync();
                await contentStream.CopyToAsync(fs);
            }
        }
    }
}

Solution

  • There are a few things to consider here:

    It seems like the curl-to-HttpClient converter had a problem converting the POST content: the converted body stops at center=, right where the "%" escapes begin in the curl command. The following works for me:

    request.Content = new StringContent("id=1&polygon=&center=[429240,431613]&reference=&scale=2500");
    request.Content.Headers.ContentType = MediaTypeHeaderValue.Parse("application/x-www-form-urlencoded");
    

    The parameter id has to be provided, otherwise the request will fail. The website uses the current unix timestamp as value for the id parameter.
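
    As an alternative to hand-writing the body string, the form fields can be handed to FormUrlEncodedContent, which escapes the values and sets the Content-Type header itself. This is only a sketch: the field names and values are taken from the curl command in the question, and using milliseconds for id is an assumption based on the 13-digit value captured there.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Net.Http;

    // Assumption: the captured id (1660136366038) has 13 digits, which looks like
    // Unix time in milliseconds. The answer's id=1 worked too, so the exact
    // value does not seem to matter.
    long id = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();

    // A list keeps the fields in the same order as the captured request.
    var fields = new List<KeyValuePair<string, string>>
    {
        new("id", id.ToString()),
        new("polygon", ""),
        new("center", "[429240,431613]"),
        new("reference", ""),
        new("scale", "2500"),
    };

    // FormUrlEncodedContent percent-encodes each value and sets
    // Content-Type: application/x-www-form-urlencoded on its own.
    using var content = new FormUrlEncodedContent(fields);

    string body = await content.ReadAsStringAsync();
    Console.WriteLine(body);
    // The center field comes out as center=%5B429240%2C431613%5D,
    // the same encoding the browser sent.
    ```

    With this, request.Content = content; replaces both the StringContent line and the manual ContentType assignment.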


    Adding headers to the response with response.Content.Headers.Add([...]) is not meaningful: it does not change what the server already sent. Just delete these lines.


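    Writing the content to disk can be done more simply. In the question's code, writer.Write(contentStream) resolves to the Write(object) overload, which writes the stream's ToString() text rather than its bytes. Copy the stream instead: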

    using (FileStream fs = new FileStream("somepdf.pdf", FileMode.Create))
    {
        var contentStream = await response.Content.ReadAsStreamAsync();
        await contentStream.CopyToAsync(fs);
    }
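
    Since the question mentions ending up with 1 KB or 6 KB files, it may help to check that the response body really is a PDF before saving it. A minimal sketch, assuming the whole body is read into memory first; LooksLikePdf is a hypothetical helper name, and the check relies only on the fact that PDF files start with the ASCII marker "%PDF-".

    ```csharp
    using System;
    using System.Linq;
    using System.Text;

    // Hypothetical helper: true when the buffer starts with the PDF marker "%PDF-".
    static bool LooksLikePdf(byte[] data)
    {
        byte[] magic = Encoding.ASCII.GetBytes("%PDF-");
        return data.Length >= magic.Length
            && data.Take(magic.Length).SequenceEqual(magic);
    }

    // Stand-in buffers for illustration; in the real program the bytes would
    // come from: await response.Content.ReadAsByteArrayAsync()
    byte[] pdfStart = Encoding.ASCII.GetBytes("%PDF-1.7\n");
    byte[] htmlPage = Encoding.ASCII.GetBytes("<!DOCTYPE html><html></html>");

    Console.WriteLine(LooksLikePdf(pdfStart)); // True
    Console.WriteLine(LooksLikePdf(htmlPage)); // False
    ```

    If the check fails, printing the body as text usually shows what the server answered instead.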
    

    While testing I got the same "wrong" files. These are usually just HTML responses, sometimes containing an error message, so view them as HTML. If they look like gibberish, the body is probably still compressed and you have to turn on automatic decompression:

    HttpClientHandler handler = new HttpClientHandler()
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    };
    
    var httpClient = new HttpClient(handler);
    

    The automatic decompression values should match the value of this header:

    request.Headers.TryAddWithoutValidation("Accept-Encoding", "gzip, deflate");
    

    Current versions of .NET also support "br" (DecompressionMethods.Brotli). Using automatic decompression is helpful in nearly every case.
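
    For completeness, a sketch of the handler with Brotli enabled as well, so that it lines up with the "gzip, deflate, br" header from the original curl command. Setting the header on DefaultRequestHeaders is just one way to do it and is a choice made for this example.

    ```csharp
    using System;
    using System.Net;
    using System.Net.Http;

    // Enable every encoding that will be advertised in Accept-Encoding.
    var handler = new HttpClientHandler
    {
        AutomaticDecompression = DecompressionMethods.GZip
                               | DecompressionMethods.Deflate
                               | DecompressionMethods.Brotli
    };

    using var httpClient = new HttpClient(handler);

    // Keep the advertised encodings in sync with the flags above.
    httpClient.DefaultRequestHeaders.TryAddWithoutValidation(
        "Accept-Encoding", "gzip, deflate, br");

    Console.WriteLine(handler.AutomaticDecompression);
    ```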