Tags: utf-8, elixir, httpoison

HTTPoison: response body shows garbled text instead of HTML


If I try:

url = "https://www.economist.com/news/finance-and-economics/21727073-economists-struggle-work-out-how-much-free-economy-comes-cost"    
{:ok, %HTTPoison.Response{status_code: 200, body: body}} = HTTPoison.get(url)
IO.binwrite body

I see garbled text (instead of HTML) in the console. But if I view the page source in a browser, I see HTML there. What am I doing wrong?

PS: it works fine with a JavaScript HTTP client (axios.js); I'm not sure why it doesn't work with HTTPoison.


Solution

  • That URL returns a gzip-compressed body and signals this with the header Content-Encoding: gzip. hackney, the library HTTPoison is built on, does not decompress such bodies automatically. This feature may be added at some point. Until then, you can check whether Content-Encoding is gzip and, if so, decompress the body yourself with the :zlib module:

    url = "https://www.economist.com/news/finance-and-economics/21727073-economists-struggle-work-out-how-much-free-economy-comes-cost"
    
    {:ok, %HTTPoison.Response{status_code: 200, headers: headers, body: body}} = HTTPoison.get(url)
    
    gzip? = Enum.any?(headers, fn {name, value} ->
      # Headers are case-insensitive so we compare their lower case form.
      :hackney_bstr.to_lower(name) == "content-encoding" &&
        :hackney_bstr.to_lower(value) == "gzip"
    end)
    
    body = if gzip?, do: :zlib.gunzip(body), else: body
    
    IO.write body
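
The header check and the decompression step above can be pulled into a small reusable function. The following is only a minimal sketch: the module name GzipBody and the function decode/2 are hypothetical and are not part of HTTPoison or hackney. It assumes headers arrive as a list of {name, value} binaries, as in the snippet above, and it uses String.downcase instead of :hackney_bstr.to_lower so that it runs without any dependency.

```elixir
defmodule GzipBody do
  @moduledoc "Hypothetical helper; not part of HTTPoison or hackney."

  # Returns the body unchanged unless a Content-Encoding: gzip header is present.
  def decode(headers, body) when is_list(headers) and is_binary(body) do
    if gzipped?(headers), do: :zlib.gunzip(body), else: body
  end

  # Header names and values are compared case-insensitively.
  defp gzipped?(headers) do
    Enum.any?(headers, fn {name, value} ->
      String.downcase(name) == "content-encoding" and
        String.downcase(String.trim(value)) == "gzip"
    end)
  end
end

# Round trip without touching the network: compress a string, then decode it.
html = "<html><body>hello</body></html>"
compressed = :zlib.gzip(html)

IO.puts(GzipBody.decode([{"Content-Encoding", "GZIP"}], compressed) == html)
IO.puts(GzipBody.decode([{"Content-Type", "text/html"}], html) == html)
```

With a real response, this would be called as GzipBody.decode(headers, body) on the headers and body matched out of %HTTPoison.Response{}. The sketch only handles the single value gzip; other encodings, or several encodings listed in one header, would need additional handling.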