
How do I convert Windows-1251 (CP1251) encoding to UTF-8 in Kotlin?


I am making an HTTP request to a page that contains Cyrillic characters. How can I convert the response from CP1251 to UTF-8?

Here is my code.

package bash

import com.github.kittinunf.fuel.httpGet
import com.github.kittinunf.result.Result

fun main(args: Array<String>) {
    val bashImHost = "http://bash.im/"
    bashImHost.httpGet().responseString { request, response, result ->
        when (result) {
            is Result.Failure -> {
                println("Some kind of error!")
            }
            is Result.Success -> {
                val htmlBody = result.value
                val parsePattern = "<div class=\"text\">(.+)</div>"
                val parseRegex = Regex(parsePattern)
                val results = parseRegex.findAll(htmlBody)
                results.iterator().forEach { resultItem -> println(resultItem.groups[1]?.value) }
            }
        }
    }
}

I am using the Fuel HTTP library.


Solution

  • Use the responseString overload that accepts a Charset, so that Fuel decodes the response body using Charset.forName("Windows-1251"):

    // Requires: import java.nio.charset.Charset
    bashImHost.httpGet().responseString(Charset.forName("Windows-1251")) { request, response, result ->
        // result.value is now a correctly decoded String
        /* ... */
    }
    

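    It seems you cannot fix the encoding after the fact: once the Windows-1251 response has been decoded to a String using the wrong charset (UTF-8), the bytes that are not valid UTF-8 have already been replaced, so the original text cannot be recovered from that String. See this Q&A.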
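
To illustrate why the charset has to be chosen at decode time, here is a minimal sketch that uses only the Kotlin/JVM standard library (no Fuel). The helper name decodeCp1251 and the sample text are assumptions made for this illustration, and it assumes the JVM in use ships the windows-1251 charset, which standard JDK builds normally do.

```kotlin
import java.nio.charset.Charset

// Hypothetical helper: decodes bytes known to be Windows-1251 into a String.
fun decodeCp1251(bytes: ByteArray): String =
    String(bytes, Charset.forName("windows-1251"))

fun main() {
    val cp1251 = Charset.forName("windows-1251")
    val original = "Привет"

    // Simulate the raw bytes a CP1251 server would send.
    val wireBytes = original.toByteArray(cp1251)

    // Decoding with the right charset restores the text.
    val decoded = decodeCp1251(wireBytes)
    println(decoded == original)

    // Decoding the same bytes as UTF-8 yields U+FFFD replacement characters.
    val wrong = String(wireBytes, Charsets.UTF_8)
    println(wrong.contains('\uFFFD'))

    // Re-encoding the damaged String does not give the original bytes back.
    println(wrong.toByteArray(cp1251).contentEquals(wireBytes))

    // Once decoded correctly, producing UTF-8 bytes is a plain re-encode.
    val utf8Bytes = decoded.toByteArray(Charsets.UTF_8)
    println(utf8Bytes.size)
}
```

Note that a Kotlin String itself has no "CP1251" or "UTF-8" form; the charset only matters when converting between bytes and a String, which is why passing the Charset to responseString is enough.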