R readLines() font issue


I am using the following code to get text from a website:

readLines("http://www.mijnwoordenboek.nl/duits/synoniemen/abartig")[181]

It reads the first synonym on that page as "bÃ¶se", but it should be "böse". How can I resolve this issue? Thanks in advance.


Solution

  • Try this:

    readLines("http://www.mijnwoordenboek.nl/duits/synoniemen/abartig", encoding="UTF-8")[181]
    

    In the HTML of the webpage, there is a line stating that the charset is "UTF-8":

    <meta charset="UTF-8">
    

    readLines does not pick up this declaration on its own, so you have to specify it manually through the encoding argument.
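
To see why the encoding matters without fetching the page, here is a minimal offline sketch. It assumes the server sends "böse" as UTF-8 bytes; the variable names (utf8_bytes, as_utf8, as_latin1) are made up for illustration and are not part of the original question.

```r
# Bytes of "böse" as a UTF-8 server would send them: b, 0xC3 0xB6 (ö), s, e
utf8_bytes <- as.raw(c(0x62, 0xc3, 0xb6, 0x73, 0x65))

# Correct interpretation: mark the bytes as UTF-8
as_utf8 <- rawToChar(utf8_bytes)
Encoding(as_utf8) <- "UTF-8"

# Wrong interpretation: mark the same bytes as Latin-1
as_latin1 <- rawToChar(utf8_bytes)
Encoding(as_latin1) <- "latin1"

nchar(as_utf8)    # 4 characters: b, ö, s, e
nchar(as_latin1)  # 5 characters: the two bytes of ö are read as two letters
```

The bytes are identical in both cases; only the declared encoding differs. Read as Latin-1, the two bytes of "ö" turn into "Ã" and "¶", which is most likely what produced "bÃ¶se" in the question.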