
Convert HTML to plain text in VBA


I have an Excel sheet with cells containing HTML. How can I batch-convert them to plain text? At the moment the cells are cluttered with useless tags and styles. I want to rewrite the content from scratch, but that will be far easier if I can get the plain text out first.

I can write a script to convert HTML to plain text in PHP, so if you can't think of a solution in VBA, maybe you can suggest how I might pass the cell data to a website and retrieve the result.


Solution

  • Set a reference to "Microsoft HTML Object Library" (in the VBA editor: Tools > References).

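    ' Returns the visible text of an HTML fragment.
    ' Requires a reference to the Microsoft HTML Object Library (early binding).
    Function HtmlToText(ByVal sHTML As String) As String
      Dim oDoc As HTMLDocument
      Set oDoc = New HTMLDocument
      ' Let MSHTML parse the markup, then read back the text content
      oDoc.body.innerHTML = sHTML
      HtmlToText = oDoc.body.innerText
    End Function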
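
  • Since the question asks about converting cells in batch, here is a minimal sketch of one way to apply the function to a range. The macro name ConvertSelectionToPlainText is hypothetical and not part of the original answer. The sketch assumes HtmlToText is in a standard module of the same project and that you select the cells before running it.

```vba
' Hypothetical helper (not from the original answer): overwrites each
' selected cell with the plain text of its HTML content.
' Assumes HtmlToText is defined in a standard module of this project.
Sub ConvertSelectionToPlainText()
    Dim oCell As Range

    ' Only proceed when cells (not a chart or shape) are selected
    If TypeName(Selection) <> "Range" Then Exit Sub

    Application.ScreenUpdating = False
    For Each oCell In Selection.Cells
        ' Leave formulas and error values untouched
        If Not oCell.HasFormula And Not IsError(oCell.Value) Then
            ' Skip empty cells
            If Len(oCell.Value) > 0 Then
                oCell.Value = HtmlToText(CStr(oCell.Value))
            End If
        End If
    Next oCell
    Application.ScreenUpdating = True
End Sub
```

    The macro overwrites cells in place, so it is probably safest to run it on a copy of the sheet. If you would rather keep the original HTML, the function should also be callable from a worksheet formula such as =HtmlToText(A1) in an adjacent column.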
    

    Tim