
How do I extract inner text from HTML markup?


I have the following code:

import Text.HTML.TagSoup

parseTags "<hello>my&amp;</world>" 

which gives me the output [TagOpen "hello" [],TagText "my&",TagClose "world"]. But I only want [TagText "my&"]. I can get that with:

filter (~== "my&") $ parseTags "<hello>my&amp;</world>"

which gives [TagText "my&"]. But that only works because I hard-coded "my&"; in general I do not know what is inside the TagText. My ultimate goal is to get the text "my&" itself, which I can currently only get with:

map fromTagText $ filter (~== "my&") $ parseTags "<hello>my&amp;</world>"

I tried to use TagText directly, but could not get it to work.


Solution

  • > filter isTagText (parseTags "<hello>my&amp;</world>")
    [TagText "my&"]
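
Since the stated goal is the string itself rather than the TagText wrapper, the same filter can be combined with fromTagText, or replaced by innerText, which concatenates every text node. A minimal sketch, assuming the tagsoup package is installed; the names textChunks and allText are made up for illustration and are not part of the library:

```haskell
import Text.HTML.TagSoup (parseTags, isTagText, fromTagText, innerText)

-- | Every text node in the markup, in document order.
-- Filtering with isTagText first makes fromTagText safe to apply,
-- because fromTagText fails on any tag that is not a TagText.
textChunks :: String -> [String]
textChunks = map fromTagText . filter isTagText . parseTags

-- | All text nodes joined into a single string.
allText :: String -> String
allText = innerText . parseTags
```

For the input in the question, textChunks "<hello>my&amp;</world>" should give ["my&"] and allText should give "my&". Neither needs to know the text in advance, which is the difference from matching with ~== against a literal.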