I am building an Android app that has a WebView. The WebView displays an HTML document returned from a server.
Depending on a search string, I have to highlight parts of the HTML document. If the search string is 'hello world', then I have to mark text that matches the regex (hello)|(world*).
This is what I tried:
I get the HTML document from the server and search the text with a regex using Pattern and Matcher. I replace the matched words with markup that makes them look highlighted. This works great when there are no HTML tags, but it breaks when the document from the web server contains HTML tags and my search string matches inside one of them.
I hope that is clear. Can anybody help?
I recommend using an HTML parser and then applying the regex only to the text nodes in the tree the parser returns. A regex that excludes the tags would be very complex, especially since tags have attributes whose names or values can match your regex (not to mention JavaScript snippets).
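To make the failure concrete, here is a minimal sketch of the naive approach from the question, using only java.util.regex. The class name, the sample HTML, and the `<b>` highlight tag are my own assumptions for illustration; the question does not say which markup was used.

```java
import java.util.regex.Pattern;

class NaiveHighlightDemo {
    // Wraps every match in <b>...</b> (hypothetical highlight markup),
    // without distinguishing tags from visible text.
    static String naiveHighlight(String html, String termsRegex) {
        return Pattern.compile(termsRegex).matcher(html).replaceAll("<b>$0</b>");
    }

    public static void main(String[] args) {
        String html = "<a href=\"/hello.html\">hello</a>";
        // The search term also occurs inside the href attribute,
        // so the link target gets corrupted along with the text.
        System.out.println(naiveHighlight(html, "hello|world"));
    }
}
```

The attribute value ends up containing a `<b>` element, which is exactly the kind of breakage a parser-based approach avoids.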
In the absence of an HTML parser, you could try this regex:
<[^>]++>([^<]++)<[^>]++>
and then take group 1 from the result and do a replace on it with hello|world as the regex.
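A rough sketch of that fallback, with one change that I should name plainly: instead of the exact pattern above, it uses the alternation `<[^>]*+>|([^<]++)` so that every text run is visited, including runs that sit between two adjacent tags. The class name, method name, and the `open`/`close` highlight markup are hypothetical. It assumes reasonably well-formed HTML; it does not skip script or style contents, and it will misread a `>` inside an attribute value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TextOnlyHighlighter {
    // Either a complete tag, or a run of text between tags (captured as group 1).
    private static final Pattern TOKEN = Pattern.compile("<[^>]*+>|([^<]++)");

    static String highlight(String html, String termsRegex, String open, String close) {
        Pattern terms = Pattern.compile(termsRegex);
        // Quote the markup so '$' or '\' in it is not treated as a group reference.
        String replacement = Matcher.quoteReplacement(open) + "$0" + Matcher.quoteReplacement(close);
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(html);
        int last = 0;
        while (m.find()) {
            // Copy anything the token pattern skipped, e.g. a stray '<'.
            out.append(html, last, m.start());
            String text = m.group(1);
            if (text == null) {
                out.append(m.group());  // a tag: copy it through untouched
            } else {
                out.append(terms.matcher(text).replaceAll(replacement));  // text: highlight
            }
            last = m.end();
        }
        out.append(html, last, html.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"/hello.html\">hello</a> world";
        System.out.println(highlight(html, "hello|world", "<b>", "</b>"));
    }
}
```

The result can then be handed to the WebView in place of the original document. For anything beyond simple markup, the parser route is still the safer choice.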