I have a string that contains different kinds of HTML tags.
I want to remove all <a> and </a> tags.
I tried:
string.replaceAll("<a>", "");
string.replaceAll("</a>", "");
But it doesn't work. Those tags still remain in the string. Why?
Because replaceAll doesn't modify the string in place (it can't; strings are immutable). Instead, it returns a new string with the replacements applied, and your code discards that result. So:
string = string.replaceAll("<a>", "");
string = string.replaceAll("</a>", "");
Or:
string = string.replaceAll("<a>", "").replaceAll("</a>", "");
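To make the difference concrete, here is a minimal sketch that contrasts discarding the return value with assigning it. The class name ReplaceAllDemo, the helper stripAnchorTags, and the sample input are made up for illustration and are not part of the original question.

```java
public class ReplaceAllDemo {
    // Hypothetical helper: removes the literal tags <a> and </a>.
    static String stripAnchorTags(String html) {
        // Each call returns a new String; the calls are chained on the results.
        return html.replaceAll("<a>", "").replaceAll("</a>", "");
    }

    public static void main(String[] args) {
        String string = "<p>See <a>this link</a></p>";

        // The result is thrown away, so string still holds the original text.
        string.replaceAll("<a>", "");
        System.out.println(string); // prints <p>See <a>this link</a></p>

        // Assigning the result back is what makes the change visible.
        string = stripAnchorTags(string);
        System.out.println(string); // prints <p>See this link</p>
    }
}
```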
Note that replaceAll takes a string defining a regular expression as its first argument. "<a>" and "</a>" are both fine as patterns, but unless you need a regular expression, use replace(CharSequence, CharSequence) instead. If you do use replaceAll, be aware of the characters that have special meaning in regular expressions.
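As a rough illustration of why the distinction matters, the sketch below removes a "." from a string three ways. The class and method names are hypothetical; String.replace, String.replaceAll, and Pattern.quote are standard Java APIs.

```java
import java.util.regex.Pattern;

public class LiteralVsRegexDemo {
    // Treats target as plain text.
    static String removeLiteral(String input, String target) {
        return input.replace(target, "");
    }

    // Treats regex as a regular expression.
    static String removeRegex(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    // Uses replaceAll but quotes the target so it is matched literally.
    static String removeQuoted(String input, String target) {
        return input.replaceAll(Pattern.quote(target), "");
    }

    public static void main(String[] args) {
        String version = "1.5";
        System.out.println(removeLiteral(version, ".")); // prints 15
        // In a regex, "." matches any character, so everything is removed.
        System.out.println(removeRegex(version, ".").isEmpty()); // prints true
        System.out.println(removeQuoted(version, ".")); // prints 15
    }
}
```

For the tags in the question, replace("<a>", "") and replaceAll("<a>", "") behave the same, because none of the characters in "<a>" is special in a regular expression.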
In fact, you can do it with a single replaceAll call by taking advantage of the fact that its first argument is a regular expression:
string = string.replaceAll("</?a>", "");
The ? after the / makes the / optional, so the pattern matches both "<a>" and "</a>".
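A small sketch of the single-pattern version follows. The class name, helper name, and sample inputs are assumptions made for the example.

```java
public class OptionalSlashDemo {
    // Hypothetical helper: one regex covers both the opening and closing tag.
    static String stripAnchorTags(String html) {
        return html.replaceAll("</?a>", "");
    }

    public static void main(String[] args) {
        String html = "<p><a>link</a> and <b>bold</b></p>";
        // Only <a> and </a> are removed; other tags are left alone.
        System.out.println(stripAnchorTags(html)); // prints <p>link and <b>bold</b></p>
    }
}
```

This pattern only matches the bare tags exactly as written in the question. An opening tag with attributes, such as <a href="x">, does not match "</?a>" and would be left in place; only its closing </a> would be removed.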