We let the user write text in a rich-text editor library (Android-RTEditor), which converts it to HTML.
The output HTML is saved as-is on both the server and the device.
Because in some edge cases we need to show a lot of this content at once (multiple instances), we also want to save a "preview" version of it, meaning a much shorter one (say, 120 visible characters, not counting the characters of the HTML tags).
What we want is a minimized version of the HTML. Some tags can optionally be removed, but lists (numbered or bulleted) must survive whatever we choose, because the user sees them as text (the bullet is a character, and so are the numbers with their dots).
The line-break tag should also be handled, since moving to the next line matters.
Unlike a normal string, where I can just call substring with the required number of characters, doing that on HTML might break the tags.
I've thought of 2 possible solutions for this:
1. Convert to plain text (while handling some tags), and then truncate: Parse the HTML, replacing some tags with Unicode alternatives and removing the others. For example, instead of a bullet list, put a bullet character before each item (maybe "•"), and likewise for a numbered list (put the numbers themselves). All the other tags would be removed, except the line-break tag ("<br/>"), which should be replaced with "\n". After that, I could safely truncate the plain text, because there are no tags left to break.
2. Truncate nicely inside the HTML: Parse the HTML, identify the text within it, truncate at the right position, and close all the tags that are still open at that point. This might be even harder.
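To make the first option concrete, here is a minimal sketch of it. htmlToPreviewText is a hypothetical helper (it is not part of Android-RTEditor), and it assumes simple, well-formed input where every tag is closed XHTML-style, such as "<br/>". It does not decode entities such as &amp;, and the bullet, number and newline characters it inserts count toward the limit.

```kotlin
// Hypothetical helper: flattens simple HTML into preview text, then truncates it.
fun htmlToPreviewText(html: String, maxChars: Int): String {
    val token = Regex("<[^>]+>|[^<]+")   // either one tag or one run of text
    val counters = ArrayDeque<Int>()      // one entry per open list; -1 marks a <ul>
    val out = StringBuilder()
    for (match in token.findAll(html)) {
        val part = match.value
        if (!part.startsWith("<")) {
            out.append(part)              // plain text is copied as-is
            continue
        }
        val name = part.trim('<', '>', '/', ' ').substringBefore(' ').lowercase()
        val closing = part.startsWith("</")
        if (name == "br") {
            out.append('\n')
        } else if (name == "ul" || name == "ol") {
            if (closing) counters.removeLastOrNull()
            else counters.addLast(if (name == "ol") 0 else -1)
        } else if (name == "li" && closing) {
            out.append('\n')              // every list item ends its own line
        } else if (name == "li") {
            val number = counters.removeLastOrNull()
            if (number == null || number < 0) out.append("\u2022 ")
            else out.append("${number + 1}. ")
            // put the (possibly advanced) counter back for the next item
            if (number != null) counters.addLast(if (number < 0) number else number + 1)
        }
        // every other tag is simply dropped
    }
    return out.toString().take(maxChars)
}
```

Once the text is flat, take(maxChars) is safe because no tags are left to cut through. The price is that all other formatting (bold, links, and so on) is lost in the preview.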
I'm not sure which is easier, but I can think of possible disadvantages for each. It is just a preview though, so I don't think it matters much.
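To get a feel for the trade-off, here is a minimal sketch of truncating inside the markup itself: count only the visible text, stop when the budget is used up, and close whatever tags are still open. truncateHtml is a hypothetical helper, and it assumes well-formed input in which empty elements are written XHTML-style ("<br/>", not "<br>"). A real parser would be needed for anything less tidy.

```kotlin
// Hypothetical helper: keeps at most maxTextChars visible characters
// and closes every tag that is still open at the cut-off point.
fun truncateHtml(html: String, maxTextChars: Int): String {
    val token = Regex("<[^>]+>|[^<]+")   // either one tag or one run of text
    val openTags = ArrayDeque<String>()   // names of tags opened but not yet closed
    val out = StringBuilder()
    var remaining = maxTextChars
    for (match in token.findAll(html)) {
        if (remaining <= 0) break         // text budget used up
        val part = match.value
        if (!part.startsWith("<")) {
            val kept = part.take(remaining)
            out.append(kept)
            remaining -= kept.length
            continue
        }
        out.append(part)                  // tags are copied and never counted
        if (part.startsWith("</")) {
            openTags.removeLastOrNull()
        } else if (!part.endsWith("/>")) {
            openTags.addLast(part.trim('<', '>', ' ').substringBefore(' '))
        }
    }
    // close the still-open tags, innermost first
    while (openTags.isNotEmpty()) {
        out.append("</").append(openTags.removeLast()).append(">")
    }
    return out.toString()
}
```

For example, truncateHtml("Aaa<br/><b>Bbb<br/></b>Ccc", 5) gives "Aaa<br/><b>Bb</b>": the formatting survives, but the stored preview is longer than 5 characters because the tags come on top of the text.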
I've searched the Internet for such solutions, to see if others have already done this. I've found some links that talk about "cleaning" or "optimizing" HTML, but as far as I can see they don't handle replacing tags or truncating. On top of that, since this is about HTML, most of them are unrelated to Android and use PHP, C#, Angular and other languages.
Here are some links that I've found:
Are the solutions I've described feasible? If so, is there a known way to implement them, or even a Java/Kotlin/Android library? How hard would it be to build such a solution?
Maybe there is another solution I haven't thought of?
EDIT: I've also tried using some old code of mine (here), which parses XML. Maybe it will work. I'm also now investigating third-party libraries for parsing HTML, such as Jsoup. I think it can help with the truncating, while supporting "faulty" HTML inputs.
OK, I think I got it, using my old code for converting an XML string into an object. It would still be great to see more robust solutions, but I think what I have is good enough, at least for now.
The code below uses it (original XmlTag class available here):
XmlTagTruncationHelper.kt
object XmlTagTruncationHelper {
    /**
     * Limits for a single truncation pass. The counters are mutated while truncating,
     * so create a new instance for every call.
     * @param maxTextCharacters max text characters to permit. A negative value means no restriction.
     * @param maxLines max lines to permit. A negative value means no restriction.
     */
    class Restriction(val maxTextCharacters: Int, val maxLines: Int) {
        var currentTextCharactersCount: Int = 0
        var currentLinesCount: Int = 0
    }

    @JvmStatic
    fun truncateXmlTag(xmlTag: XmlTag, restriction: Restriction): String {
        if (restriction.maxLines == 0 || (restriction.maxTextCharacters >= 0 && restriction.currentTextCharactersCount >= restriction.maxTextCharacters))
            return ""
        val sb = StringBuilder()
        sb.append("<").append(xmlTag.tagName)
        // attributes are written back as-is and never count toward the limits
        for ((key, value) in xmlTag.tagAttributes.orEmpty())
            sb.append(" ").append(key).append("=\"").append(value).append("\"")
        val innerContent = xmlTag.innerTagsAndContent
        if (innerContent.isNullOrEmpty())
            sb.append("/>")
        else {
            sb.append(">")
            for (innerItem in innerContent) {
                // stop as soon as the text budget is used up
                if (restriction.maxTextCharacters >= 0 && restriction.currentTextCharactersCount >= restriction.maxTextCharacters)
                    break
                if (innerItem is XmlTag) {
                    // Log.d("AppLog", "xmlTag:" + innerItem.tagName + " " + innerItem.innerTagsAndContent?.size)
                    // <br> and <li> each count as one line
                    if (restriction.maxLines >= 0 && (innerItem.tagName == "br" || innerItem.tagName == "li")) {
                        ++restriction.currentLinesCount
                        if (restriction.currentLinesCount >= restriction.maxLines)
                            break
                    }
                    sb.append(truncateXmlTag(innerItem, restriction))
                } else if (innerItem is String) {
                    if (restriction.maxTextCharacters < 0)
                        sb.append(innerItem)
                    else {
                        // append only as many characters as the budget still allows
                        val extraCharactersAllowedToAdd = restriction.maxTextCharacters - restriction.currentTextCharactersCount
                        val strToAdd = innerItem.take(extraCharactersAllowedToAdd)
                        sb.append(strToAdd)
                        restriction.currentTextCharactersCount += strToAdd.length
                    }
                }
            }
            // the tag is always closed, even when its content was cut short
            sb.append("</").append(xmlTag.tagName).append(">")
        }
        return sb.toString()
    }
}
XmlTag.kt
import android.content.Context
import org.xmlpull.v1.XmlPullParser
import org.xmlpull.v1.XmlPullParserFactory
import java.io.StringReader
import java.util.Stack

//based on https://stackoverflow.com/a/19115036/878126
/**
 * an xml tag, includes its name, value and attributes
 * @param tagName the name of the xml tag. for example: <a>b</a> . the name of the tag is "a"
 */
class XmlTag(val tagName: String) {
    /** a hashmap of all of the tag attributes. example: <a c="d" e="f">b</a> . attributes: {{"c"="d"},{"e"="f"}} */
    @JvmField
    var tagAttributes: HashMap<String, String>? = null

    /** list of inner text and xml tags */
    @JvmField
    var innerTagsAndContent: ArrayList<Any>? = null

    companion object {
        @JvmStatic
        fun getXmlFromString(input: String): XmlTag? {
            val factory = XmlPullParserFactory.newInstance()
            factory.isNamespaceAware = true
            val xpp = factory.newPullParser()
            xpp.setInput(StringReader(input))
            return getXmlRootTagOfXmlPullParser(xpp)
        }

        @JvmStatic
        fun getXmlRootTagOfXmlPullParser(xmlParser: XmlPullParser): XmlTag? {
            var currentTag: XmlTag? = null
            var rootTag: XmlTag? = null
            // tags that were opened but not closed yet, innermost on top
            val tagsStack = Stack<XmlTag>()
            xmlParser.next()
            var eventType = xmlParser.eventType
            var doneParsing = false
            while (eventType != XmlPullParser.END_DOCUMENT && !doneParsing) {
                when (eventType) {
                    XmlPullParser.START_DOCUMENT -> {
                    }
                    XmlPullParser.START_TAG -> {
                        val xmlTagName = xmlParser.name
                        currentTag = XmlTag(xmlTagName)
                        if (tagsStack.isEmpty())
                            rootTag = currentTag
                        tagsStack.push(currentTag)
                        val numberOfAttributes = xmlParser.attributeCount
                        if (numberOfAttributes > 0) {
                            val attributes = HashMap<String, String>(numberOfAttributes)
                            for (i in 0 until numberOfAttributes) {
                                val attrName = xmlParser.getAttributeName(i)
                                val attrValue = xmlParser.getAttributeValue(i)
                                attributes[attrName] = attrValue
                            }
                            currentTag.tagAttributes = attributes
                        }
                    }
                    XmlPullParser.END_TAG -> {
                        // attach the finished tag to its parent, or stop once the root is closed
                        currentTag = tagsStack.pop()
                        if (!tagsStack.isEmpty()) {
                            val parentTag = tagsStack.peek()
                            parentTag.addInnerXmlTag(currentTag)
                            currentTag = parentTag
                        } else
                            doneParsing = true
                    }
                    XmlPullParser.TEXT -> {
                        val innerText = xmlParser.text
                        currentTag?.addInnerText(innerText)
                    }
                }
                eventType = xmlParser.next()
            }
            return rootTag
        }

        /** returns the root xml tag of the given xml resourceId, or null if not succeeded. */
        fun getXmlRootTagOfXmlFileResourceId(context: Context, xmlFileResourceId: Int): XmlTag? {
            val res = context.resources
            val xmlParser = res.getXml(xmlFileResourceId)
            return getXmlRootTagOfXmlPullParser(xmlParser)
        }
    }

    private fun addInnerXmlTag(tag: XmlTag) {
        if (innerTagsAndContent == null)
            innerTagsAndContent = ArrayList()
        innerTagsAndContent!!.add(tag)
    }

    private fun addInnerText(str: String) {
        if (innerTagsAndContent == null)
            innerTagsAndContent = ArrayList()
        innerTagsAndContent!!.add(str)
    }

    /** formats the xmlTag back to its string format, including its inner tags */
    override fun toString(): String {
        val sb = StringBuilder()
        sb.append("<").append(tagName)
        for ((key, value) in tagAttributes.orEmpty())
            sb.append(" ").append(key).append("=\"").append(value).append("\"")
        val innerContent = innerTagsAndContent
        if (innerContent.isNullOrEmpty())
            sb.append("/>")
        else {
            sb.append(">")
            for (innerItem in innerContent)
                sb.append(innerItem.toString())
            sb.append("</").append(tagName).append(">")
        }
        return sb.toString()
    }
}
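One thing to watch with the code above: as far as I know, XmlPullParser's next() returns text with the predefined entities already decoded (for example, &amp; arrives as &), while truncateXmlTag and toString write text and attribute values back unescaped. If the content can contain such characters, something like the following could be applied at the points where text and attribute values are appended. escapeXml is a hypothetical helper of mine, not part of XmlTag or Android-RTEditor.

```kotlin
// Hypothetical helper: escapes the characters that must not appear raw
// in XML text or in double-quoted attribute values.
fun escapeXml(text: String): String {
    val sb = StringBuilder(text.length)
    for (c in text) {
        when (c) {
            '&' -> sb.append("&amp;")
            '<' -> sb.append("&lt;")
            '>' -> sb.append("&gt;")
            '"' -> sb.append("&quot;")
            else -> sb.append(c)
        }
    }
    return sb.toString()
}
```

In truncateXmlTag, the character count should still be based on the unescaped string (what the user sees), with escapeXml applied only when appending to the StringBuilder.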
Sample usage:
build.gradle
compileOptions {
    sourceCompatibility JavaVersion.VERSION_1_8
    targetCompatibility JavaVersion.VERSION_1_8
}
...
dependencies {
    implementation 'com.1gravity:android-rteditor:1.6.7'
    ...
}
...
MainActivity.kt
class MainActivity : AppCompatActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        setContentView(R.layout.activity_main)
        // val inputXmlString = "<zz>Zhshs<br/>ABC</zz>"
        val inputXmlString = "Aaa<br/><b>Bbb<br/></b>Ccc<br/><ul><li>Ddd</li><li>eee</li></ul>fff<br/><ol><li>ggg</li><li>hhh</li></ol>"
        // XML must have a root tag, so wrap the input if needed.
        // Naive check: input that starts with a tag is assumed to already have a single root.
        val xmlString = if (!inputXmlString.startsWith("<"))
            "<html>$inputXmlString</html>" else inputXmlString
        val rtApi = RTApi(this, RTProxyImpl(this), RTMediaFactoryImpl(this, true))
        val mRTManager = RTManager(rtApi, savedInstanceState)
        mRTManager.registerEditor(beforeTruncationTextView, true)
        mRTManager.registerEditor(afterTruncationTextView, true)
        beforeTruncationTextView.setRichTextEditing(true, inputXmlString)
        val xmlTag = XmlTag.getXmlFromString(xmlString)
        Log.d("AppLog", "xml parsed: $xmlTag")
        val maxTextCharacters = 10
        val maxLines = 20
        val output = XmlTagTruncationHelper.truncateXmlTag(xmlTag!!, XmlTagTruncationHelper.Restriction(maxTextCharacters, maxLines))
        afterTruncationTextView.setRichTextEditing(true, output)
        Log.d("AppLog", "xml with truncation : maxTextCharacters: $maxTextCharacters , maxLines: $maxLines output: $output")
    }
}
activity_main.xml
<LinearLayout
    xmlns:android="http://schemas.android.com/apk/res/android" xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools" android:layout_width="match_parent"
    android:layout_height="match_parent" android:gravity="center" android:orientation="vertical"
    tools:context=".MainActivity">

    <com.onegravity.rteditor.RTEditText
        android:id="@+id/beforeTruncationTextView" android:layout_width="match_parent"
        android:layout_height="wrap_content" android:background="#11ff0000" tools:text="beforeTruncationTextView"/>

    <com.onegravity.rteditor.RTEditText
        android:id="@+id/afterTruncationTextView" android:layout_width="match_parent"
        android:layout_height="wrap_content" android:background="#1100ff00" tools:text="afterTruncationTextView"/>
</LinearLayout>
And the result: