Ideal method to truncate a string with ellipsis


I'm sure all of us have seen ellipses on Facebook statuses (or elsewhere), clicked "Show more", and found only another two characters or so. I'd guess this comes from lazy programming, because surely there is an ideal method.

Mine counts slim characters [iIl1] as "half characters", but that doesn't stop an ellipsis from looking silly when it hides barely any characters.

Is there an ideal method? Here is mine:

/**
 * Return a string with a maximum length of <code>length</code> characters.
 * If there are more than <code>length</code> characters, then the string ends with an ellipsis ("...").
 *
 * @param text   the text to shorten
 * @param length the maximum length; slim characters only count as half a character
 * @return the text, truncated and suffixed with "..." if it was too long
 */
public static String ellipsis(final String text, int length)
{
    // The letters [iIl1] are slim enough to only count as half a character.
    length += Math.ceil(text.replaceAll("[^iIl1]", "").length() / 2.0d);

    if (text.length() > length)
    {
        return text.substring(0, length - 3) + "...";
    }

    return text;
}

Language doesn't really matter, but tagged as Java because that's what I'm mostly interested in seeing.


Solution

  • I like the idea of letting "thin" characters count as half a character. Simple and a good approximation.

    The main issue with most ellipsizing, however, is (IMHO) that it chops off words in the middle. Here is a solution that takes word boundaries into account (but does not dive into pixel math and the Swing API).

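    // Matches every character that is NOT considered thin.
    private static final String NON_THIN = "[^iIl1\\.,']";

    // Approximate width: every two thin characters count as one character.
    private static int textWidth(String str) {
        int thin = str.replaceAll(NON_THIN, "").length();
        return str.length() - thin / 2;
    }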
    
    public static String ellipsize(String text, int max) {
    
        if (textWidth(text) <= max)
            return text;
    
        // Start by chopping off at the word before max
        // This is an over-approximation due to thin-characters...
        int end = text.lastIndexOf(' ', max - 3);
    
        // Just one long word. Chop it off.
        if (end == -1)
            return text.substring(0, max-3) + "...";
    
        // Step forward as long as textWidth allows.
        int newEnd = end;
        do {
            end = newEnd;
            newEnd = text.indexOf(' ', end + 1);
    
            // No more spaces.
            if (newEnd == -1)
                newEnd = text.length();
    
        } while (textWidth(text.substring(0, newEnd) + "...") < max);
    
        return text.substring(0, end) + "...";
    }
    

    A test of the algorithm looks like this:

    [Image not available: output of the test run]
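A test along those lines could be sketched as follows. This is only a rough harness, not the original test: the class name `EllipsizeDemo` and the sample sentence are made up for illustration, the two methods are copied from the answer so the block compiles on its own, and `max` is assumed to be at least 3 (smaller values would make `max - 3` negative).

```java
final class EllipsizeDemo {

    // Matches every character that is NOT treated as thin (same pattern as in the answer).
    private static final String NON_THIN = "[^iIl1\\.,']";

    // Approximate width: every two thin characters count as one character.
    static int textWidth(String str) {
        return str.length() - str.replaceAll(NON_THIN, "").length() / 2;
    }

    // Copy of ellipsize from the answer, same behavior. Assumes max >= 3.
    static String ellipsize(String text, int max) {
        if (textWidth(text) <= max)
            return text;

        // Last space at or before index max - 3.
        int end = text.lastIndexOf(' ', max - 3);

        // Just one long word. Chop it off.
        if (end == -1)
            return text.substring(0, max - 3) + "...";

        // Step forward one word at a time as long as textWidth allows.
        int newEnd = end;
        do {
            end = newEnd;
            newEnd = text.indexOf(' ', end + 1);
            if (newEnd == -1)
                newEnd = text.length();
        } while (textWidth(text.substring(0, newEnd) + "...") < max);

        return text.substring(0, end) + "...";
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence, chosen only for illustration.
        String sample = "The quick brown fox jumps over the lazy dog";

        // Print the result for a range of limits so the cut points can be compared.
        for (int max = 5; max <= 45; max += 5) {
            String result = ellipsize(sample, max);
            System.out.printf("max=%2d width=%2d |%s|%n", max, textWidth(result), result);
        }
    }
}
```

Running `main` prints one line per limit, which makes it easy to see that cuts land on word boundaries whenever a space is available and fall back to a hard cut for a single long word.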