Tags: php, whitespace, substr, words

PHP - Using substr to crop a sentence


Here is my code. I'm pulling text from a database and displaying it in a table; where the text exceeds, let's say, 450 characters, I cut it off and put this on the end:

....[view more]

The code works fine with one exception: the text in the database contains HTML, such as paragraphs and bullet lists. That poses a problem. The whole point of the limit is to keep the row from stretching further down than I want, but a line break from a bullet list or a paragraph seems to be counted as 0 or 1 characters even though it takes up the space of many characters. How can I change this code so that line breaks are accounted for?

My idea is to count the whitespace with something like this:

$white_space = substr_count($text, ' ');

This returns the total number of space characters.
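As a small illustration (the sample string is made up for this example), substr_count with ' ' counts only literal space characters. Newlines have to be counted separately, and in PHP the newline escape only works inside double quotes ("\n", not '\n'). A regular expression with \s counts every kind of whitespace at once:

```php
<?php
// Hypothetical sample text: three spaces and two newlines.
$text = "first line\nsecond line here\nthird";

// Counts only the literal space character.
$spaces = substr_count($text, ' ');

// Counts newline characters; note the double quotes around \n.
$newlines = substr_count($text, "\n");

// Counts every whitespace character (spaces, tabs, newlines, ...).
$all_whitespace = preg_match_all('/\s/', $text);

echo "$spaces spaces, $newlines newlines, $all_whitespace whitespace characters\n";
```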

I also tried this:

$white_space_str = substr_count($newstr, ' ');

But that returns 0, so I'm doing something wrong. In any case I'm a bit stuck at this point and hoping someone can help out a newbie. If the code is simple rather than trimmed and neat, it might help me understand it better :)

I'm not sure how to put that into working code, though.

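function trim_description($str, $maxlen) {
    // Short enough already: return the string unchanged
    if ( strlen($str) <= $maxlen ) return $str;

    $newstr = substr($str, 0, $maxlen);
    // If the cut landed mid-word, back up to the last space (when there is one)
    $last_space = strrpos($newstr, ' ');
    if ( substr($newstr, -1, 1) != ' ' && $last_space !== false ) $newstr = substr($newstr, 0, $last_space);

    return $newstr;
}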

Solution

  • Maybe this can help you. I found this in an answer to this question:

    function truncate($text, $length, $suffix = '&hellip;', $isHTML = true) { 
        $i = 0; 
        $simpleTags=array('br'=>true,'hr'=>true,'input'=>true,'image'=>true,'link'=>true,'meta'=>true); 
        $tags = array(); 
        if($isHTML){ 
            preg_match_all('/<[^>]+>([^<]*)/', $text, $m, PREG_OFFSET_CAPTURE | PREG_SET_ORDER); 
            foreach($m as $o){ 
                if($o[0][1] - $i >= $length) 
                    break; 
                $t = substr(strtok($o[0][0], " \t\n\r\0\x0B>"), 1); 
                // test if the tag is unpaired, then we mustn't save them 
                if($t[0] != '/' && (!isset($simpleTags[$t]))) 
                    $tags[] = $t; 
                elseif(end($tags) == substr($t, 1)) 
                    array_pop($tags); 
                $i += $o[1][1] - $o[0][1]; 
            } 
        } 
    
        // output without closing tags 
        $output = substr($text, 0, $length = min(strlen($text),  $length + $i)); 
        // closing tags 
        $output2 = (count($tags = array_reverse($tags)) ? '</' . implode('></', $tags) . '>' : ''); 
    
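        // Find last space or HTML tag (solving problem with last space in HTML tag eg. <span class="new">) 
        // end() takes its argument by reference, so each array is stored in a variable first 
        $parts = preg_split('/<.*>| /', $output, -1, PREG_SPLIT_OFFSET_CAPTURE); 
        $lastPart = end($parts); 
        $pos = (int)end($lastPart); 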
        // Append closing tags to output 
        $output.=$output2; 
    
        // Get everything until last space 
        $one = substr($output, 0, $pos); 
        // Get the rest 
        $two = substr($output, $pos, (strlen($output) - $pos)); 
        // Extract all tags from the last bit 
        preg_match_all('/<(.*?)>/s', $two, $tags); 
        // Add suffix if needed 
        if (strlen($text) > $length) { $one .= $suffix; } 
        // Re-attach tags 
        $output = $one . implode($tags[0]); 
    
        //added to remove  unnecessary closure 
        $output = str_replace('</!-->','',$output);  
    
        return $output; 
    }
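
    The function above keeps the HTML tags balanced, but it does not weigh line breaks by themselves. One possible way to account for them is sketched below. It is only a rough idea, not part of the answer above: visual_length and $break_cost are hypothetical names, and the value 80 is an arbitrary guess at how many characters one line break is "worth" in the table. It also counts bytes with strlen, so multi-byte characters and HTML entities are over-counted.

```php
<?php
// Assumption: each line break (paragraph, list item, <br>) takes up roughly
// as much room as $break_cost characters. 80 is an arbitrary guess.
function visual_length($html, $break_cost = 80) {
    // Turn tags that end a line into a newline marker.
    $text = preg_replace('#<br\s*/?>|</(p|li|div|h[1-6])>#i', "\n", $html);
    // Drop the remaining tags, then collapse whitespace around newlines.
    $text = strip_tags($text);
    $text = trim(preg_replace('/\s*\n\s*/', "\n", $text));
    $breaks = substr_count($text, "\n");
    // Characters that are not line breaks, plus the cost of each break.
    return (strlen($text) - $breaks) + $breaks * $break_cost;
}

echo visual_length('Hello world'), "\n";                                         // 11
echo visual_length('<p>Hello world</p><ul><li>one</li><li>two</li></ul>'), "\n"; // 177
```

    If the result is over the limit (450 in the question), the text could then be shortened with a smaller $length until it fits.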