html, pdf, dompdf, word-wrap

Text overflowing tables when generating PDF with dompdf



I am generating some PDFs with dompdf that contain text and images in a table. If the text includes a long URL, the URL does not wrap and runs all the way to the end of the line. All of the text, including the URL, sits inside a div with a fixed width and height, yet the URL still overflows.

The same HTML rendered in the browser seems to be OK.

Any thoughts?


Solution

  • I believe dompdf uses a fairly limited set of characters to decide where it may split a line: right now it only breaks at a dash or a space. Something like the URL in your sample contains neither, so it runs past the width of the container. dompdf just doesn't know where to break it up.

    Starting with dompdf v0.6.0 you can style your text so that words are broken at any character, e.g.:

    <span style="word-wrap: break-word;">http://example.com/really/long/.../url</span>
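
    If the URLs arrive as part of free text, the span can be added before the HTML is handed to dompdf. The following is only a sketch: wrapLongUrls() is a hypothetical helper, not part of dompdf, and its URL pattern is a deliberately simple assumption.

    ```php
    <?php
    // Hypothetical helper (not part of dompdf): wraps every http(s) URL found
    // in plain text in a span styled with word-wrap: break-word.
    function wrapLongUrls(string $text): string
    {
        // Escape first so the text is safe to embed in HTML.
        $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

        // Simplistic URL pattern, assumed good enough for illustration:
        // "http://" or "https://" followed by any run of non-whitespace.
        return preg_replace(
            '~https?://\S+~u',
            '<span style="word-wrap: break-word;">$0</span>',
            $escaped
        );
    }

    $cell = wrapLongUrls('See http://example.com/really/long/path?a=1&b=2 for details');
    $html = '<table><tr><td style="width: 200px;">' . $cell . '</td></tr></table>';
    echo $html, PHP_EOL;
    ```

    The resulting $html can then be passed to dompdf as usual; only the URL is allowed to break at arbitrary characters, the rest of the text wraps normally.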
    

    It's not as clean as breaking on a particular character (e.g. a /). If you're comfortable hacking the code you can work around the problem a little more elegantly. Open up the text reflower class and modify the regular expression that splits the line. The regular expression looks like the following:

    preg_split('/([\s-]+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE)
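
    To see why a URL never gets a break opportunity, the pattern quoted above can be run on its own, outside dompdf. The sample string is made up for illustration.

    ```php
    <?php
    // Run the quoted pattern on a sample string (sample text is an assumption).
    $text  = 'See http://example.com/really/long/url for details';
    $words = preg_split('/([\s-]+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    // Each element is either a word or the delimiter run between two words.
    // The URL contains no whitespace and no dash, so it stays in one piece.
    print_r($words);
    ```

    Because the whole URL comes back as a single element, there is no point inside it where a line break could be placed.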
    

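    Modify that pattern to include whatever extra characters you think will make for a good line break. You might, for example, break URLs on ?, &, or even / if you expect to have extremely long URLs in your text. Keep the hyphen last in the character class so that it is read as a literal dash; newer PCRE versions can reject a hyphen placed directly after \s as an invalid range:

    $words = preg_split('/([\s?&\/-]+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);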
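
    A quick way to check an extended pattern before patching dompdf is to run it standalone. This sketch writes the hyphen last in the character class so it is taken as a literal dash; the sample URL is an assumption.

    ```php
    <?php
    // Extended delimiter set: whitespace, ?, &, / and a literal dash (kept last).
    $text  = 'http://example.com/a/b?x=1&y=2';
    $words = preg_split('/([\s?&\/-]+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    // The URL is now cut at every /, ? and &, and the delimiters are kept,
    // so joining the pieces gives back the original string.
    print_r($words);
    var_dump(implode('', $words) === $text);
    ```

    Each delimiter run becomes a possible break point, which is what lets the URL wrap inside a narrow cell.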
    

    In dompdf 0.6.1 the RegEx can be found in dompdf/include/text_frame_reflower.cls.php lines 86 and 371. In the upcoming 0.7.0 the RegEx can be found in dompdf/src/FrameReflower/Text.php lines 106 and 402.

    The drawback to modifying the RegEx is that this will affect all text (not just URLs).
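
    The effect on ordinary text can be checked the same way. In this sketch the sample sentence is made up, and the extended pattern is one possible variant with the hyphen written last.

    ```php
    <?php
    // Ordinary prose, no URL involved (sample text is an assumption).
    $text = 'input/output and AT&T';

    $default  = preg_split('/([\s-]+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    $extended = preg_split('/([\s?&\/-]+)/u', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    // With the extended pattern, "input/output" and "AT&T" are split too,
    // so either may end up divided across two lines.
    print_r($default);
    print_r($extended);
    ```

    If that is not acceptable, the word-wrap: break-word style applied only to the URL leaves the rest of the text untouched.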