
How can I create fixed-width paragraphs with PDFBox?


I can insert simple text like this:

PDDocument document = new PDDocument();
PDPage page = new PDPage(PDPage.PAGE_SIZE_A4);
document.addPage(page);
PDPageContentStream content = new PDPageContentStream(document, page);
content.beginText();
content.setFont(PDType1Font.HELVETICA, 12); // a font must be set before drawing text
content.moveTextPositionByAmount(10, 10);
content.drawString("test text");
content.endText();
content.close();

But how can I create a paragraph with a fixed width, similar to setting the width style in HTML?

<p style="width:200px;">test text</p>

Solution

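  • Warning: this answer applies to an old version of PDFBox and relies on features that have since been deprecated (in PDFBox 2.x, for example, moveTextPositionByAmount and drawString were superseded by newLineAtOffset and showText). See the comments below for more details.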

    According to this answer, it's not possible to insert line breaks into a string and have the PDF display them correctly (whether using PDFBox or something else), so I believe wrapping text to fit a given width is also something it can't do automatically. (Besides, there are many ways to wrap text: at whole words only, by breaking words into smaller parts, etc.)

    This answer to another question (about centering a string) gives some pointers on how to do this yourself. Assuming you wrote a function possibleWrapPoints(String):int[] that lists all points in the text where a word wrap can happen (excluding "zero", including "text length"), one possible solution could be:

    PDFont font = PDType1Font.HELVETICA_BOLD; // Or whatever font you want.
    int fontSize = 16; // Or whatever font size you want.
    int paragraphWidth = 200;
    String text = "test text";

    // Line height taken from the font's bounding box, scaled to the font size.
    float lineHeight = font.getFontDescriptor().getFontBoundingBox().getHeight() / 1000 * fontSize;
    float startY = 700; // Arbitrary: baseline of the first line, measured from the bottom of the page.

    int start = 0;
    int end = 0;
    content.beginText();
    content.setFont(font, fontSize);
    // Offsets are relative to the start of the previous line, so position the first line once...
    content.moveTextPositionByAmount(10, startY);
    for ( int i : possibleWrapPoints(text) ) {
        float width = font.getStringWidth(text.substring(start,i)) / 1000 * fontSize;
        if ( start < end && width > paragraphWidth ) {
            // Draw partial text, then move down one line (y grows upwards in PDF)
            content.drawString(text.substring(start,end));
            content.moveTextPositionByAmount(0, -lineHeight);
            start = end;
        }
        end = i;
    }
    // Last piece of text
    content.drawString(text.substring(start));
    content.endText();
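
    The width check in the loop above divides by 1000 because the font reports string widths in thousandths of a text-space unit, which then have to be scaled by the font size. A minimal sketch of that arithmetic on its own, where GlyphUnits, toPoints and the value 2500 are made up for illustration (a real value would come from font.getStringWidth):

```java
public class GlyphUnits {
    // Font widths are expressed in 1/1000 of a text-space unit,
    // so the width in points is units / 1000 * fontSize.
    static float toPoints(float glyphUnits, float fontSize) {
        return glyphUnits / 1000f * fontSize;
    }

    public static void main(String[] args) {
        float units = 2500f;   // hypothetical value, standing in for font.getStringWidth(...)
        float fontSize = 16f;
        float points = toPoints(units, fontSize);
        System.out.println(points);        // prints 40.0
        System.out.println(points > 200f); // prints false: this string fits in a 200-point paragraph
    }
}
```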
    

    One example of possibleWrapPoints, which allows wrapping after any character that's not part of a word (reference), could be:

    int[] possibleWrapPoints(String text) {
        String[] split = text.split("(?<=\\W)");
        int[] ret = new int[split.length];
        ret[0] = split[0].length();
        for ( int i = 1 ; i < split.length ; i++ )
            ret[i] = ret[i-1] + split[i].length();
        return ret;
    }
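
    To try the wrapping logic without a PDF library, the two pieces above can be combined into a function that returns the wrapped lines instead of drawing them. This is only a sketch: WrapSketch, wrap and the measure parameter are hypothetical names, and the demo pretends every character is 10 units wide, whereas real code would measure with font.getStringWidth(s) / 1000 * fontSize.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class WrapSketch {
    // Same idea as possibleWrapPoints above: a break is allowed after every non-word character.
    static int[] possibleWrapPoints(String text) {
        String[] split = text.split("(?<=\\W)");
        int[] ret = new int[split.length];
        ret[0] = split[0].length();
        for (int i = 1; i < split.length; i++) {
            ret[i] = ret[i - 1] + split[i].length();
        }
        return ret;
    }

    // Greedy wrapping: keep extending the line until the next wrap point would overflow.
    // 'measure' is a stand-in (assumption) for the font-based width calculation.
    static List<String> wrap(String text, double maxWidth, ToDoubleFunction<String> measure) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        int end = 0;
        for (int i : possibleWrapPoints(text)) {
            double width = measure.applyAsDouble(text.substring(start, i));
            if (start < end && width > maxWidth) {
                lines.add(text.substring(start, end));
                start = end;
            }
            end = i;
        }
        lines.add(text.substring(start)); // last piece of text
        return lines;
    }

    public static void main(String[] args) {
        // Demo assumption: each character is 10 units wide, the paragraph is 100 units wide.
        List<String> lines = wrap("the quick brown fox", 100, s -> s.length() * 10.0);
        for (String line : lines) {
            System.out.println("[" + line + "]");
        }
        // prints:
        // [the quick ]
        // [brown fox]
    }
}
```

    Note that trailing spaces stay attached to the line they end and are included in its measured width. A single word wider than the paragraph is not split and will simply overflow.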
    

    Update: some additional info:

    • The PDF file format was designed to look the same in different situations. Functionality like what you requested makes sense in a PDF editor/creator, but not in the PDF file per se. For this reason, most "low level" tools tend to concentrate on dealing with the file format itself and leave out features like that.

    • Higher level tools, OTOH, usually have a means of doing this conversion. An example is Platypus (for Python, though), which does have easy ways of creating paragraphs and relies on the lower level ReportLab functions to do the actual PDF rendering. I'm unaware of similar tools for PDFBox, but this post gives some hints on how to convert HTML content to PDF in a Java environment, using freely available tools. I haven't tried them myself, but I'm posting this here since it might be useful (in case my hand-made attempt above is not enough).