pdflib - PHP: Inserting PDI object with background color as a single object/group


I'm using both an old and a recent version of PDFlib (7.0 and 9.2), and would prefer a solution that also works with 7. We have an application that composes a single PDF file from multiple smaller PDF files. These source files may not have an explicit background object, so they're inserted with transparency active (i.e. the background shows through).

The objects are inserted by using PDI ($p refers to the parent document):

$pdf_doc = PDF_open_pdi_document($p, $file, "");
$image = PDF_open_pdi_page($p, $pdf_doc, 1, "");
PDF_fit_pdi_page($p, $image, $x, $y, $boxsize . " position 50 fitmethod meet");
PDF_close_pdi_page($p, $image);

This works fine as long as the background is as expected (i.e. white). In those cases where someone wants to have a different background (either a different PDF file or a different color), we add a white box with the same boxsize first, then draw the PDF document over it - effectively creating a white background for that single inserted object.

function draw_box($pdf, $offset_x, $offset_y, $width, $height) {
    PDF_setcolor($pdf, 'fill', 'cmyk', 0, 0, 0, 0);
    PDF_rect($pdf, $offset_x, $offset_y, $width, $height);
    PDF_fill($pdf);
}

This works fine for viewing. The problem comes when someone wants to edit the resulting PDF later in Adobe Acrobat or Adobe Illustrator - the box that has been drawn as the background is not grouped together with the rest of the PDF content, making it harder to work with - you have to make sure you're also moving the white box behind the inserted PDF file.

I'd like to work around this without having to insert an explicit background object in all the source PDFs as that is not really a viable strategy because of the number of source PDFs.

I've tried working around the issue by creating a new PDF document, drawing the white box inside it, and then placing the source PDF on top of the box in that same document. Inserting the result into the parent seems to require writing the new PDF to disk and loading it again, which I'd like to avoid for performance reasons. The documentation mentions that a "virtual PDF file" can be used, but I haven't found an explanation of how to do that. The code below barfs when I try to create a PDI document from the in-memory PDF.

$inserted = PDF_new();
PDF_begin_document($inserted, "", '');
$inserted_page = PDF_begin_page_ext($inserted, 20, 20, '');

$pdf_doc = PDF_open_pdi_document($inserted, $file, "");
$image = PDF_open_pdi_page($inserted, $pdf_doc, 1, "");
PDF_fit_pdi_page($inserted, $image, $x, $y, $boxsize . " position 50 fitmethod meet");
PDF_close_pdi_page($inserted, $image);

// then create a PDI document to insert into the parent
// This barfs, since it expects a file.
$new = PDF_open_pdi_document($p, $inserted, "");

I've also tried drawing directly to the PDI document, but this resulted in a segmentation fault. I sadly don't have the code for that attempt available any longer.

So, are there any suggested solutions for either getting a white background as the default for the PDF inserted through PDI, or for merging the drawn box with the object inserted by PDI?


Solution

  • The main problem is the following:

    This works fine for viewing. The problem comes when someone wants to edit the resulting PDF later in Adobe Acrobat or Adobe Illustrator - the box that has been drawn as the background is not grouped together with the rest of the PDF content, making it harder to work with - you have to make sure you're also moving the white box behind the inserted PDF file

    PDF is a final format and is not intended for later editing, so it carries no "groupings" or other logical editing information.

    Editing PDF files will therefore never be as reliable as editing a document format designed for that purpose.

    Whether PDF elements are recognized as a group depends on the application. With a current Acrobat DC version, I was not able to move an entire imported page as a single object; it offered me several smaller objects to move instead.

    => I would not recommend editing PDF files.

    But from your description, it seems that the Acrobat/Illustrator versions you use treat an XObject as a single object that you can move. If this assumption is correct, you could encapsulate the white rectangle and the PDI page in a template. This workaround might work for your current versions, but might not work in later versions.

    For a detailed introduction to this feature, see the PDFlib 9.2 tutorial, chapter 3.2.4 "Templates (Form XObjects)". Its use is also demonstrated in the "repeated content" example in the PDFlib Cookbook. Templates are also available in the outdated PDFlib 7, but have been extended in the last decade.
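
The template idea can be sketched roughly as follows. This is a minimal, untested sketch rather than a verified solution: the helper name make_backed_template() and its $width/$height parameters are hypothetical. It also assumes that PDF_begin_template_ext() and PDF_end_template() are available in your PDFlib version (newer releases also offer PDF_end_template_ext()), and that the template may be started in the scope you call it from. Check the API reference for your version on both points.

```php
<?php
// Hypothetical helper: wraps a white rectangle and page 1 of $file in one
// template (Form XObject) and returns the template handle.
function make_backed_template($p, $file, $width, $height)
{
    $doc = PDF_open_pdi_document($p, $file, "");
    if ($doc == 0) {
        throw new Exception("Cannot open PDI document: " . PDF_get_errmsg($p));
    }
    $page = PDF_open_pdi_page($p, $doc, 1, "");
    if ($page == 0) {
        throw new Exception("Cannot open PDI page: " . PDF_get_errmsg($p));
    }

    // Everything drawn between begin and end becomes part of the template.
    $tpl = PDF_begin_template_ext($p, $width, $height, "");

    // White background, same drawing calls as draw_box() in the question.
    PDF_setcolor($p, 'fill', 'cmyk', 0, 0, 0, 0);
    PDF_rect($p, 0, 0, $width, $height);
    PDF_fill($p);

    // Imported page on top of the box, fitted into the template area.
    // %F formats the numbers independently of the current locale.
    $optlist = sprintf("boxsize {%F %F} position 50 fitmethod meet", $width, $height);
    PDF_fit_pdi_page($p, $page, 0, 0, $optlist);

    PDF_end_template($p);

    PDF_close_pdi_page($p, $page);
    PDF_close_pdi_document($p, $doc);

    return $tpl;
}
```

The returned handle would then be placed on the parent page like an image, for example with PDF_fit_image($p, $tpl, $x, $y, ""). Whether Acrobat or Illustrator then treats the box and the imported page as one movable object still depends on the application version.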

    About PVF (PDFlib Virtual File system): its use is demonstrated in the starter_pvf example, which is included in the PDFlib 7 and 9 download packages and is also available in the PDFlib Cookbook. In your case, create the first document in memory and retrieve its data with get_buffer(). For the new document, create a PVF file with a new name and the contents returned by get_buffer(), then open this file with open_pdi_document(). This way no files are written to disk.
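
The PVF steps above might look like the following sketch. It is untested, and the function names build_backed_pdf() and open_pdi_from_memory(), the $pvf_name parameter, and the page dimensions are all hypothetical. The starter_pvf example shipped with PDFlib remains the authoritative reference.

```php
<?php
// Hypothetical helper: builds a one-page PDF in memory that contains a white
// box with page 1 of $file on top, and returns the PDF data as a string.
function build_backed_pdf($file, $width, $height)
{
    $tmp = PDF_new();

    // An empty filename asks PDFlib to generate the document in memory.
    if (PDF_begin_document($tmp, "", "") == 0) {
        throw new Exception("Cannot start document: " . PDF_get_errmsg($tmp));
    }
    PDF_begin_page_ext($tmp, $width, $height, "");

    // White background box.
    PDF_setcolor($tmp, 'fill', 'cmyk', 0, 0, 0, 0);
    PDF_rect($tmp, 0, 0, $width, $height);
    PDF_fill($tmp);

    $doc = PDF_open_pdi_document($tmp, $file, "");
    if ($doc == 0) {
        throw new Exception("Cannot open PDI document: " . PDF_get_errmsg($tmp));
    }
    $page = PDF_open_pdi_page($tmp, $doc, 1, "");
    $optlist = sprintf("boxsize {%F %F} position 50 fitmethod meet", $width, $height);
    PDF_fit_pdi_page($tmp, $page, 0, 0, $optlist);
    PDF_close_pdi_page($tmp, $page);

    // The page and the document must be finished before fetching the buffer.
    PDF_end_page_ext($tmp, "");
    PDF_close_pdi_document($tmp, $doc);
    PDF_end_document($tmp, "");

    $data = PDF_get_buffer($tmp);
    PDF_delete($tmp);

    return $data;
}

// Hypothetical helper: registers $data under a virtual file name in the
// parent document $p and opens it through PDI. No disk file is involved.
function open_pdi_from_memory($p, $pvf_name, $data)
{
    PDF_create_pvf($p, $pvf_name, $data, "");
    $doc = PDF_open_pdi_document($p, $pvf_name, "");
    if ($doc == 0) {
        throw new Exception("Cannot open PVF document: " . PDF_get_errmsg($p));
    }
    return $doc;
}
```

The handle returned by open_pdi_from_memory() is used like any other PDI document handle (PDF_open_pdi_page(), PDF_fit_pdi_page(), and so on). Once the PDI document has been closed, the virtual file can be released with PDF_delete_pvf($p, $pvf_name).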