How to get a PDF created with PHP WkHtmlToPdf to download with Slim 3 without the encoding breaking


I am creating a PDF from HTML using PHP WkHtmlToPdf.

If I save the file directly to my server, the PDF works:

$pdf->saveAs('/path/to/mypdf.pdf');

If I convert it to a string first and then save it to a file with PHP, the PDF still works:

$content = $pdf->toString();
$file = '/path/to/mypdf.pdf';
file_put_contents($file, $content);

But no matter what headers I include, the downloaded PDF does not work, and when I inspect it, it is full of � symbols (which, as far as I know, usually come from incorrect encoding).

I have tried every header configuration I can think of, and I still get the same result:

Here are a couple of my attempts:

$response = $response->withHeader( 'Content-type', 'application/pdf' );
$content = $pdf->toString();
$response->write($content);

and

$response = $response->withHeader('Content-Type', 'application/pdf');
$response = $response->withHeader('Pragma', "public");
$response = $response->withHeader('Content-disposition', 'attachment; filename=test.pdf');
$response = $response->withHeader('Content-Transfer-Encoding', 'binary');
$response = $response->withHeader('Content-Length', strlen($content));
$response = $response->write($content);
return $response;

Here is a sample of content from the pdf files before and after downloading:

before download:

%PDF-1.4
1 0 obj
<<
/Title (˛ˇMyPDF Test type1)
/Creator (˛ˇwkhtmltopdf 0.12.3)
/Producer (˛ˇQt 4.8.7)
/CreationDate (D:20170312220641+02'00')
>>
endobj
3 0 obj
<<
/Type /ExtGState
/SA true
/SM 0.02
/ca 1.0
/CA 1.0
/AIS false
/SMask /None>>
endobj
4 0 obj
[/Pattern /DeviceRGB]
endobj
7 0 obj
<<
/Type /XObject
/Subtype /Image
/Width 301
/Height 181
/BitsPerComponent 8
/ColorSpace /DeviceGray
/Length 8 0 R
/Filter /FlateDecode
>>
stream
xúÌ]w@Gfl£É†R,®à
å
[DçöœäΩ˜D£±ã[å∆ª£bÕg"v
®{/ÿQ∞+E§fi|ª;≥w{ª≥ºF>˘˝°∑;˝«ÓÏõ˜fiº!,|ÇÍU∂!§·P¡ªÇgÖåú$,}ªœ€uÒ—ÎåÔ/Ôûfl∑í£ïºí•ÇÇ]%õ)·)’á*}~=r˜mFF∆ãQ·≠‹Kà’V=®Æµk÷®^›fløJï ï¸|+VÙˆ™‡QäPt)$"Ã$a˝ï1©ëdFn_ÕøA◊´ÜÚˆ‚&n2äñ9FÂæ]M"õÂ&∫7
3îìØ—ÅîΩ}*˝ΩkÄ4Nªÿ¨£ƒ9Ké"(ôŒπL2#·3‰:æπ+c|$◊áYJd´Ùê˛¸"ê\z¿CL˚o‘≤ƒf∑;#É,–ÿv3˝ˇ©2í£h¯úÓfiJ©|«>n/i™]2…™˙Ä U∏üÍ∑W†˝¸≈˛∏¸•œ ‡JŸ“.J.Y
 YøäÁ≤
ª-fi‰ù‚Û£vd-∆&÷{ ‹˛ãŒòŒÁeêZñÿ≠W≤*˝ÆYJÚ£áI)ö˜¢™ã’†≤ü©⁄zÛ‘—cßn`5ˇ‹Ö_¬ı¢Ær˝K¸©O≤Bo©ÎV^Ÿ1(¥A`@˝–Å≥7_»Sß<j-RÖÓdπ›Ảò‹ õ|ém|€GD´¯zTñ_§‹e:)a√˙
ÇX€S·pPèd
»PÚ`EXvRŸNKÆ©sF

after download:

PDF-1.4

1 0 obj
<<
/Title (���M�y�P�D�F� �T�e�s�t� �t�y�p�e�1)
/Creator (���w�k�h�t�m�l�t�o�p�d�f� �0�.�1�2�.�3)
/Producer (���Q�t� �4�.�8�.�7)
/CreationDate (D:20170312222109+02'00')
>>
endobj
3 0 obj
<<
/Type /ExtGState
/SA true
/SM 0.02
/ca 1.0
/CA 1.0
/AIS false
/SMask /None>>
endobj
4 0 obj
[/Pattern /DeviceRGB]
endobj
7 0 obj
<<
/Type /XObject
/Subtype /Image
/Width 301
/Height 181
/BitsPerComponent 8
/ColorSpace /DeviceGray
/Length 8 0 R
/Filter /FlateDecode
>>
stream
x��]w@Gߣ��R,��
�
[D��ϊ��D���[����b�g"v
�{/�Q�+E��|�;�w{���F>����;�����޼!,|��U�!��P���g���$,}���u����/

�߷��������]%�)�)��*}~=r�mFFƋQ��K��V=���k�
n2��9F�]M"��&�7
3���с��}*��k�4N�ج��9K�"(�ιL2#�3�:��+c|$ׇYJd�����"�\z�CL�oԲ�f�;#�,��v3���2��h����J�|�>n/i�]2ɪ���U���W�����������J��.J.Y
 Y���
�-����vd-�&�{ ���Θ��e�Z�حW�*��YJ�I)�����ՠ����z���c�n`5�܅_����r�K��O�Bo��V^�1(�A`@�Ё�7_�S�<j-R��d��A����ʛ|�m|�GD��zT�_��e:)a��

Solution

  • One thing I noticed was that your PDF does not have the recommended binary comment under the header line.

    From the PDF reference:

    Note: If a PDF file contains binary data, as most do (see Section 3.1, “Lexical Conventions”), it is recommended that the header line be immediately followed by a comment line containing at least four binary characters—that is, characters whose codes are 128 or greater. This will ensure proper behavior of file transfer applications that inspect data near the beginning of a file to determine whether to treat the file’s contents as text or as binary.

    e.g.

    %PDF-1.4
    %����
    

    In Node that is written as %\xFF\xFF\xFF\xFF; I'm not sure what the equivalent in PHP would be.

    I don't see any options in phpwkhtmltopdf or wkhtmltopdf that seem directly related to this, but there was 'encoding' => 'UTF-8' as an option to the shell command.

    I would recommend exploring the options in this area, as it certainly appears that something between the server and you is re-encoding the file's bytes as text and corrupting them.
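
As a rough way to narrow this down, the checks above can be sketched in plain PHP. PHP's double-quoted strings accept the same \xFF escapes as Node, so the binary comment can be spelled "%\xFF\xFF\xFF\xFF". The function names below (hasBinaryComment, countReplacementChars, compareCopies) are hypothetical helpers, not part of phpwkhtmltopdf or Slim, and the sketch assumes you have both the working copy and the downloaded copy available as strings. It only diagnoses the problem; it does not fix the download.

```php
<?php
// Diagnostic sketch only. All function names here are hypothetical helpers,
// not part of phpwkhtmltopdf, wkhtmltopdf or Slim.

/**
 * Returns true if the line right after the %PDF-x.y header is a comment
 * containing at least four bytes with values of 128 or greater.
 */
function hasBinaryComment(string $pdf): bool
{
    // Split off the first two lines; PDF allows \r\n, \r or \n as end-of-line.
    $lines = preg_split('/\r\n|\r|\n/', $pdf, 3);
    if (count($lines) < 2 || strncmp($lines[0], '%PDF-', 5) !== 0) {
        return false;
    }
    $second = $lines[1];
    if ($second === '' || $second[0] !== '%') {
        return false;
    }
    // Without the /u modifier the pattern matches raw bytes, not code points.
    return preg_match_all('/[\x80-\xFF]/', $second) >= 4;
}

/**
 * Counts UTF-8 encoded U+FFFD replacement characters (bytes EF BF BD).
 * A large count suggests the data was decoded as text somewhere on the way.
 */
function countReplacementChars(string $bytes): int
{
    return substr_count($bytes, "\xEF\xBF\xBD");
}

/**
 * Compares the working copy with the downloaded copy, byte for byte.
 */
function compareCopies(string $original, string $downloaded): array
{
    return [
        'same_length'       => strlen($original) === strlen($downloaded),
        'same_hash'         => md5($original) === md5($downloaded),
        'replacement_chars' => countReplacementChars($downloaded),
    ];
}
```

For example, passing file_get_contents('/path/to/mypdf.pdf') and the contents of the downloaded file to compareCopies() should show whether the two differ and whether the download is full of replacement characters. A compressed stream can contain the bytes EF BF BD by chance, so a small count on its own proves nothing; a count in the hundreds alongside a changed length is a much stronger hint that the body was treated as text.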