I am validating a PDF upload by checking its header. I have output the portion of the header that I need to check, and the value returned is %PDF-1.4
I know there are several ways to do this: I can use strpos, or I can use substr to pull positions 0 through 3 and check for %PDF to validate it.
My question is: do all PDF file headers contain %PDF within the first 7 positions? I am pulling 7 because I need the extra bytes to check the other file types I am also validating.
Is %PDF version-specific, or is it global? I know I could just use strpos for PDF, but if I can be more specific then my validation will be stricter. In fact, if all of them started with %PDF-1, that would be more specific still.
"My question is: do all PDF file headers contain %PDF within the first 7 positions?"
Yes, they do. As of PDF version 1.7, the first 4 characters of a conforming file are always %PDF, and this seems unlikely to change in the future. Refer to section 3.4.1 of the PDF Format Specification document (page 92):
The first line of a PDF file is a header identifying the version of the PDF specification to which the file conforms. For a file conforming to PDF 1.7, the header should be
%PDF-1.7
However, since any file conforming to an earlier version of PDF also conforms to version 1.7, an application that processes PDF 1.7 can also accept files with any of the following headers:
%PDF-1.0
%PDF-1.1
%PDF-1.2
%PDF-1.3
%PDF-1.4
%PDF-1.5
%PDF-1.6
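As a rough illustration of the check discussed above, here is a minimal PHP sketch. The function names hasPdfSignature and pdfHeaderVersion are hypothetical, not part of any library, and the sketch assumes you already hold the first few bytes of the upload in a string. It only inspects the header, so it says nothing about whether the rest of the file is a valid PDF.

```php
<?php
// Hypothetical helper: true when the buffer starts with the "%PDF-" marker.
// Works with the 7 bytes mentioned in the question, since only 5 are compared.
function hasPdfSignature(string $header): bool
{
    // strncmp compares at most 5 bytes; a shorter buffer simply fails to match.
    return strncmp($header, '%PDF-', 5) === 0;
}

// Hypothetical helper: returns the "major.minor" version from the header,
// or null if the buffer does not start with a complete "%PDF-x.y" header.
// Needs at least 8 bytes (for example "%PDF-1.4").
function pdfHeaderVersion(string $header): ?string
{
    // \A anchors the match at offset 0; "-" is the plain ASCII hyphen (0x2D).
    if (preg_match('/\A%PDF-(\d\.\d)/', $header, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

var_dump(hasPdfSignature('%PDF-1.'));   // 7-byte buffer, as in the question
var_dump(pdfHeaderVersion('%PDF-1.4')); // 8-byte buffer
```

Matching on %PDF- rather than %PDF-1 is a deliberate choice in this sketch: a stricter %PDF-1 test would reject files whose header declares a later major version, such as %PDF-2.0. Note also that the header in a real file uses the ASCII hyphen, so compare against that byte rather than a typographic minus sign copied from a formatted document.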