With FOP 2.1 I can't produce a PDF with metadata that conforms to PDF/A-1a. According to pdfbox preflight, the output does not even conform to PDF/A-1b.
Let's say I want to set date, language, title and description:
<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
    xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xml:lang="de">
  <x:xmpmeta xmlns:x="adobe:ns:meta/" id="hc_meta">
    <rdf:RDF>
      <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
        <xmp:CreatorTool>hx</xmp:CreatorTool>
        <dc:language>
          <rdf:Bag>
            <rdf:li>de</rdf:li>
          </rdf:Bag>
        </dc:language>
        <dc:title>
          <rdf:Alt>
            <rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG</rdf:li>
          </rdf:Alt>
        </dc:title>
        <dc:creator>
          <rdf:Seq>
            <rdf:li>hxxxdingens Consulting GmbH, Rodger Moore</rdf:li>
          </rdf:Seq>
        </dc:creator>
        <dc:description>
          <rdf:Alt>
            <rdf:li xml:lang="de">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
          </rdf:Alt>
        </dc:description>
        <dc:date>
          <rdf:Seq>
            <rdf:li>2016:06:30</rdf:li>
          </rdf:Seq>
        </dc:date>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
</fo:declarations>
Then the output will not conform:
$ java -jar ~/prog/hcbriefe/preflight-app-2.0.2.jar test_1.pdf
The file test_1.pdf is not valid, error(s) :
7.2 : Error on MetaData, Title present in the document catalog dictionary can't be found in XMP information (Property is not defined)
7.2 : Error on MetaData, Subject present in the document catalog dictionary can't be found in XMP information (Subject not found in XMP (dc:description["x-default"] not found))
But when I use exiftool to set the title and description on the PDF, it passes the test:
$ cp test_1.pdf test_1mod.pdf
$ exiftool -title="Schrieb 2016-003 - Dings AG" \
-description="Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)" \
test_1mod.pdf
1 image files updated
$ java -jar ~/prog/hcbriefe/preflight-app-2.0.2.jar test_1mod.pdf
The file test_1mod.pdf is a valid PDF/A-1b file
What do I have to put in the FO metadata so that the PDF conforms straight out of FOP?
After comparing the two files I found the cause. The xml:lang attribute of the rdf:li entries in the dc:title and dc:description elements must not be set to de; it must be set to x-default, like this:
...
<dc:title>
  <rdf:Alt>
    <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG</rdf:li>
  </rdf:Alt>
</dc:title>
...
<dc:description>
  <rdf:Alt>
    <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
  </rdf:Alt>
</dc:description>
<dc:date>
  <!-- some validators will complain if date has : instead of - !! -->
  <rdf:Seq>
    <rdf:li>2016-06-30</rdf:li>
  </rdf:Seq>
</dc:date>
...
With these changes the PDF passes the pdfbox preflight test.
In addition, the date must use - as the separator between year, month and day (2016-06-30, not 2016:06:30) to satisfy the online pdf-tools.com validator.
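For reference, here is a sketch of what the complete declarations block from the question might look like with both changes applied. Every value is carried over from the question. The only differences are the xml:lang="x-default" attributes on the title and description entries and the - separators in dc:date. I have only checked the title, description and date parts as described above, so treat the rest as copied rather than separately verified.

```xml
<fo:declarations xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
    xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xml:lang="de">
  <x:xmpmeta xmlns:x="adobe:ns:meta/" id="hc_meta">
    <rdf:RDF>
      <rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
        <xmp:CreatorTool>hx</xmp:CreatorTool>
        <dc:language>
          <rdf:Bag>
            <rdf:li>de</rdf:li>
          </rdf:Bag>
        </dc:language>
        <dc:title>
          <rdf:Alt>
            <!-- changed: x-default instead of de -->
            <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG</rdf:li>
          </rdf:Alt>
        </dc:title>
        <dc:creator>
          <rdf:Seq>
            <rdf:li>hxxxdingens Consulting GmbH, Rodger Moore</rdf:li>
          </rdf:Seq>
        </dc:creator>
        <dc:description>
          <rdf:Alt>
            <!-- changed: x-default instead of de -->
            <rdf:li xml:lang="x-default">Schrieb 2016-003 - Dings AG XXX R 7 99 3 - 2016-06-30 (2016:06:30)</rdf:li>
          </rdf:Alt>
        </dc:description>
        <dc:date>
          <rdf:Seq>
            <!-- changed: 2016-06-30 instead of 2016:06:30 -->
            <rdf:li>2016-06-30</rdf:li>
          </rdf:Seq>
        </dc:date>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
</fo:declarations>
```

The x-default entry is the one the preflight error message asks for (dc:description["x-default"] not found), which is why a de-only entry is not enough even though the text itself is German.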