We have some legacy code that I am struggling to understand - the original author is gone.
Apparently, Ghostscript's PS-to-PDF conversion is very slow for certain files, but wrapping the graphic in an object definition as shown below speeds it up immensely: from 8+ hours down to 8.5 minutes for a ~20,000-page file. For comparison, Adobe Distiller takes ~20 minutes on the original with default options.
Original file extract (created with PReS):
/@GP
{
save exch mark exch
execform
cleartomark restore
} bd
...
gsave 0.62 0.62 scale @TestGraphic @GP grestore
Here @TestGraphic is an EPS image. That doesn't seem important, as other programs use different, non-EPS images and show similar issues.
Modified file:
[/_objdef {new_graphic} /BBox [0 0 595 842] /BP pdfmark
@TestGraphic @GP
[/EP pdfmark
...
gsave 0.62 0.62 scale
[ {new_graphic} /SP pdfmark
grestore
We have seen similar behaviour on both Unix and Windows over various gs
versions. The timings were conducted with fairly standard options:
"c:\Program Files (x86)\gs\gs9.21\bin\gswin32c" \
-dNOPAUSE -dBATCH -sDEVICE=pdfwrite -o test.pdf test.ps
I'm not too interested in diagnosing why it is slow (it would take a long time to remove sensitive data from the file, but I can try if it's really needed). What I want to know is what benefit the object definition and pdfmark commands provide.
The original script referenced http://wwwimages.adobe.com/content/dam/Adobe/en/devnet/postscript/pdfs/5113.Forms.pdf and was designed to enable execform caching on certain printer RIPs and Distiller (each of which struggled with large full-colour images). However, Ghostscript didn't support execform caching, so this alternate pdfmark technique was adopted, with no notes on why.
Edit: Added gist of form definition with data removed:
https://gist.github.com/anonymous/676924d451188276053b9b472279e382
You are most likely using older versions of Ghostscript. Your original fragment uses PostScript forms, which is unusual, and older versions of the pdfwrite device did not preserve forms as forms, but instead 'unrolled' the form definition each time it was used.
This will, unsurprisingly, result in much larger output files, especially if, as is likely, the form is the majority of the content and is used on every page.
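For readers unfamiliar with PostScript forms, here is a minimal sketch of a form that is defined once and painted on two pages with execform. The name DemoForm, the bounding box and the grey rectangle are all made up for illustration; this is not the form from the question's file. A pdfwrite that does not preserve forms would write the PaintProc contents into the PDF once per execform call, whereas one that preserves them can emit a single shared object.

```postscript
%!PS
% Hypothetical form dictionary; every name and value here is illustrative.
/DemoForm <<
  /FormType 1
  /BBox [0 0 100 100]
  /Matrix [1 0 0 1 0 0]
  /PaintProc {
    pop                    % discard the form dictionary passed to PaintProc
    0.8 setgray
    0 0 100 100 rectfill   % stand-in for the real (large) graphic
  } bind
>> def

% Page 1: paint the form
gsave 50 50 translate DemoForm execform grestore
showpage

% Page 2: paint the same form again
gsave 200 300 translate DemoForm execform grestore
showpage
```

With a real 20,000-page job the PaintProc would contain the large image, which is why repeating it on every page becomes so expensive.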
The pdfmark code defines a PDF object once and then references that object each time it is used. There is only one copy of the object, so the file is much smaller, and you spend much less time assembling it because you are no longer copying identical data 20,000 times.
Of course, new versions of the pdfwrite device will preserve forms, so most likely the benefit of creating and referencing the PDF object directly is long gone.
It's nothing to do with caching of forms (not 'execform caching'); it's to do with whether a form in the input PostScript is preserved as a form in the output PDF or not.
By the way, it is important to understand why the performance is poor. Without that, you can't possibly understand why another approach is faster.