compute bounding box for arbitrary PostScript code, from within PostScript

I want to be able to run some arbitrary PostScript code and find out what its bounding box is, for the purpose of centering and/or scaling that PostScript code when I run it again. This all needs to happen within a single invocation of the PostScript interpreter (i.e. within a .ps file that I can send to a printer).

Pseudocode:

Run this PostScript code without drawing anything
    <some arbitrary PostScript code>
Get the bounding box of the code that just ran
Do some computations to center it, and/or scale it if it's too big
Run the PostScript code with scaling/translation, and actually draw
    <the same arbitrary PostScript code>

The arbitrary PostScript code might draw multiple paths, so it's not as simple as just calling pathbbox. (Is there a way to combine an arbitrary chunk of PostScript code into a single path, so that pathbbox could be used?)

This seems similar to this question, which unfortunately has remained unanswered since 1992.

An unacceptable answer is any answer which involves invoking the PostScript interpreter more than once (such as using the Ghostscript bbox device). This has to run as a normal PostScript program, on any(*) compliant PostScript interpreter.

(*) It's acceptable to require PostScript Level 3, but it's not acceptable to require a specific implementation (e.g. Ghostscript).

Solution

This is 'possible' but non-trivial, and the answer depends a great deal on how much accuracy, performance and reliability matter to you, as well as how much time you are prepared to invest in the project and your level of PostScript programming ability.

In theory you could redefine each of the PostScript marking operators (eg stroke, fill, image, all the show variants etc) and instead of drawing the result, determine the area of the page marked by that operation. For some operators (eg rectfill) this is trivial, for others it would be more complex but it is still possible to determine the area marked by any PostScript operation by using charpath and pathforall.
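As a rough illustration of the redefinition idea, here is a minimal sketch that covers only fill, eofill and stroke. All the names in it (BBoxDict, Measure, merge-bbox, record-path, Art) are made up for the example. It ignores clipping, text and images, and because the running box lives in a dictionary, input that wraps its drawing in save/restore would roll the recorded values back, so treat it as a starting point rather than a working tool.

```postscript
%!
% Sketch only. BBoxDict, Measure, merge-bbox, record-path and Art are
% hypothetical names, not part of PostScript.

/BBoxDict 4 dict def
BBoxDict begin
  /llx 1e30 def  /lly 1e30 def  /urx -1e30 def  /ury -1e30 def
end

% x0 y0 x1 y1 merge-bbox -    grow the accumulated box to cover the given box
/merge-bbox {
  BBoxDict begin
    dup ury gt { /ury exch def } { pop } ifelse
    dup urx gt { /urx exch def } { pop } ifelse
    dup lly lt { /lly exch def } { pop } ifelse
    dup llx lt { /llx exch def } { pop } ifelse
  end
} bind def

% - record-path -    measure the current path in default user space
/record-path {
  gsave
    initmatrix                  % changes only the space pathbbox reports in
    mark { pathbbox } stopped
      { cleartomark }           % empty path: nothing to record
      { merge-bbox pop }        % pop removes the mark
    ifelse
  grestore
} bind def

% Replacement operators: measure instead of painting.
/Measure 3 dict def
Measure begin
  /fill   { record-path newpath } bind def
  /eofill { record-path newpath } bind def
  /stroke { gsave strokepath record-path grestore newpath } bind def
end

% Stand-in for the arbitrary code. It must be read after the replacements
% are in place and not already bound to the real operators.
/Art {
  gsave
    100 100 translate 45 rotate 0 -50 translate
    0 0 moveto 0 100 lineto 50 100 lineto 50 0 lineto closepath
    fill
  grestore
} def

Measure begin Art end                       % dry run: nothing is painted
BBoxDict begin [ llx lly urx ury ] end ==   % report the accumulated box
Art showpage                                % real run, using the real fill
```

The replacements sit in their own dictionary so that a begin/end pair switches them on and off. The second run of Art is where any translate or scale computed from the recorded box would go.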

Now to cover your first point; pathbbox itself is not sufficient, because any PostScript drawing operation can be drawn through a clip and a clip is not necessarily a simple rectangle.

Consider this simple example which does use a rectangular clip:

%!

100 100 translate

0 0 0 setrgbcolor
-45 rotate
0 0 moveto
0 100 lineto
50 100 lineto
50 0 lineto
closepath
clip
newpath

90 rotate
0 -50 translate
0 1 0 setrgbcolor
0 0 moveto
0 100 lineto
50 100 lineto
50 0 lineto
closepath
fill

showpage

To help visualise what's going on, let's draw the clip as a black stroke.

[Figures: the example with the clip stroked in black, and the rendered result]

If we applied pathbbox to the path for the fill, it would not take the clip into account (and it would report its coordinates in the current user space, as determined by the CTM), so the result would be wrong when compared to what is actually rendered. Executing pathbbox, then undoing the CTM and converting to default user space, results in a bbox of 64.64, 64.64, 170.71, 170.71, whereas the actual bbox returned by the bbox device at 72 dpi is 100, 64, 171, 136.
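For what it's worth, one way to do the "undo the CTM" step is sketched below. It assumes the usual behaviour that the current path is held in device space, so resetting the CTM with initmatrix before calling pathbbox only changes the coordinate system the answer is reported in. The path is the filled rectangle from the example, built without the clip.

```postscript
%!
% Sketch: bounding box of the example's fill path in default user space.
gsave
  100 100 translate 45 rotate 0 -50 translate   % net CTM of the example at the fill
  0 0 moveto 0 100 lineto 50 100 lineto 50 0 lineto closepath
  initmatrix      % the path itself is untouched, only the reporting space changes
  pathbbox        % leaves llx lly urx ury on the operand stack
grestore
4 array astore == % print the four numbers as an array
```

On an interpreter whose default matrix is not rotated, the four numbers should be close to the 64.64, 64.64, 170.71, 170.71 quoted above, which is the unclipped box and not what actually gets marked.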

You would have to intersect the current clip with the current path (using clippath to retrieve the current clip, and pathforall or upath to capture the current path if a path is being drawn), or with a rectangular path if executing something like the image operator, and then work out the bounding box of that intersection in order to determine the bounding box of what is rendered.

There are other difficulties. When lines are stroked you need to account for the line width in order to determine the area which is marked; line joins can be mitred and you need to determine where the mitre terminates; and curves can extend beyond the edges of a simplistic rectangular bounding box based on their endpoints.

All of these problems are dealt with, when rendering, by scan-converting the path. Basically this means converting the complex shape into a series of filled rectangles. In the limiting case the height of the rectangle is one scan line (ie one pixel high). Clipping is then simply a case of intersecting rectangles.

This does, of course, lead to accuracy limits, because the rectangles are at device resolution. The result of scan converting shallow curves at 72 dpi may not be the same as scan-converting at 720 dpi. This is why the Ghostscript bbox device uses a high resolution.

Now PostScript is a programming language so obviously you can do the scan-conversion and rectangle intersection in PostScript. On the other hand, it is also an interpreted language and so comparatively slow; this is where the performance limitation comes in. For a complex PostScript input, scan-converting and intersecting a list of rectangles could take quite a long time (and indeed use up quite a lot of memory to store the lists of rectangles) if performed in PostScript.

You could, probably, get the current clip path and the current path, run pathbbox on each of them and then intersect the two resulting boxes. For a filled path that gives you an upper bound: the area actually marked can be smaller than this intersection, but not larger. It's probably good enough for your purposes and it doesn't involve decomposing to the device resolution. Note that pathbbox reports in the current user space, so you'll need to undo the effect of the CTM to get the box in default user space.
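A minimal sketch of that approximation, applied to the example from earlier, might look like this. intersect-bbox and clipped-bbox are names invented for the sketch, and it assumes there is a non-empty current path when clipped-bbox is called.

```postscript
%!
% Sketch only. intersect-bbox and clipped-bbox are hypothetical names.

% x0 y0 x1 y1  X0 Y0 X1 Y1  intersect-bbox  ix0 iy0 ix1 iy1
/intersect-bbox {
  8 dict begin
    /Y1 exch def /X1 exch def /Y0 exch def /X0 exch def
    /y1 exch def /x1 exch def /y0 exch def /x0 exch def
    x0 X0 gt { x0 } { X0 } ifelse     % larger of the two left edges
    y0 Y0 gt { y0 } { Y0 } ifelse     % larger of the two bottom edges
    x1 X1 lt { x1 } { X1 } ifelse     % smaller of the two right edges
    y1 Y1 lt { y1 } { Y1 } ifelse     % smaller of the two top edges
  end
} def

% - clipped-bbox llx lly urx ury    both boxes measured in default user space
/clipped-bbox {
  gsave initmatrix pathbbox grestore            % box of the current path
  gsave clippath initmatrix pathbbox grestore   % box of the current clip
  intersect-bbox
} def

gsave
  100 100 translate -45 rotate
  0 0 moveto 0 100 lineto 50 100 lineto 50 0 lineto closepath clip newpath
  90 rotate 0 -50 translate
  0 0 moveto 0 100 lineto 50 100 lineto 50 0 lineto closepath
  clipped-bbox
grestore
4 array astore ==
```

For this example the result should be roughly 100, 64.64, 170.71, 170.71. Compared with the 100, 64, 171, 136 reported by the bbox device, three edges agree and the top edge is overestimated, which is the price of intersecting boxes rather than shapes. If the left value comes out greater than the right, or the bottom greater than the top, the two boxes don't overlap and nothing is marked.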

Again, be aware that a stroked path is drawn with a line which has a width, and the path is the centre of that line, so you'll need to account for half the line width lying beyond the pathbbox.
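One way to avoid doing the line-width arithmetic by hand is to let the interpreter build the outline of the stroke with strokepath, which takes the current line width, caps, joins and miter limit into account, and then measure that outline. A small sketch follows; stroke-bbox is an invented name, and the line is drawn under the default CTM.

```postscript
%!
% Sketch only. stroke-bbox is a hypothetical name.

% - stroke-bbox llx lly urx ury    box of the area a stroke would mark
/stroke-bbox {
  gsave
    strokepath            % replace the path with the outline of the stroke
    initmatrix pathbbox   % measure that outline in default user space
  grestore
} def

gsave
  10 setlinewidth
  100 100 moveto 200 100 lineto   % horizontal line, default butt caps
  stroke-bbox
grestore
4 array astore ==
```

With butt caps the result should be roughly 100, 95, 200, 105: the path's own box grown by half the line width above and below.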

Finally, reliability. It isn't common, but truly arbitrary PostScript could define its own prologue which reads the operators directly from systemdict, rather than using the current definitions (eg /myfill systemdict /fill get def). PostScript which did that would evade the redefinition of the operators, so the 'bbox' program would never run for those operations and the measurement would not work.
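A tiny demonstration of the problem, with a counter standing in for the measuring code (FillCount and myfill are just names for the example):

```postscript
%!
% Sketch only. FillCount and myfill are hypothetical names.
/FillCount 0 def
/fill { /FillCount FillCount 1 add def newpath } def   % stand-in for a measuring override

/myfill systemdict /fill get def    % what an evasive prologue might do

0 0 moveto 10 0 lineto 10 10 lineto closepath fill     % goes through the override
0 0 moveto 10 0 lineto 10 10 lineto closepath myfill   % bypasses it and really fills
FillCount ==                                           % prints 1, not 2
```

The same thing happens if the input was processed with bind before the replacements were installed, since its procedures then already contain the real operators.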

So there's a solution; I don't claim it's the only one, but I don't know of a better solution which will work purely in PostScript and with any PostScript interpreter. It will be quite an undertaking to write, I should think.