I am documenting a system I maintain. The documentation contains a diagram I created in TeX/TikZ, which gets rendered to a PDF file. I then convert the PDF file to an image file (PNG, via ImageMagick) and include it in my HTML documentation. This works great.
Now I would like to create an image map for the image, so that I can add hyperlinks/mouseovers/etc. This is an image that I expect to update periodically based on changes in my system, so I would like to automate this process if possible.
Is there a way to use a software library or tool to automatically create image maps of the various text content in the PDF file, when it gets rendered to PNG?
Here is an example from this gist I created:
In this case I would like to turn some of the various text strings into hyperlinks by locating their bounding box in the PDF:
controller
actuator
sensor
A
B
C
D
u
y
F(s)
G(s)
H(s)
(They are all text content in the PDF file; I can select the text of any of them in Acrobat Reader and copy + paste into my text editor.)
Is there a way to do this?
I was able to put together the following Python solution, which could serve as a starting point. It converts the first page of the PDF to a PNG and prints the corresponding image map markup.
It takes the output DPI as an optional argument (default 200) so that the bounding boxes, which are reported in the PDF's default units of 72 per inch, can be scaled onto the PNG:
import argparse
import os

from pdf2image import convert_from_path
from pdfminer.converter import PDFPageAggregator
from pdfminer.layout import LAParams, LTTextBox
from pdfminer.pdfinterp import PDFPageInterpreter
from pdfminer.pdfinterp import PDFResourceManager
from pdfminer.pdfpage import PDFPage
from yattag import Doc, indent


def transform_coords(lobj, mb):
    # Transform LTTextBox bounding box to image map area bounding box.
    #
    # The bounding box of each LTTextBox is specified as:
    #
    # x0: the distance from the left of the page to the left edge of the box
    # y0: the distance from the bottom of the page to the lower edge of the box
    # x1: the distance from the left of the page to the right edge of the box
    # y1: the distance from the bottom of the page to the upper edge of the box
    #
    # So the y coordinates start from the bottom of the image. But with image map
    # areas, y coordinates start from the top of the image, so here we subtract
    # the bounding box's y-axis values from the total height.
    #
    # mb is the page's media box; mb[3] is its upper y coordinate, which equals
    # the page height as long as the media box starts at y = 0.
    return [lobj.x0, mb[3] - lobj.y1, lobj.x1, mb[3] - lobj.y0]


def get_imagemap(d):
    # Build a <map> element with one rectangular <area> per text box.
    doc, tag, text = Doc().tagtext()
    with tag("map", name="map"):
        for k, v in d.items():
            # href="#" is a placeholder; replace it with the real link target.
            doc.stag("area", shape="rect", coords=",".join(v), href="#", alt=k)
    return indent(doc.getvalue())


def get_bboxes(pdf, dpi):
    rsrcmgr = PDFResourceManager()
    device = PDFPageAggregator(rsrcmgr, laparams=LAParams())
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # PDFminer reports bounding boxes based on a dpi of 72. I could not find a way
    # to change this, so instead I scale each coordinate by multiplying by dpi/72
    scale = dpi / 72.0
    # Keep the file open until all page data has been read.
    with open(pdf, "rb") as fp:
        # Only the first page is processed.
        page = list(PDFPage.get_pages(fp))[0]
        interpreter.process_page(page)
        layout = device.get_result()
        # Note: the text is used as the dict key, so if two boxes contain the
        # same text, only the last one is kept.
        return {
            lobj.get_text().strip(): [
                str(int(x * scale)) for x in transform_coords(lobj, page.mediabox)
            ]
            for lobj in layout
            if isinstance(lobj, LTTextBox)
        }


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("pdf")
    parser.add_argument("--dpi", type=int, default=200)
    args = parser.parse_args()
    # Render the first page to a PNG next to the input PDF.
    page = list(convert_from_path(args.pdf, args.dpi))[0]
    page.save(f"{os.path.splitext(args.pdf)[0]}.png", "PNG")
    print(get_imagemap(get_bboxes(args.pdf, args.dpi)))


if __name__ == "__main__":
    main()
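To make the coordinate handling easier to check by hand, here is a minimal standalone sketch of the same flip-and-scale arithmetic with no PDF libraries involved. The function name `pdf_bbox_to_area`, the page height, and the sample box are made up for illustration; they are not taken from the diagram.

```python
def pdf_bbox_to_area(bbox, page_height, dpi=200):
    """Convert a PDF bounding box to image map coordinates.

    bbox is (x0, y0, x1, y1) in 1/72-inch units, measured from the
    bottom-left corner of the page. The result is [left, top, right, bottom]
    in pixels, measured from the top-left corner of the rendered image.
    """
    x0, y0, x1, y1 = bbox
    scale = dpi / 72.0
    # Flip the y axis: the upper edge (y1) becomes the top of the area.
    flipped = (x0, page_height - y1, x1, page_height - y0)
    return [int(v * scale) for v in flipped]


# Hypothetical box on a 100-unit-tall page rendered at 144 dpi (scale 2.0).
print(pdf_bbox_to_area((10, 20, 40, 30), page_height=100, dpi=144))
# -> [20, 140, 80, 160]
```

Note that `int()` truncates, so each edge can be off by up to one pixel; for link targets this is usually harmless.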
Example result for the diagram in the question:
<img src="https://i.sstatic.net/aXWMc.png" usemap="#map">
<map name="map">
<area shape="rect" coords="361,8,380,43" href="#" alt="B" />
<area shape="rect" coords="434,31,500,64" href="#" alt="G(s)" />
<area shape="rect" coords="432,93,502,117" href="#" alt="actuator" />
<area shape="rect" coords="552,8,572,42" href="#" alt="C" />
<area shape="rect" coords="596,58,609,86" href="#" alt="y" />
<area shape="rect" coords="105,26,119,40" href="#" alt="+" />
<area shape="rect" coords="107,54,122,78" href="#" alt="−" />
<area shape="rect" coords="35,58,51,86" href="#" alt="u" />
<area shape="rect" coords="164,8,182,43" href="#" alt="A" />
<area shape="rect" coords="163,152,183,187" href="#" alt="D" />
<area shape="rect" coords="241,31,311,64" href="#" alt="H(s)" />
<area shape="rect" coords="236,94,316,118" href="#" alt="controller" />
<area shape="rect" coords="243,175,309,208" href="#" alt="F (s)" />
<area shape="rect" coords="247,234,305,258" href="#" alt="sensor" />
</map>
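The markup above still has placeholder `href` values and includes text such as "+" and "−" that probably should not be linked. One possible next step, sketched here under assumptions, is to keep a separate table of labels and link targets and emit areas only for those. The names `normalize` and `linked_areas` and the URL `feedback.html` are hypothetical. Whitespace is stripped before matching because the extracted text can differ slightly from the source (note "F (s)" above).

```python
from html import escape


def normalize(label):
    # Drop all whitespace so that an extracted "F (s)" matches "F(s)".
    return "".join(label.split())


def linked_areas(bboxes, links):
    """Return <area> tags for the labels that have an entry in `links`.

    bboxes maps extracted text to a list of coordinate strings;
    links maps the labels we care about to their link targets.
    """
    wanted = {normalize(k): url for k, url in links.items()}
    tags = []
    for label, coords in bboxes.items():
        url = wanted.get(normalize(label))
        if url is None:
            continue  # text we do not want to turn into a link
        tags.append(
            '<area shape="rect" coords="{}" href="{}" alt="{}" />'.format(
                ",".join(coords), escape(url, quote=True), escape(label, quote=True)
            )
        )
    return tags


bboxes = {"F (s)": ["243", "175", "309", "208"], "+": ["105", "26", "119", "40"]}
links = {"F(s)": "feedback.html"}  # hypothetical link target
print("\n".join(linked_areas(bboxes, links)))
```

Keeping the link table separate from the extraction step means the diagram can be regenerated without touching the link definitions.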