I'd like to mark several keywords in a pdf document using Python and pymupdf.
The code looks as follows (source: original code):
import fitz
doc = fitz.open("test.pdf")
page = doc[0]
text = "result"
text_instances = page.searchFor(text)
for inst in text_instances:
    highlight = page.addHighlightAnnot(inst)
    highlight.setColors(colors='Red')
    highlight.update()
doc.save("output.pdf")
However, the text only gets marked on one page. I tried changing the code as described in the documentation for pymupdf (documentation) so that it iterates over all pages.
import fitz
doc = fitz.open("test.pdf")
for page in doc.pages(1, 3, 1):
    pass
text = "result"
text_instances = page.searchFor(text)
for inst in text_instances:
    highlight = page.addHighlightAnnot(inst)
    highlight.setColors(colors='Red')
    highlight.update()
doc.save("output.pdf")
Unfortunately, it still only marks the keywords on one page. What do I need to change, so the keywords get marked on all pages?
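There are two major issues with your code: the highlight code sits outside the page loop, and the page range starts at index 1, which skips the first page. Otherwise your understanding of the code seems fine.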
for page in doc.pages(1, 3, 1):
    pass
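If you want to loop over pages, you need to put your highlight code inside the page loop. With only pass in the loop body, the loop does nothing, and page is left pointing at the last page of the range, so only that page gets highlighted. In addition, you are starting on page 2, not page 1, because page 1 is represented by index 0.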
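As a small illustration of the loop issue, here is a sketch that uses plain integers in place of pages, so it runs without pymupdf. The names last_seen and processed are made up for this example.

```python
# A loop whose body is only `pass` does no work; afterwards the loop
# variable simply keeps the last value it was assigned.
for index in range(1, 3, 1):
    pass
last_seen = index  # only this one value is available after the loop

# Putting the work inside the loop body handles every value in the range.
processed = []
for index in range(0, 3, 1):
    processed.append(index)

print(last_seen)   # 2
print(processed)   # [0, 1, 2]
```

The same thing happens with page in the question: the search and highlight calls run once, after the loop, on whatever page was visited last.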
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import fitz
doc = fitz.open("test.pdf")
text = "result"
# page = doc[0]
# for page in doc.pages(start, stop, step):
for page in doc.pages(0, 3, 1):  # indices 0, 1, 2 (stop is exclusive, as in range)
    text_instances = page.searchFor(text)
    for inst in text_instances:
        highlight = page.addHighlightAnnot(inst)
        # setColors expects a dict of RGB tuples (floats in 0..1), not a color name
        highlight.setColors({"stroke": (1, 0, 0)})
        highlight.update()
doc.save("output.pdf")
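Since the question asks for all pages, here is a minimal sketch that iterates over the document itself instead of a fixed range. It assumes a recent PyMuPDF release, where the methods use snake_case names (search_for, add_highlight_annot, set_colors); on older releases the camelCase names used above apply. The helper names highlight_keyword and highlight_file are made up for this example.

```python
def highlight_keyword(doc, keyword, stroke=(1, 0, 0)):
    """Highlight every occurrence of `keyword` on every page of `doc`.

    `doc` is expected to behave like a PyMuPDF Document, i.e. iterating
    over it yields pages. Returns the number of highlights added.
    """
    count = 0
    for page in doc:  # visits every page, starting at index 0
        for rect in page.search_for(keyword):
            annot = page.add_highlight_annot(rect)
            annot.set_colors(stroke=stroke)  # RGB floats in 0..1; red by default
            annot.update()
            count += 1
    return count


def highlight_file(src, dst, keyword):
    """Open `src`, highlight `keyword` on all pages, and save to `dst`."""
    import fitz  # PyMuPDF; imported here so the helper above works without it

    doc = fitz.open(src)
    count = highlight_keyword(doc, keyword)
    doc.save(dst)
    doc.close()
    return count
```

With the file names from the question, this would be called as highlight_file("test.pdf", "output.pdf", "result"). To mark several keywords, highlight_keyword can be called once per keyword on the same doc before saving.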