I would like to read all images found in a PDF file with PyMuPDF as OpenCV images, as close as possible to the source (avoiding format conversions that would lose precision). Basically, I would like the result to be exactly the same as if I had called cv2.imread(filename), in terms of the output type, color space, etc.:
    # Libraries
    import os
    import cv2
    import fitz
    import numpy as np

    # Input file
    filename = "myfile.pdf"

    # Read all images in file as a list of opencv images
    def read_images(filename):
        images = []
        _, extension = os.path.splitext(filename)
        # If it's a pdf process each image
        if extension == ".pdf":
            pdf = fitz.open(filename)
            for index in range(len(pdf)):
                page = pdf[index]
                # get_images() was called getImageList() in older PyMuPDF releases
                for im in page.get_images():
                    xref = im[0]
                    pix = fitz.Pixmap(pdf, xref)
                    images.append(pix_to_opencv_image(pix))  # DO SOMETHING HERE
        # Otherwise just do an imread
        else:
            images.append(cv2.imread(filename))
        return images
Basically, I would like to know what the function pix_to_opencv_image should be:

    # Equivalent of doing a "cv2.imread" on a pdf pixmap:
    def pix_to_opencv_image(pix):
        # DO SOMETHING HERE
        pass
I found examples explaining how to convert PDF pixmaps to numpy arrays, but nothing that outputs an OpenCV image.
How can I achieve this?
I used the help() function to list the data descriptors associated with the pixmap:

    help(pix)
pix.samples stores the image data as bytes. Using numpy's frombuffer, the image array can be obtained from these bytes after reshaping accordingly. pix.height and pix.width give the height and width of the image array, respectively, and pix.n is the number of channels (components per pixel, including alpha if present). These can be used to reshape the resulting array.
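To make the byte layout concrete, here is a minimal sketch that uses a made-up buffer in place of pix.samples (the sizes and values are assumptions for illustration, not taken from a real PDF). It shows how the flat bytes map onto (height, width, n) and that the resulting array is read-only:

```python
import numpy as np

# Hypothetical stand-in for pix.samples: a 2x3 image with 3 channels,
# stored row by row, one byte per channel (values 0..17).
height, width, n = 2, 3, 3
samples = bytes(range(height * width * n))

flat = np.frombuffer(samples, dtype=np.uint8)  # shape (18,)
img = flat.reshape(height, width, n)           # shape (2, 3, 3)

# The pixel at row 1, column 2 is the last three bytes of the buffer.
print(img[1, 2])            # [15 16 17]

# frombuffer only wraps the bytes, so the array is read-only;
# take a copy if OpenCV needs to draw on it or modify it in place.
print(img.flags.writeable)  # False
writable = img.copy()
```

Because no data is copied or converted, the pixel values are exactly those stored in the pixmap.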
Your complete function would be:
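    def pix_to_image(pix):
        # Wrap the raw sample bytes in a flat uint8 array (no copy is made)
        samples = np.frombuffer(pix.samples, dtype=np.uint8)
        # Reshape to (rows, columns, channels), the layout OpenCV expects
        img = samples.reshape(pix.height, pix.width, pix.n)
        return img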
You can display the result using cv2.imshow().
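One caveat if the goal is to match cv2.imread exactly: pixmap samples are in RGB order (or a single gray channel), while cv2.imread returns a 3-channel BGR array by default, so colors will look swapped unless the channels are reordered. The following is a rough sketch of that extra step, not a definitive implementation. The helper name pix_to_bgr and the stand-in object are hypothetical, and it assumes the pixmap is gray or RGB; a CMYK pixmap would first need converting with fitz.Pixmap(fitz.csRGB, pix).

```python
from types import SimpleNamespace
import numpy as np

def pix_to_bgr(pix):
    """Sketch: turn a gray or RGB pixmap (optionally with alpha) into a BGR array."""
    img = np.frombuffer(pix.samples, dtype=np.uint8)
    img = img.reshape(pix.height, pix.width, pix.n)
    colors = pix.n - int(pix.alpha)  # color components, not counting alpha
    if colors == 1:
        # Grayscale: repeat the gray channel three times, as imread does by default.
        bgr = np.repeat(img[:, :, :1], 3, axis=2)
    elif colors == 3:
        # RGB(A): drop alpha if present and reverse the channel order.
        bgr = img[:, :, 2::-1]
    else:
        raise ValueError("convert CMYK pixmaps to RGB first")
    return np.ascontiguousarray(bgr)  # contiguous, writable array

# Stand-in with the same attributes as a pixmap (for illustration only):
# one red pixel followed by one green pixel.
fake = SimpleNamespace(samples=bytes([255, 0, 0, 0, 255, 0]),
                       height=1, width=2, n=3, alpha=0)
out = pix_to_bgr(fake)
print(out.tolist())  # [[[0, 0, 255], [0, 255, 0]]]
```

The reordering only moves bytes around, so no precision is lost. Dropping the alpha channel mirrors the default behavior of cv2.imread; keep it if you need transparency.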