I am trying to convert an image to a PDF file while controlling how the image is encoded inside the PDF (JPEG, or the Flate compression that PNG uses). I read in the docs that PIL supports this, but what I tried has no effect. Does anybody know how to do this, with or without PIL? I would prefer to use Python, but I am open to other approaches. See below what I tried; it has not worked so far.
from PIL import Image
import requests
from io import BytesIO
from pathlib import Path
# URL of the test image
url = "https://github.com/tomasmarcos/tomrep/raw/master/image2encode.PNG"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
# convert to binary through thresholding; this doesn't matter, use "RGB" if you wish
img = img.convert("1")
img.save("example1.pdf", compression = "Flate")
size_in_kilobytes_ex1 = Path("example1.pdf").stat().st_size/1024 # in kilobytes
img.save("example2.pdf", compression = "JPEG")
size_in_kilobytes_ex2 = Path("example2.pdf").stat().st_size/1024 # in kilobytes
print(size_in_kilobytes_ex1,size_in_kilobytes_ex2)
# both files have the same size, which means they are encoded the same way
I read that JPEG uses the discrete cosine transform, so its output should NOT have the same size as a PNG-style encoding, which uses a different algorithm (Flate). So something is being done incorrectly here.
Thanks in advance!!
I came up with a solution inspired by @KJ's comments, using img2pdf. The problem is that when you load the image into RAM, PIL decodes it automatically, so the original encoding is already gone by the time the PDF is written. The img2pdf library does preserve the encoding of the file it is given, and you can see this is true because the PDF size dropped a lot just by encoding with PNG instead of JPEG. If anyone has a better solution, that'd be nice.
from PIL import Image
import requests
from io import BytesIO
from pathlib import Path
import img2pdf
# URL of the test image
url = "https://github.com/tomasmarcos/tomrep/raw/master/image2encode.PNG"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
# convert to binary through thresholding; this doesn't matter, use "RGB" if you wish
img = img.convert("1")
## SOLUTION
img.save("example_png.png") # format inferred by PIL from the extension: PNG
img.save("example_jpeg.jpeg") # format inferred by PIL from the extension: JPEG
# for png
with open("examplepdf_png.pdf","wb") as f:
    f.write(img2pdf.convert("example_png.png"))
# for jpeg
with open("examplepdf_jpeg.pdf","wb") as f:
    f.write(img2pdf.convert("example_jpeg.jpeg"))
size_in_kilobytes_png = Path("examplepdf_png.pdf").stat().st_size/1024 # in kilobytes
size_in_kilobytes_jpeg = Path("examplepdf_jpeg.pdf").stat().st_size/1024 # in kilobytes
print(size_in_kilobytes_jpeg,size_in_kilobytes_png)
# the two files do NOT have the same size, which means they are encoded differently:
# same image, different encoding
Additional information: Empirically, when comparing file sizes (which compression works better?):
• JPEG is preferable to PNG for RGB images; PNG is preferable to JPEG for grayscale / binary images
• the higher the resolution of the image, the more you'll notice this difference.
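The binary-image half of the first point can be checked without any PDF involved, by encoding the same image both ways in memory and comparing byte counts. A rough sketch, assuming a synthetic black-and-white test image (the shapes and the helper name encoded_size are made up for illustration); to test the RGB half, pass one of your own photos to the same helper.

```python
from io import BytesIO

from PIL import Image, ImageDraw


def encoded_size(img, fmt):
    """Return the number of bytes `img` occupies when saved as `fmt`."""
    buf = BytesIO()
    img.save(buf, format=fmt)
    return len(buf.getvalue())


# Hypothetical binary test image: white background with a few black shapes
binary = Image.new("1", (800, 600), 1)
draw = ImageDraw.Draw(binary)
draw.rectangle([100, 100, 300, 250], fill=0)
draw.ellipse([400, 200, 700, 500], fill=0)
draw.line([0, 0, 799, 599], fill=0, width=3)

png_size = encoded_size(binary, "PNG")
jpeg_size = encoded_size(binary, "JPEG")
print("PNG bytes:", png_size, "JPEG bytes:", jpeg_size)
```

For an image like this, the PNG should come out much smaller than the JPEG, because Flate handles long runs of identical pixels very well while JPEG spends bits on every 8x8 block.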