
Convert all .jpg images in directory to .pdf | Debugging ImageMagick


I have found multiple posts about this issue, and everyone recommends ImageMagick, but it doesn't work as intended for me, and I can't find much documentation on their site about .jpg -> .pdf conversion.

Is there an alternative, preferably a CLI tool?

Or can I somehow debug why ImageMagick doesn't work for me? I don't get any errors; I just get corrupted files as a result.

My use case


My OS is Windows. I have 64 .jpg files named 0.jpg, 1.jpg, ..., 63.jpg, and I would like to merge all of them into one .pdf file.

I have tried these commands:

magick *.jpg out.pdf

convert *.jpg out.pdf

but in both cases I am unable to open out.pdf because it is corrupted. I have noticed that only 0.jpg converts to a valid .pdf; converting any of the other .jpg files on its own produces a corrupted .pdf.
For example, this gives me a correct .pdf:

magick 0.jpg 0.pdf

but this gives me a corrupted .pdf:

magick 2.jpg 2.pdf

I assume this is the reason why I can't merge all of the files into one uncorrupted .pdf file. My guess is that something is wrong with the rest of my .jpg files, but I have no idea how to debug this. Every other .jpg file looks exactly like the one I can convert, and all of them open without issues.

magick identify -verbose foobar.jpg results:


I can convert 0.jpg to .pdf correctly, but 2.jpg results in a corrupted .pdf. There are some apparent differences between the two outputs, but I am not sure what those properties mean in the context of .jpg -> .pdf conversion.

[Image: magick identify -verbose output]
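To narrow down which properties actually differ between a file that converts and one that does not, a small script can be easier to read than two full -verbose dumps. The following is only a sketch: it assumes magick is on the PATH, and the helper names (identify_cmd, parse_fields, diff_fields, identify) and the choice of fields are illustrative, not something taken from the ImageMagick docs.

```python
import subprocess

# Properties to compare. These are passed to ImageMagick as %[name] format
# escapes; which ones matter for the PDF problem is an assumption.
FIELDS = ("colorspace", "type", "jpeg:colorspace", "jpeg:sampling-factor")


def identify_cmd(path):
    """Build the `magick identify` command that prints FIELDS separated by '|'."""
    fmt = "|".join(f"%[{name}]" for name in FIELDS)
    return ["magick", "identify", "-format", fmt, str(path)]


def parse_fields(output):
    """Turn one line of '|'-separated identify output into a dict."""
    values = output.strip().split("|")
    return dict(zip(FIELDS, values))


def diff_fields(a, b):
    """Return only the fields whose values differ, as (a_value, b_value) pairs."""
    return {k: (a.get(k), b.get(k)) for k in FIELDS if a.get(k) != b.get(k)}


def identify(path):
    """Run identify on one file and return its parsed fields."""
    result = subprocess.run(
        identify_cmd(path), capture_output=True, text=True, check=True
    )
    return parse_fields(result.stdout)
```

Calling print(diff_fields(identify("0.jpg"), identify("2.jpg"))) would then list just the properties where the working and the failing file disagree.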


Solution

  • One thought is that someone has converted a grayscale image to color with 3 equal channels, so that IM reports its colorspace as RGB. However, the JPEG colorspace tag is 2, which says that no specific colorspace is set.

    Properties:
        date:create: 2021-04-01T17:29:06+00:00
        date:modify: 2021-04-01T05:18:58+00:00
        exif:ExifOffset: 46
        exif:ExifVersion: 48, 50, 50, 48
        exif:PixelXDimension: 960
        exif:PixelYDimension: 1508
        exif:Software: Google
        jpeg:colorspace: 2
        jpeg:sampling-factor: 2x2,1x1,1x1
    

    From the JPG docs

    ColorSpace

    0 = Bi-level 
    1 = YCbCr, ITU-R BT 709, video 
    2 = No color space specified 
    3 = YCbCr, ITU-R BT 601-1, RGB 
    4 = YCbCr, ITU-R BT 601-1, video 
    8 = Gray-scale 
    9 = PhotoYCC 
    10 = RGB 
    11 = CMY 
    12 = CMYK 
    13 = YCCK 
    14 = CIELab
    

    It is possible that this conflict, or the lack of a colorspace, confuses certain viewers once the file is embedded in a PDF vector shell.
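If the colorspace is indeed the culprit, one thing worth trying is to set it explicitly while merging. The sketch below only builds the command line; it is not confirmed to fix the corrupted output. It assumes the files have purely numeric names such as 0.jpg ... 63.jpg and that pages should appear in numeric order; page_order and merge_cmd are hypothetical helper names, and sRGB is just a starting guess (Gray may be more appropriate if the images really are grayscale).

```python
import subprocess
from pathlib import Path


def page_order(names):
    """Sort names like '10.jpg' by their numeric stem, so 2.jpg comes before 10.jpg."""
    return sorted(names, key=lambda n: int(Path(n).stem))


def merge_cmd(names, out="out.pdf", colorspace="sRGB"):
    """Build a magick command that forces one colorspace on every page before writing the PDF."""
    return ["magick", *page_order(names), "-colorspace", colorspace, out]


def merge(names, out="out.pdf", colorspace="sRGB"):
    """Run the merge; raises CalledProcessError if magick exits with an error."""
    subprocess.run(merge_cmd(names, out, colorspace), check=True)
```

For example, merge([p.name for p in Path(".").glob("*.jpg")]) run from the image directory would write out.pdf there. Trying a single failing file first, e.g. merge(["2.jpg"], out="2.pdf"), is a quicker way to see whether the explicit colorspace changes anything.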