How do I dump embedded ICC profile information in PDF? (command line or GUI tools)


Is there a command-line or GUI tool that dumps information about the ICC profile and color conversion settings chosen under the "Color management and PDF/X options for PDF" option of Illustrator's PDF export dialog?


"Color management and PDF/X options for PDF" option of Illustrator

[image] http://blogs.adobe.com/vikrant/files/2012/05/grayscale_export.png

[manual] http://help.adobe.com/en_US/illustrator/cs/using/WS714a382cdf7d304e7e07d0100196cbc5f-6547a.html#WS714a382cdf7d304e7e07d0100196cbc5f-6540a


Solution

  • Here is a command-line method for extracting ICC color profiles from a PDF. It uses the Python script pdf-parser.py, written by security researcher Didier Stevens, which you can download here.

    However, this is not a specialized tool for ICC extraction. (I do not know of such a tool.) It is a generic command-line tool for investigating PDF files.

    Therefore you need to go through several steps to achieve the extraction.

    Step 1: Determine the PDF object ID of the ICC profile

    You have to use -s to search for the string ICCBased. (PDF files without an embedded ICC profile will not have this keyword [with the exception of possibly using it in their text contents...].)

    pdf-parser.py -s ICCBased my.pdf
    

    My test PDF returned this:

    obj 18 0
     Type: 
     Referencing: 21 0 R
    

    It seems that an ICC profile is to be found in PDF object 21.
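
    If you would rather script this search, the idea can be sketched in Python. This is only a rough illustration, not how pdf-parser.py works internally: the function name find_icc_object_ids is made up, and the sketch assumes the /ICCBased array sits uncompressed in the file (it will miss entries stored inside compressed object streams, which PDF 1.5 and later allow).

```python
import re
from typing import List

# Matches a reference such as "/ICCBased 21 0 R" and captures the object number.
ICC_REF = re.compile(rb"/ICCBased\s+(\d+)\s+\d+\s+R")


def find_icc_object_ids(pdf_bytes: bytes) -> List[int]:
    """Return the sorted, de-duplicated object numbers that /ICCBased points to."""
    return sorted({int(m.group(1)) for m in ICC_REF.finditer(pdf_bytes)})


# Hypothetical fragment, modelled on what object 18 of a PDF might look like.
sample = b"18 0 obj\n[/ICCBased 21 0 R]\nendobj\n"
print(find_icc_object_ids(sample))  # prints [21]
```

    For a real file you would pass the result of open("my.pdf", "rb").read() instead of the sample bytes.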

    Step 2: Look at the PDF object found in step 1

    You have to use -o 21 to see what PDF object 21 is:

    pdf-parser.py -o 21 my.pdf
    

    My test PDF returns this:

    obj 21 0
     Type: 
     Referencing: 
     Contains stream
    
      <<
        /Alternate /DeviceRGB
        /Filter /FlateDecode
        /Length 2574
        /N 3
      >>
    

    Ok, this looks like we are getting close...

    Step 3: Dump the stream contained in the PDF object containing the profile

    In step 2 we learned two important things:

    • PDF object 21 contains a stream (the contents of which are not shown when using only the -o 21 parameter of pdf-parser.py).
    • The object's stream is compressed and has to be decompressed with the /FlateDecode filter in order to get to its content.

    Hence we have to run pdf-parser.py now with two additional arguments:

    • -d filename in order to dump the stream of PDF object 21 to a file.
    • -f in order to pass the stream through its filters (here: decompress it) when dumping it to a file.

    The complete command to run:

    pdf-parser.py -o 21 -f -d 21.stream my.pdf
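
    What -f and -d do for this object can be approximated with Python's zlib module. The following is a minimal sketch under several assumptions: the helper name extract_flate_stream is invented, the object is stored uncompressed in the file body, /FlateDecode is its only filter, and the byte sequence "endstream" does not happen to occur inside the compressed data. A robust tool would use the /Length entry instead of searching for "endstream".

```python
import re
import zlib


def extract_flate_stream(pdf_bytes: bytes, obj_num: int, gen: int = 0) -> bytes:
    """Inflate the stream of object `obj_num gen` (single /FlateDecode filter assumed)."""
    # Locate e.g. "21 0 obj"; the lookbehind keeps "121 0 obj" from matching.
    header = re.search(rb"(?<!\d)%d\s+%d\s+obj\b" % (obj_num, gen), pdf_bytes)
    if header is None:
        raise ValueError("object not found")
    # Stream data begins right after the "stream" keyword and its end-of-line marker.
    start = re.compile(rb"stream\r?\n").search(pdf_bytes, header.end())
    if start is None:
        raise ValueError("object has no stream")
    end = pdf_bytes.find(b"endstream", start.end())
    if end == -1:
        raise ValueError("unterminated stream")
    raw = pdf_bytes[start.end():end]
    # decompressobj() stops at the end of the zlib data, so the end-of-line
    # bytes that precede "endstream" are ignored rather than causing an error.
    return zlib.decompressobj().decompress(raw)
```

    Writing the returned bytes to 21.stream should give roughly the same file as the pdf-parser.py command above.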

    Step 4: Verify what was extracted

    We now have dumped the stream of PDF object 21 to a file named 21.stream. Let's see what it contains:

    file 21.stream
     21.stream: Microsoft ICM Color Profile
    

    Looks like we succeeded. :-)
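
    If the file utility is not available, a similar sanity check can be sketched in Python. An ICC profile starts with a 128-byte header whose first four bytes hold the profile size (big-endian) and whose bytes 36 to 39 hold the signature 'acsp'. The function name looks_like_icc is my own, and the check is deliberately loose: it does not validate anything beyond these two fields.

```python
import struct


def looks_like_icc(data: bytes) -> bool:
    """Cheap plausibility check on the fixed 128-byte ICC profile header."""
    if len(data) < 128:
        return False
    # Bytes 0..3: declared profile size as a big-endian unsigned 32-bit integer.
    (declared_size,) = struct.unpack(">I", data[0:4])
    # Bytes 36..39: the profile file signature, always b"acsp".
    return data[36:40] == b"acsp" and 128 <= declared_size <= len(data)
```

    You could call it as looks_like_icc(open("21.stream", "rb").read()).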

    Step 5: Open the color profile

    Let's see if my Mac OS X system accepts this profile:

    mv 21.stream 21.icm
    open 21.icm
    

    OS X opens the file with its ColorSync Utility and displays a window. Clicking on the list entries opens different information panes at the bottom of the window:

    [image] Mac OS X ColorSync Utility showing various details about the extracted ICM profile.

    Step 6: Use Argyll's iccdump to dump the contents of the ICC profile as text

    Graeme Gill's ArgyllCMS, the open source color management software available for Linux, Mac OS X and Windows, ships with a whole suite of command-line tools. One of these is iccdump. We can use it to look at the properties of the newly extracted 21.icm file:

    iccdump 21.icm
    
    icc:
    Header:
      size         = 3144 bytes
      CMM          = 'Lino'
      Version      = 2.1.0
      Device Class = Display
      Color Space  = RGB
      Conn. Space  = XYZ
      Date, Time   = 9 Feb 1998, 6:49:00
      Platform     = Microsoft
      Flags        = Not Embedded Profile, Use anywhere
      Dev. Mnfctr. = 'IEC '
      Dev. Model   = 'sRGB'
      Dev. Attrbts = Reflective, Glossy
      Rndrng Intnt = Perceptual
      Illuminant   = 0.964203, 1.000000, 0.824905    [Lab 100.000000, 0.000498, -0.000436]
      Creator      = 'HP  '
    
    tag 0:
      sig      'cprt'
      type     'text'
      offset   336
      size     51
    Text:
      No. chars = 43
        0x0000: Copyright (c) 1998 Hewlett-Packard Company
    
    tag 1:
      sig      'desc'
      type     'desc'
      offset   388
      size     108
    TextDescription:
      ASCII data, length 18 chars:
        0x0000: sRGB IEC61966-2.1
      No Unicode data
      ScriptCode Data, Code 0x0, length 18 chars
        0x0000: 73 52 47 42 20 49 45 43 36 31 39 36 36 2d 32 2e 31 00 
    
    tag 2:
      sig      'wtpt'
      type     'XYZ '
      offset   496
      size     20
    XYZArray:
      No. elements = 1
    
    tag 3:
      sig      'bkpt'
      type     'XYZ '
      offset   516
      size     20
    XYZArray:
      No. elements = 1
    
    tag 4:
      sig      'rXYZ'
      type     'XYZ '
      offset   536
      size     20
    XYZArray:
      No. elements = 1
    
    tag 5:
      sig      'gXYZ'
      type     'XYZ '
      offset   556
      size     20
    XYZArray:
      No. elements = 1
    
    tag 6:
      sig      'bXYZ'
      type     'XYZ '
      offset   576
      size     20
    XYZArray:
      No. elements = 1
    
    tag 7:
      sig      'dmnd'
      type     'desc'
      offset   596
      size     112
    TextDescription:
      ASCII data, length 22 chars:
        0x0000: IEC http://www.iec.ch
      No Unicode data
      ScriptCode Data, Code 0x0, length 22 chars
        0x0000: 49 45 43 20 68 74 74 70 3a 2f 2f 77 77 77 2e 69 65 63 2e 63 68 00 
    
    tag 8:
      sig      'dmdd'
      type     'desc'
      offset   708
      size     136
    TextDescription:
      ASCII data, length 46 chars:
        0x0000: IEC 61966-2.1 Default RGB colour space - sRGB
      No Unicode data
      ScriptCode Data, Code 0x0, length 46 chars
        0x0000: 49 45 43 20 36 31 39 36 36 2d 32 2e 31 20 44 65 66 61 75 6c 74 20 
    ...
    
    tag 9:
      sig      'vued'
      type     'desc'
      offset   844
      size     134
    TextDescription:
      ASCII data, length 44 chars:
        0x0000: Reference Viewing Condition in IEC61966-2.1
      No Unicode data
      ScriptCode Data, Code 0x0, length 44 chars
        0x0000: 52 65 66 65 72 65 6e 63 65 20 56 69 65 77 69 6e 67 20 43 6f 6e 64 
    ...
    
    tag 10:
      sig      'view'
      type     'view'
      offset   980
      size     36
    Viewing Conditions:
      XYZ value of illuminant in cd/m^2 = 19.644501, 20.371796, 16.808899
      XYZ value of surround in cd/m^2   = 3.928894, 4.074387, 3.361786
      Illuminant type = D50
    
    tag 11:
      sig      'lumi'
      type     'XYZ '
      offset   1016
      size     20
    XYZArray:
      No. elements = 1
    
    tag 12:
      sig      'meas'
      type     'meas'
      offset   1036
      size     36
    Measurement:
      Standard Observer = 1931 Two Degrees
      XYZ for Measurement Backing = 0.000000, 0.000000, 0.000000    [Lab 0.000000, 0.000000, 0.000000]
      Measurement Geometry = Unknown
      Measurement Flare =   1.0%
      Standard Illuminant = D65
    
    tag 13:
      sig      'tech'
      type     'sig '
      offset   1072
      size     12
    Signature
      Technology = Cathode Ray Tube Display
    
    tag 14:
      sig      'rTRC'
      type     'curv'
      offset   1084
      size     2060
    Curve:
      No. elements = 1024
    
    tag 15:
      sig      'gTRC'
      type     'curv'
      offset   1084
      size     2060
    Curve:
      No. elements = 1024
    
    tag 16:
      sig      'bTRC'
      type     'curv'
      offset   1084
      size     2060
    Curve:
      No. elements = 1024
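
    To read a few of these fields without ArgyllCMS, the header and tag table can be decoded with Python's struct module. This is a small sketch based on the ICC profile layout (128-byte header, then a 4-byte tag count, then 12-byte tag entries), not a reimplementation of iccdump. The function name summarize_icc and the dictionary keys are made up, and the raw four-character signatures are returned as-is rather than translated into names such as "Display".

```python
import struct


def summarize_icc(data: bytes) -> dict:
    """Decode a few header fields and the tag table of an ICC profile."""
    size, cmm = struct.unpack(">I4s", data[0:8])
    dev_class, color_space, pcs = struct.unpack(">4s4s4s", data[12:24])
    # The tag table follows the 128-byte header: a count, then 12-byte entries.
    (tag_count,) = struct.unpack(">I", data[128:132])
    tags = []
    for i in range(tag_count):
        entry = 132 + 12 * i
        sig, offset, length = struct.unpack(">4sII", data[entry:entry + 12])
        tags.append((sig.decode("ascii", "replace"), offset, length))
    return {
        "size": size,
        "cmm": cmm.decode("ascii", "replace"),
        # Byte 8 is the major version; byte 9 packs minor and bugfix as nibbles.
        "version": "%d.%d.%d" % (data[8], data[9] >> 4, data[9] & 0x0F),
        "device_class": dev_class.decode("ascii", "replace"),
        "color_space": color_space.decode("ascii", "replace"),
        "pcs": pcs.decode("ascii", "replace"),
        "tags": tags,
    }
```

    The layout agrees with the dump above: with 17 tags, the tag table ends at byte 132 + 17 * 12 = 336, which is exactly the offset iccdump reports for tag 0.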
    

    P.S.:
    ArgyllCMS contains a command line tool, extracticc, which can extract an embedded ICC profile from a TIFF file. It does not have a tool to extract a profile from a PDF file.