How to get all tags from a tiff file with libtiff?


I have a TIFF file and would like to get a list of all tags used in it. If I understand the TIFFGetField() function correctly, it only returns the values of the tags I specify. But how do I know which tags the file uses? Is there an easy way to get all of them with libtiff?


Solution

  • In my experience it is a very manual process. I used the TIFF tag reference at https://www.awaresystems.be/imaging/tiff/tifftags.html to create a custom structure:

    typedef struct
    {
        TIFF_TAGS_BASELINE Baseline;
        TIFF_TAGS_EXTENSION Extension;
        TIFF_TAGS_PRIVATE Private;
    } TIFF_TAGS;
    

    Each substructure is custom-defined. For example:

    typedef struct
    {
        TIFF_UINT32_T NewSubfileType; // TIFFTAG_SUBFILETYPE
        TIFF_UINT16_T SubfileType; // TIFFTAG_OSUBFILETYPE
        TIFF_UINT32_T ImageWidth; // TIFFTAG_IMAGEWIDTH
        TIFF_UINT32_T ImageLength; // TIFFTAG_IMAGELENGTH
        TIFF_UINT16_T BitsPerSample; // TIFFTAG_BITSPERSAMPLE
        ...
        char *Copyright; // TIFFTAG_COPYRIGHT
    } TIFF_TAGS_BASELINE;
    

    Then I have custom readers:

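    TIFF_TAGS *read_tiff_tags(const char *filename)
    {
        TIFF_TAGS *tags = NULL;
        TIFF *tif = TIFFOpen(filename, "r");
        if (tif)
        {
            // calloc zero-fills, so fields for absent tags stay 0/NULL.
            tags = calloc(1, sizeof(TIFF_TAGS));
            if (tags)
            {
                read_tiff_tags_baseline(tif, tags);
                read_tiff_tags_extension(tif, tags);
                read_tiff_tags_private(tif, tags);
            }
            TIFFClose(tif);
        }

        // NULL if the file could not be opened or the allocation failed.
        return tags;
    }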
    

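    Each field then has to be read manually. TIFFGetField() returns 1 if the tag is present and 0 if it is not, so check the return status before using the result, especially for string and array fields that you copy out. For simple fields, it's something like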

    // The number of columns in the image, i.e., the number of pixels per row.
    TIFFGetField(tif, TIFFTAG_IMAGEWIDTH, &tags->Baseline.ImageWidth);
    

    but for string and array fields you'll need something like this:

        // The scanner model name or number.
        // libtiff hands back a pointer to its own storage, so copy the string.
        status = TIFFGetField(tif, TIFFTAG_MODEL, &infobuf);
        if (status)
        {
            len = strlen(infobuf);
            tags->Baseline.Model = malloc(len + 1);
            if (tags->Baseline.Model)
            {
                // len + 1 bytes: the copy includes the terminating NUL.
                memcpy(tags->Baseline.Model, infobuf, len + 1);
            }
        }
        else
        {
            tags->Baseline.Model = NULL;
        }
        // For each strip, the byte offset of that strip.
        // The element type of arraybuf must match your libtiff version
        // (64-bit offsets in libtiff 4.x, 32-bit in 3.x); storing them in a
        // 32-bit field truncates offsets that do not fit.
        status = TIFFGetField(tif, TIFFTAG_STRIPOFFSETS, &arraybuf);
        if (status)
        {
            tags->Baseline.NumberOfStrips = TIFFNumberOfStrips(tif);
            tags->Baseline.StripOffsets = calloc(tags->Baseline.NumberOfStrips, sizeof(TIFF_UINT32_T));
            if (tags->Baseline.StripOffsets)
            {
                for (strip = 0; strip < tags->Baseline.NumberOfStrips; strip++)
                {
                    tags->Baseline.StripOffsets[strip] = arraybuf[strip];
                }
            }
        }
        else
        {
            tags->Baseline.StripOffsets = NULL;
        }
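
Because the same copy-out steps repeat for every string and array tag, they can be factored into small helpers. The following is only a sketch: `copy_tag_string` and `copy_tag_array` are hypothetical names, not part of libtiff, and the caller is assumed to own and free the returned memory.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of a NUL-terminated string,
 * or NULL if src is NULL or the allocation fails. */
char *copy_tag_string(const char *src)
{
    if (src == NULL)
        return NULL;

    size_t len = strlen(src);
    char *dst = (char *)malloc(len + 1);
    if (dst != NULL)
        memcpy(dst, src, len + 1); /* len + 1 includes the terminator */
    return dst;
}

/* Hypothetical helper: returns a heap copy of `count` elements of
 * `elem_size` bytes each, or NULL on empty input, size overflow,
 * or allocation failure. */
void *copy_tag_array(const void *src, size_t count, size_t elem_size)
{
    if (src == NULL || count == 0 || elem_size == 0)
        return NULL;
    if (count > SIZE_MAX / elem_size) /* count * elem_size would overflow */
        return NULL;

    void *dst = malloc(count * elem_size);
    if (dst != NULL)
        memcpy(dst, src, count * elem_size);
    return dst;
}
```

With helpers like these, each reader would shrink to a TIFFGetField() call followed by one copy, for example `tags->Baseline.Model = status ? copy_tag_string(infobuf) : NULL;`.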
    

    My suggestion is to only read the fields you want/need and ignore everything else. Hope that helps.
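
If the goal is just to see which tag IDs a file contains, one option that does not go through TIFFGetField() is to walk the first image file directory (IFD) yourself. The sketch below assumes a classic TIFF (magic number 42, not BigTIFF) that is already loaded into memory, and it looks only at the first IFD. `list_first_ifd_tags` and the two read helpers are hypothetical names, not libtiff functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit value in the file's byte order. */
static uint16_t rd16(const uint8_t *p, int big_endian)
{
    return big_endian ? (uint16_t)((p[0] << 8) | p[1])
                      : (uint16_t)((p[1] << 8) | p[0]);
}

/* Read a 32-bit value in the file's byte order. */
static uint32_t rd32(const uint8_t *p, int big_endian)
{
    return big_endian
        ? ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
          ((uint32_t)p[2] << 8)  |  (uint32_t)p[3]
        : ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
          ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}

/* Hypothetical helper: stores up to max_tags tag IDs from the first IFD
 * into tags[] and returns the number of entries in that IFD, or -1 if
 * the buffer is not a well-formed classic TIFF. */
int list_first_ifd_tags(const uint8_t *buf, size_t len,
                        uint16_t *tags, size_t max_tags)
{
    int big_endian;

    if (buf == NULL || len < 8)
        return -1;
    if (buf[0] == 'I' && buf[1] == 'I')
        big_endian = 0;                 /* little-endian file */
    else if (buf[0] == 'M' && buf[1] == 'M')
        big_endian = 1;                 /* big-endian file */
    else
        return -1;
    if (rd16(buf + 2, big_endian) != 42)
        return -1;                      /* e.g. BigTIFF, not handled here */

    uint32_t ifd = rd32(buf + 4, big_endian);
    if (ifd > len || len - ifd < 2)
        return -1;                      /* IFD offset outside the buffer */

    uint16_t count = rd16(buf + ifd, big_endian);
    if ((len - ifd - 2) / 12 < count)
        return -1;                      /* entries run past the buffer */

    /* Each entry is 12 bytes; the tag ID is its first 2 bytes. */
    for (size_t i = 0; i < count && i < max_tags; i++)
        tags[i] = rd16(buf + ifd + 2 + 12 * i, big_endian);

    return (int)count;
}
```

The returned IDs can then be matched against the TIFFTAG_* constants to decide which TIFFGetField() calls are worth making. For a quick human-readable dump, libtiff's TIFFPrintDirectory() may already be enough.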