Tags: python, image-processing, tiff, vips

libvips / pyvips access small sections of a multi-channel tiff (OME-Tiff)


I'm wondering if there's a fast way to read a specific pixel range from a given channel of an OME-TIFF file using pyvips / libvips. The crop operation doesn't let me select a channel.

My OME-Tiff is large (10 GB+) so I don't want to load the entire image into memory.

Open to any suggestions and/or other workflows.


Solution

  • pyvips supports multipage documents as "toilet-roll" images (sorry). You set n=-1 to load all the pages, and they appear as a very tall, thin image, with the pages stacked vertically. The metadata item page-height gives the height in pixels of each sheet.

    Docs here:

    https://libvips.github.io/libvips/API/current/VipsForeignSave.html#vips-tiffload

    For example:

    $ vipsheader -a multi-channel-z-series.ome.tif 
    multi-channel-z-series.ome.tif: 439x167 char, 1 band, b-w, tiffload
    width: 439
    height: 167
    bands: 1
    format: char
    coding: none
    interpretation: b-w
    xoffset: 0
    yoffset: 0
    xres: 0
    yres: 0
    filename: multi-channel-z-series.ome.tif
    vips-loader: tiffload
    n-pages: 15
    image-description: <?xml version="1.0" encoding="UTF-8"?><!-- Warning: this comment is an OME-XML metadata block, which contains crucial dimensional parameters and other important metadata. Please edit cautiously (if at all), and back up the original data before doing so...
    resolution-unit: cm
    orientation: 1
    

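    You can see this is a 15-page OME image, and each page is 439 by 167 pixels. pyvips loads only page 0 by default. OME-TIFF normally stores each channel / z / time plane as its own page, so picking a channel means picking a page. You can fetch the XML in image-description to see the full OME channel metadata: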

    $ vipsheader -f image-description multi-channel-z-series.ome.tif
    <?xml version="1.0" encoding="UTF-8"?>
    <!--- ... etc.
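
    To turn a channel number into a page number you need the plane ordering from that XML. Here is a rough sketch using only the standard library. The function name ome_page_index is made up for illustration, and it assumes one plane per page, stored in the order given by the DimensionOrder attribute of the first Pixels element. Files whose TiffData elements remap planes, or that store interleaved RGB samples, will need more care.

```python
import xml.etree.ElementTree as ET


def ome_page_index(ome_xml, c=0, z=0, t=0):
    """Map a (channel, z, time) triple to a TIFF page number.

    Assumption: one plane per page, laid out as DimensionOrder says.
    """
    root = ET.fromstring(ome_xml)
    # "{*}" matches any XML namespace (needs Python 3.8+), so this does not
    # depend on the OME schema version used by the file.
    pixels = root.find(".//{*}Pixels")
    if pixels is None:
        raise ValueError("no Pixels element found in the OME-XML")

    sizes = {axis: int(pixels.get("Size" + axis)) for axis in "CZT"}
    wanted = {"C": c, "Z": z, "T": t}

    index, stride = 0, 1
    # DimensionOrder looks like "XYZCT": after X and Y, the first letter
    # is the axis that varies fastest from page to page.
    for axis in pixels.get("DimensionOrder")[2:]:
        if not 0 <= wanted[axis] < sizes[axis]:
            raise IndexError(f"{axis}={wanted[axis]} is out of range")
        index += wanted[axis] * stride
        stride *= sizes[axis]
    return index


# Usage with pyvips (not run here):
#   xml = image.get("image-description")
#   page = ome_page_index(xml, c=1, z=0)
```

    For a file with 5 z-planes, 3 channels and order XYZCT, channel 1 at z=0 would land on page 5 under these assumptions.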
    

    In Python you can do:

    $ python3
    Python 3.8.5 (default, Jul 28 2020, 12:59:40) 
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pyvips
    >>> x = pyvips.Image.new_from_file("multi-channel-z-series.ome.tif", n=-1)
    >>> x.width
    439
    >>> x.height
    2505
    >>> x.get("page-height")
    167
    >>> x.height / x.get("page-height")
    15.0
    

    So you can use crop to fetch a rect from a channel: page k starts at row k * page-height of the tall image, so add that offset to the top of the rect and crop as usual.
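
    As a minimal sketch of that offset arithmetic (crop_from_page is a hypothetical helper, not part of pyvips), assuming the image was opened with n=-1 so that page-height is set:

```python
def crop_from_page(image, page, left, top, width, height):
    """Crop a rectangle from one page of a multipage image loaded with n=-1.

    `left` and `top` are relative to the top-left corner of the page.
    """
    page_height = image.get("page-height")
    n_pages = image.height // page_height
    if not 0 <= page < n_pages:
        raise IndexError(f"page {page} out of range (0..{n_pages - 1})")
    if top < 0 or top + height > page_height:
        raise ValueError("rectangle would cross a page boundary")
    # Page k begins at row k * page_height of the tall "toilet-roll" image.
    return image.crop(left, page * page_height + top, width, height)


# Usage with pyvips (not run here):
#   import pyvips
#   image = pyvips.Image.new_from_file("multi-channel-z-series.ome.tif", n=-1)
#   patch = crop_from_page(image, 2, 10, 20, 64, 32)
```

    The boundary check is there because a rect that runs past the bottom of a page would silently pick up rows from the next one.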

    Are you planning to generate patches for ML training? If you are, fetch (a method on pyvips.Region) can be much faster than crop for small patches. This issue has sample code and some benchmarks: in that example, crop takes 41s to make 12,000 32x32 patches, but fetch takes just 0.5s.
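
    This is not the code from that issue, just a small sketch of the same idea under a few assumptions: the image was opened with n=-1, you want non-overlapping square patches from one page, and the helper names patch_origins and fetch_page_patches are invented here. It uses pyvips.Region.new and Region.fetch, which hands back a buffer of raw pixel data for the requested rect.

```python
def patch_origins(page_width, page_height, size):
    """Yield (left, top) for every full size x size tile inside one page."""
    for top in range(0, page_height - size + 1, size):
        for left in range(0, page_width - size + 1, size):
            yield left, top


def fetch_page_patches(image, page, size):
    """Yield raw pixel buffers for all full tiles of one page.

    `image` is expected to be a pyvips.Image loaded with n=-1.
    """
    import pyvips  # imported lazily so patch_origins works without pyvips

    page_height = image.get("page-height")
    offset = page * page_height  # first row of this page in the tall image
    region = pyvips.Region.new(image)
    for left, top in patch_origins(image.width, page_height, size):
        # One fetch per patch; no new image object is created for each tile.
        yield region.fetch(left, offset + top, size, size)


# Usage with pyvips (not run here):
#   import pyvips
#   image = pyvips.Image.new_from_file("multi-channel-z-series.ome.tif", n=-1)
#   for data in fetch_page_patches(image, 1, 32):
#       ...  # e.g. wrap `data` in a numpy array of shape (32, 32, image.bands)
```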