Search code examples
pythonimagesnakemake

Snakemake subtract a mask from two channels


I have a project of several microscopy recordings that I need to process. The recordings have two channels (red and green), so the structure is like this:

rec1/Ch0/video.tif
rec1/Ch1/video.tif
rec2/Ch0/video.tif
rec2/Ch1/video.tif
etc..

I need to do several operations, most of them equally to both channels. For example, cropping, blurring, etc. But then I need to create a mask on the red channel only (Ch0) and apply this mask on both channels to measure the intensity inside the mask. How can I do this?

The result should look like this:

rec1/Ch0/video.tif
rec1/Ch0/video_cropped_blurred.tif
rec1/Ch0/video_cropped_blurred_mask.tif
rec1/Ch0/video_cropped_blurred_mask_measurements.csv

rec1/Ch1/video.tif
rec1/Ch1/video_cropped_blurred.tif
rec1/Ch1/video_cropped_blurred_mask_measurements.csv

rec2/Ch0/video.tif
rec2/Ch0/video_cropped_blurred.tif
rec2/Ch0/video_cropped_blurred_mask.tif
rec2/Ch0/video_cropped_blurred_mask_measurements.csv

rec2/Ch1/video.tif
rec2/Ch1/video_cropped_blurred.tif
rec2/Ch1/video_cropped_blurred_mask_measurements.csv
etc..

So far I have:

import glob

RED_CHANNEL = glob.glob("rec*/Ch0/")
GREEN_CHANNEL = glob.glob("rec*/Ch1/")

rule targets:
    input:
        red_blur = expand("{red_datasets}video_blurred.tif", red_datasets=RED_CHANNEL),
        green_blur = expand("{green_datasets}video_blurred.tif", green_datasets=GREEN_CHANNEL),
        red_mask  = expand("{red_datasets}video_blurred_mask.tif", red_datasets=RED_CHANNEL),

rule red_blur:
    input:
        raw_img ="{red_datasets}video.tif"
    output:
        blurred_img = "{red_datasets}video_blurred.tif"
    shell:
        "python blur.py -i {input.raw_img} -o {output.blurred_img}"

rule green_blur:
    input:
        raw_img ="{green_datasets}video.tif"
    output:
        blurred_img = "{green_datasets}video_blurred.tif"
    shell:
        "python blur.py -i {input.raw_img} -o {output.blurred_img}"

But I get the following error:

Rules green_blur and red_blur are ambiguous for the file ../Ch0/video.tif
Consider starting rule output with a unique prefix, constrain your wildcards, or use the ruleorder directive.
Wildcards:
        green_blur: green_datasets=../Ch0/
        red_subtract_background: red_datasets=../Ch0/

Why does the green dataset wildcard also get the Ch0? I could have then only one blur rule, but how would I then create a rule mask, that masks Ch1 with the Ch0 mask? How can I access the file rec*/Ch0/video_cropped_blurred_mask.tif for rules that apply to Ch1?

In other words, how can I make this rule?

rule mask:
    input: 
        img = "{green_datasets}_cropped_blurred.tif",
        mask = "{red_datasets}_cropped_blurred_mask.tif"
    output:
        measurements = "{green_datasets}_mask_measurements.csv"
    shell:
       "python measure.py -i {input.img} -mask {input.mask} -o {output.measurements}"

Thank you very much!


Solution

  • Echoing @euronion, the red and green wildcards are just that, wildcards. Their names don't matter and are effectively replaced with a .+ regex. But that also imparts some power that you aren't leveraging, namely using a rule generically using wildcards. Here's a rewrite of your workflow:

    red_channel, green_channel = 0, 1  # ideally set in config
    
    # get recordings for ch0, will validate ch1 exists when requesting all outputs
    recordings = glob_wildcards('rec{recording}/Ch0/video.tif').recording
    
    rule targets:  # need only request final output
        input:
            expand('rec{recording}/Ch{channel}/video_mask_measurements.csv',
                   recording=recordings,
                   channels=[red_channel, green_channel],
                  )
    
    rule blur:  # this can be generic for any channel
        input:
            raw_img ="{base}video.tif"  # wildcard can be anything
        output:
            blurred_img = "{base}video_blurred.tif"
        shell:
            "python blur.py -i {input.raw_img} -o {output.blurred_img}"
    
    rule create_mask:
        input: 
            img = "rec{recording}/Ch{channel}/video_blurred.tif",
        output:
            mask = "rec{recording}/Ch{channel}/video_mask.tif",
        shell:
           "python make_mask.py -i {input.img} -o {output}"
    
    rule use_mask:  # this is also generic
        input: 
            img = "rec{recording}/Ch{channel}/video_blurred.tif",
            mask = expand("rec{recording}/Ch{channel}/video_mask.tif",
                         channel=red_channel,  # but the mask can only be red
                         allow_missing=True),
        output:
            measurements = "rec{recording}/Ch{channel}/video__mask_measurements.csv"
        shell:
           "python measure.py -i {input.img} -mask {input.mask} -o {output.measurements}"
    
    

    blur, create_mask, and use_mask are generic to either channel. However, use_mask only requests a mask for the red_channel, so that is the only one that is run.

    Based on your filenames I missed a cropping step, but this should address the channel issues.