How can I define an output that can be either one thing or another in Snakemake?

In a pipeline that I use for different projects, I have a rule that takes a file following the pattern tei/xxx_xx_xxxxx_xxxxx.xml as input. Depending on the project, there are two possible outputs: either a single file called xhtml/xxx_xx_xxxxx_xxxxx.html, or many files following the pattern xhtml/xxx_xx_xxxxx_xxxxx_sec_n.html (where n is a counter for the different files).

The problem is that it cannot be predicted in advance whether a project is case 1 or case 2. That is decided by the script that runs as the rule's action. So I know neither how to define the input of the default rule that requests the file(s), nor how to define the output of the rule that creates them.

I think this is probably a case for using a checkpoint, but from the examples I found I could not see how to apply it.

This is a simplified/reduced version of the scenario:

rule all:
    input: # How to define the input when it is not clear if it is case 1 file or case 2 files

rule xhtml_manuscript:
    input: 
        tei_manuscript = 'tei/xxx_xx_xxxxx_xxxxx.xml'
    output: 
        xhtml_manuscript = # How to define the output when it is not clear if it is case 1 file or case 2 files
    run: 
        shell(f'java -jar {SAXON} -o:xxx_xx_xxxxx_xxxxx.html {{input}} {TRANSFORMDIR}/other/opt_split_html_sections.xsl')

Possible output:

xxx_xx_xxxxx_xxxxx.html

xxx_xx_xxxxx_xxxxx_sec_1.html
xxx_xx_xxxxx_xxxxx_sec_2.html
xxx_xx_xxxxx_xxxxx_sec_3.html
xxx_xx_xxxxx_xxxxx_sec_4.html
xxx_xx_xxxxx_xxxxx_sec_5.html
...

Solution

This is just Sultan's answer made more explicit. The OP asks in a comment:

the rule still creates the html file(s) but I do not mention them in the output explicitly, in favour of the tmp file

Yes, that's the idea. In fact, I would call the tmp file a "flag" file, and I wouldn't mark it as temporary. E.g.:

rule all:
    input:
        'tei/xxx_xx_xxxxx_xxxxx.done',

rule xhtml_manuscript:
    input: 
        tei_manuscript = 'tei/xxx_xx_xxxxx_xxxxx.xml'
    output: 
        # Note the touch function
        xhtml_manuscript = touch('tei/xxx_xx_xxxxx_xxxxx.done'),
    run: 
        shell(f'java -jar {SAXON} -o:xxx_xx_xxxxx_xxxxx.html {{input}} {TRANSFORMDIR}/other/opt_split_html_sections.xsl')

it [the flag file] would probably make the xhtml_manuscript succeed

Not really. Snakemake touches the flag file tei/xxx_xx_xxxxx_xxxxx.done only if the run or shell directive succeeds, so if the flag file is present you can be sure the underlying rule exited with a 0 exit code. Besides, you don't have to use the touch function; you could instead check explicitly that some files have been created. You could do:

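shell:
    """
    # Remove stale results so that the check below only sees fresh files
    rm -rf <expected output html files>

    java -jar <create html file(s)>

    # Create the flag file only if one of the expected outputs exists
    if [ -e <this html file> ] || [ -e <that html file> ]; then
        touch {output.xhtml_manuscript}
    else
        exit 1
    fi
    """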

Is that not a bit dirty and intransparent

I don't know... I got used to this way of handling such cases and it looks ok to me. Ultimately though, I would say the "dirt" may be more with the structure of the pipeline or the program causing the ambiguous output. I think snakemake is doing the right thing in making such cases somewhat clunky.