I am using the snakemake validate function, which seems to be heavily based on jsonschema. It works fine for simple examples, but I am unsure how to proceed for more complex parameter settings.
Let's say I implemented the option for multiple peak callers (e.g. macs2 and genrich). Currently my config.yaml looks like this:
peak_caller:
  - macs2:
      --shift -100 --extsize 200
  - genrich:
      -y -j
If no peak caller is specified, I would like it to default to macs2 with these parameters. If anything other than one or both of these two peak callers is specified, I would like validation to fail.
I tried different approaches with enums and arrays, but I could never get it to work properly:
$schema: "http://json-schema.org/draft-06/schema#"
description: snakemake-workflows peak calling configuration
properties:
  # peak caller algorithms
  peak_caller:
    description: which peak caller(s) to use. Currently macs2 (default) and genrich are supported.
    type: array
    default: [macs2]
Preferably I would stay in YAML format, but I am open to configs written in JSON.
properties:
  peak_caller:
    type: array
    items:
      anyOf:
        - type: object
          properties:
            macs2: {type: string}
          required: [macs2]
          additionalProperties: false
        - type: object
          properties:
            genrich: {type: string}
          required: [genrich]
          additionalProperties: false
    maxItems: 2
    uniqueItems: true
    default:
      - macs2: --shift -100 --extsize 200
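To try the array schema outside of a workflow, it can be checked directly with the Python jsonschema package. This is only a minimal sketch: it assumes jsonschema is installed, the schema is inlined as a Python dict instead of being loaded from the YAML file, and the helper is_valid and the caller name some_other_caller are made up for illustration.

```python
import jsonschema

# The array-based schema, written as a Python dict (the "default" is left out
# because plain jsonschema validation only checks, it does not fill in values).
schema = {
    "properties": {
        "peak_caller": {
            "type": "array",
            "items": {
                "anyOf": [
                    {
                        "type": "object",
                        "properties": {"macs2": {"type": "string"}},
                        "required": ["macs2"],
                        "additionalProperties": False,
                    },
                    {
                        "type": "object",
                        "properties": {"genrich": {"type": "string"}},
                        "required": ["genrich"],
                        "additionalProperties": False,
                    },
                ]
            },
            "maxItems": 2,
            "uniqueItems": True,
        }
    }
}


def is_valid(config):
    """Hypothetical helper: True if config passes the schema, False otherwise."""
    try:
        jsonschema.validate(instance=config, schema=schema)
        return True
    except jsonschema.ValidationError:
        return False


# Both supported callers, as in the config.yaml from the question.
both = {"peak_caller": [{"macs2": "--shift -100 --extsize 200"}, {"genrich": "-y -j"}]}
# An unsupported caller name (hypothetical) should be rejected.
unknown = {"peak_caller": [{"some_other_caller": ""}]}

print(is_valid(both))
print(is_valid(unknown))
```

The first config matches one anyOf branch per item and passes. The second matches neither branch, because additionalProperties: false rejects the unknown key.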
Note, however, that this schema does not forbid giving either macs2 or genrich twice with different parameters. As far as I know, it is not possible to forbid that with the structure you're currently using. However, if the order of the items is not important, you could simply drop the array and use an object like this:
peak_caller:
  macs2:
    --shift -100 --extsize 200
  genrich:
    -y -j
Corresponding schema:
properties:
  peak_caller:
    type: object
    properties:
      macs2: {type: string}
      genrich: {type: string}
    minProperties: 1  # if you want to have at least one
    additionalProperties: false
    default:
      macs2: --shift -100 --extsize 200
By default, JSON Schema does not require any of the listed properties to be present, so this schema accepts a config in which only one of the two callers is defined.
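If the array layout has to stay (for example because the order matters), the duplicate-caller case could be caught with a small extra check after validation. A minimal sketch using only the standard library, assuming the config has already been loaded into a list of single-key dicts; the function name find_duplicate_callers is hypothetical and not part of snakemake or jsonschema.

```python
from collections import Counter


def find_duplicate_callers(peak_callers):
    """Return the sorted names of peak callers that appear more than once.

    peak_callers is the list form of the config, e.g.
    [{"macs2": "..."}, {"genrich": "..."}].
    """
    # Each list item is a dict; iterating over it yields its key(s).
    counts = Counter(name for item in peak_callers for name in item)
    return sorted(name for name, n in counts.items() if n > 1)


ok = [{"macs2": "--shift -100 --extsize 200"}, {"genrich": "-y -j"}]
# Same caller twice with different parameters: passes the array schema,
# but is flagged here.
duplicated = [{"macs2": "--shift -100 --extsize 200"}, {"macs2": "--shift -50"}]

print(find_duplicate_callers(ok))
print(find_duplicate_callers(duplicated))
```

In a Snakefile, a non-empty result could be turned into an error right after the validate call, so the workflow fails before any rule runs.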