Accepting more than one value for more than one argument causes unexpected result


I have the following docopt usage string (the module's __doc__):

"""gene_sense_distribution_to_csv
Usage:
    gene_sense_distribution_to_csv.py <gene> <csv_out> <filenames>... [--min_count=<kn>] [--gene_regions=<kn>]
    gene_sense_distribution_to_csv.py -h | --help
    gene_sense_distribution_to_csv.py params
    gene_sense_distribution_to_csv.py example

Options:
    -h --help            Shows this screen.
    --min_count=<kn>     Minimal number of reads in a pileup line in order to consider it in the analysis [default: 0]
    --gene_regions=<kn>  All the gene regions you want to plot. Options: 5utr, 3utr, exon. The default is all of them. Separate them by space [default: ['5utr', '3utr', 'exon']]
"""

I'm trying to make the --gene_regions option accept more than one value, the way <filenames> does.

I tried to do that by changing this line:
gene_sense_distribution_to_csv.py <gene> <csv_out> <filenames>... [--min_count=<kn>] [--gene_regions=<kn>]
to:
gene_sense_distribution_to_csv.py <gene> <csv_out> <filenames>... [--min_count=<kn>] [--gene_regions=<kn>...]

But when I run the command python SCRIPT_NAME GENE OUTPUT_PATH INPUT_PATH --gene_regions exon intron
I get the following args:

{'--gene_regions': ['exon'],
 '--help': False,
 '--min_count': '0',
 '<csv_out>': 'out',
 '<filenames>': ['example', 'intron'],
 '<gene>': 'Y74C9A.6',
 'example': False,
 'params': False}

As you can see, intron ended up in <filenames>, while I intended it to be a second value of --gene_regions.
Any ideas on how I can fix this?

EDIT: I came across a workaround: run python SCRIPT_NAME GENE OUTPUT_PATH INPUT_PATH --gene_regions 'exon, intron' and split the quoted string in the script.

I would still appreciate an answer that is not a workaround.
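
For reference, the parsing step of the workaround in the EDIT could be sketched as below. This is only an illustration: parse_regions and KNOWN_REGIONS are hypothetical names, and the set of allowed regions is an assumption pieced together from the help text (5utr, 3utr, exon) and the example command (intron).

```python
import re

# Hypothetical set of allowed regions (an assumption): the help text lists
# 5utr, 3utr and exon, and the example command also passes intron.
KNOWN_REGIONS = {"5utr", "3utr", "exon", "intron"}


def parse_regions(raw):
    """Split a string such as 'exon, intron' into a list of region names."""
    # Split on runs of commas and/or whitespace, dropping empty pieces.
    regions = [part for part in re.split(r"[,\s]+", raw.strip()) if part]
    unknown = [r for r in regions if r not in KNOWN_REGIONS]
    if unknown:
        raise ValueError("unknown gene region(s): " + ", ".join(unknown))
    return regions


print(parse_regions("exon, intron"))  # ['exon', 'intron']
```

The value stored under '--gene_regions' in the docopt result would be passed to parse_regions as a single string.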


Solution

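  • You need to repeat the option as well as the value. In docopt, the ... after --gene_regions=<kn> makes the whole option repeatable; each occurrence still takes exactly one value, so a bare intron is parsed as a positional argument and ends up in <filenames>. For example:

    ... --gene_regions exon --gene_regions intron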
    

    Here is a full example (going with "gene_region", singular):

    """gene_sense_distribution_to_csv
    Usage:
        gene_sense_distribution_to_csv.py <gene> <csv_out> <filenames>... [--min_count=<kn>] [--gene_region=<kn> ...]
        gene_sense_distribution_to_csv.py -h | --help
        gene_sense_distribution_to_csv.py params
        gene_sense_distribution_to_csv.py example
    
    Options:
        -h --help            Shows this screen.
        --min_count=<kn>     Minimal number of reads in a pileup line in order to consider it in the analysis [default: 0]
        --gene_region=<kn>   All the gene regions you want to plot. Options: 5utr, 3utr, exon. The default is all of them. Specify multiple times...
    """
    
    $ ./gene_sense_distribution_to_csv.py gene1 csvout file1 file2 --min_count=mm --gene_region=r2 --gene_region=r3 --gene_region r4
    {'--gene_region': ['r2', 'r3', 'r4'],
     '--help': False,
     '--min_count': 'mm',
     '<csv_out>': 'csvout',
     '<filenames>': ['file1', 'file2'],
     '<gene>': 'gene1',
     'example': False,
     'params': False}
    

    Off topic: it can be a good idea to put options before the positional arguments in the usage pattern. That gives you more freedom in some cases (see the -- special argument). Reference: the POSIX utility syntax guidelines, https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap12.html#tag_12_02

    gene_sense_distribution_to_csv.py [--min_count=<kn>] [--gene_region=<kn> ...] <gene> <csv_out> <filenames>...