bash, flags, getopt

Getopt sees extra '--' argument that I haven't included in the command


I'm trying to write some code that will tie together a series of conda tools and a bit of my own Python code. I've provided some getopt options, but they parse weirdly. I'd like the options to be accepted in any order, and I'd like both short and long forms to be available. Below are the default option values and the main getopt section:

seq_tech=''
input=''
outfolder='ProkaRegia'
threads=$(grep ^cpu\\scores /proc/cpuinfo | uniq |  awk '{print $4}')

is_positive_integer() {
    re='^[0-9]+$'
    if ! [[ $1 =~ $re ]] ; then
       return 1
    fi
    if [ "$1" -le 0 ]; then
        return 1
    fi
    return 0
}
...

ARGS=$(getopt -o i:o::t::s:c::h:: -l 'input:,output::,threads::,seq_tech:,clean::,help::' -n 'prokaregia.sh' -- "$@")
eval set -- "$ARGS"

echo "All arguments: $@"

if [[ $# -eq 1 ]]; then
    usage
fi

while [[ $# -gt 0 ]]; do
    case $1 in
        -i|--input)
            input="$(readlink -f "$2")"
            echo "$2"
            shift 2
            ;;
        -o|--output)
            output=$2
            shift 2
            ;;
        -t|--threads)
            if is_positive_integer "$2"; then
                threads=$2
                shift 2
            else
                echo "Error: Thread count must be a positive integer."
                exit 1
            fi
            ;;
        -s|--seq_tech)
            if [[ $2 == "ont" || $2 == "pacbio" ]]; then
                seq_tech=$2
                shift 2
            else
                echo "Error: Sequencing technology must be either 'ont' or 'pacbio'."
                exit 1
            fi
            ;;
        -c|--clean)
            clean_option=true
            shift
            ;;
        -h|--help)
            usage
            ;;
        *)
            echo "Error: Invalid option $1"
            exit 1
            ;;
    esac
done

However, on running the following command:

bash prokaregia.sh -t 2 -i prokaregia.dockerfile

I get the following returned:

All arguments: -t  -i prokaregia.dockerfile -- 2
Error: Thread count must be a positive integer.

is_positive_integer works perfectly from the command line (thanks ChatGPT!), and changing the command-line options to "--input" and "--threads" results in the same behavior, as does changing their order. I'm fairly certain the issue comes from whatever is generating the extra double hyphen in the list of arguments. It also seems to generate an extra blank argument, since adding echo "$2" to the threads branch prints an empty line. The same characters cause various other issues with the other options, which I'd be happy to go into if people think it would be helpful.


Solution

  • You seem to be interpreting the two colons after an option name in the wrong way. The syntax t:: does not mean "option -t is optional"; it means "option -t takes an optional argument". getopt treats every option as optional, so you have to check yourself whether the mandatory ones were supplied.

    As for why the argument does not get parsed: this is how getopt handles options with optional arguments (two colons :: after the option name). It only recognizes the argument if it is attached to the option itself, with no whitespace between the two. So -t2 or --threads=2 works, while -t 2 or --threads 2 does not. The reason is that, with an optional argument, getopt otherwise could not tell whether you gave an option without an argument followed by a positional argument, or an option with an argument and no positional argument. In your case, -t got an empty argument and the 2 was moved after the -- as a positional argument.

    For options with a required argument (a single colon), however, this does not happen: you can write any of -t2, -t 2, --threads 2 or --threads=2 and get the same result.
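The difference between the two forms can be checked directly. The following is a minimal sketch, assuming the enhanced getopt from util-linux (the one that supports -l); the variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: assumes util-linux (enhanced) getopt.
# Variable names below are illustrative, not from the question.

# Optional argument (t::), value separated by a space:
# -t receives an empty argument and 2 ends up after the --.
optional_spaced=$(getopt -o 't::' -l 'threads::' -- -t 2)

# Optional argument (t::), value attached to the option:
# -t receives 2 as its argument.
optional_glued=$(getopt -o 't::' -l 'threads::' -- -t2)

# Required argument (t:): separated and attached spellings are equivalent.
required_spaced=$(getopt -o 't:' -l 'threads:' -- -t 2)
required_long=$(getopt -o 't:' -l 'threads:' -- --threads=2)

printf 'optional, "-t 2":        %s\n' "$optional_spaced"
printf 'optional, "-t2":         %s\n' "$optional_glued"
printf 'required, "-t 2":        %s\n' "$required_spaced"
printf 'required, "--threads=2": %s\n' "$required_long"
```

Only the first line of output should show the 2 after the --, which matches the "All arguments" line in the question.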

    I don't think you want optional option arguments anyway, so drop the extra colons: keep a single colon for the options that take a value, and none for the plain flags -c and -h:

    ARGS=$(getopt -o i:o:t:s:ch -l 'input:,output:,threads:,seq_tech:,clean,help' -n 'prokaregia.sh' -- "$@")
    

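    In any case, the -- will always be printed by getopt to let you know that there are no more options and that from that point onwards you only have positional arguments. Your case statement should handle it explicitly, for example with a --) branch that shifts once and breaks out of the loop; as written, the -- falls through to the *) branch and is reported as an invalid option.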
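Putting the pieces together, a parsing loop along these lines should behave as intended. This is a sketch under a few assumptions: parse_args and the positional array are hypothetical names, the usage text is a placeholder because the question's usage function is not shown, -o is assumed to override outfolder, and the validation of -t and -s from the question is left out for brevity. It again relies on util-linux getopt.

```bash
#!/usr/bin/env bash
# Sketch only: parse_args and positional are hypothetical names.
# Option names follow the question; requires util-linux (enhanced) getopt.

parse_args() {
    local args
    # Single colon = required argument; no colon = plain flag.
    args=$(getopt -o 'i:o:t:s:ch' \
                  -l 'input:,output:,threads:,seq_tech:,clean,help' \
                  -n 'prokaregia.sh' -- "$@") || return 1
    eval set -- "$args"

    # Defaults (threads=1 is a placeholder default for this sketch).
    input=''
    outfolder='ProkaRegia'
    threads=1
    seq_tech=''
    clean_option=false

    while [[ $# -gt 0 ]]; do
        case $1 in
            -i|--input)    input=$2;          shift 2 ;;
            -o|--output)   outfolder=$2;      shift 2 ;;
            -t|--threads)  threads=$2;        shift 2 ;;
            -s|--seq_tech) seq_tech=$2;       shift 2 ;;
            -c|--clean)    clean_option=true; shift ;;
            -h|--help)
                # Placeholder usage text.
                echo "usage: prokaregia.sh -i FILE [-o DIR] [-t N] [-s ont|pacbio] [-c]"
                return 0
                ;;
            --)
                # End-of-options marker emitted by getopt: consume it and stop.
                shift
                break
                ;;
            *)
                echo "Error: Invalid option $1" >&2
                return 1
                ;;
        esac
    done

    # Whatever is left after the -- is positional.
    positional=("$@")

    # getopt treats every option as optional, so required ones are checked by hand.
    if [[ -z $input ]]; then
        echo "Error: -i/--input is required." >&2
        return 1
    fi
}
```

In a script this would be called as parse_args "$@" || exit 1. With the command from the question (-t 2 -i prokaregia.dockerfile), threads ends up as 2 and input as prokaregia.dockerfile.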