
Custom conflict handling for ArgumentParser


What I need

I need an ArgumentParser with a conflict handling scheme that resolves a registered set of duplicate arguments but raises an error for every other conflict.

What I tried

My initial approach (see also the code example at the bottom) was to subclass ArgumentParser, add a _handle_conflict_custom method, and then instantiate the subclass with CustomParser(conflict_handler='custom'), thinking that the _get_handler method would pick it up.

The Problem

This raises an error. ArgumentParser inherits from _ActionsContainer, which provides the _get_handler and _handle_conflict_{strategy} methods. During initialization, ArgumentParser internally instantiates _ArgumentGroup objects, which also inherit from _ActionsContainer. These groups know nothing about the method newly defined on the ArgumentParser subclass, so they fail to get the custom handler.

Overriding the _get_handler method is not feasible for the same reasons.
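
To make the failure described above concrete, here is a minimal sketch that reproduces it in isolation. It relies on private argparse internals as they exist in current CPython releases, so treat the exact behavior as an observation rather than a documented guarantee. The body of the handler is a placeholder, since it is never reached.

```python
from argparse import ArgumentParser


class CustomParser(ArgumentParser):
    def _handle_conflict_custom(self, action, conflicting_actions):
        # Placeholder body: construction fails before this is ever called.
        self._handle_conflict_resolve(action, conflicting_actions)


error = None
try:
    # The parser itself can look up _handle_conflict_custom, but the
    # argument groups it creates internally inherit conflict_handler='custom'
    # and cannot find a matching method on their own class.
    CustomParser(conflict_handler='custom')
except ValueError as exc:
    error = exc

print(type(error).__name__, error)
```

The error is raised while the parser is being constructed, before any argument is added, which is why the conflict logic itself never gets a chance to run.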

I have created a (rudimentary) class diagram illustrating these relationships, which hopefully also shows why subclassing ArgumentParser does not achieve what I want.

[Image: class_diagram.png]

Motivation

I (think that I) need this because I have two scripts that handle distinct parts of a workflow. I would like to be able to use them separately as scripts, but also have one script that imports the methods of both and does everything in one go.

This script should support all the options of the two individual scripts, but I don't want to duplicate the (extensive) argument definitions, because then I would have to make changes in multiple places.
This is easily solved by importing the ArgumentParsers of the two part scripts and using them as parents, like so: combined_parser = ArgumentParser(parents=[arg_parser1, arg_parser2]).

The scripts have duplicate options, e.g. for the work directory, so I need to resolve those conflicts.
This could also be done with conflict_handler='resolve'.

But there are a lot of possible arguments (which is not up to our team, because we have to maintain compatibility). So I also want the script to raise an error if something gets defined that causes a conflict but hasn't been explicitly allowed to do so, instead of quietly overriding the other flag and potentially causing unwanted behavior.
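
As an illustration of the two built-in behaviors being weighed here, the following sketch contrasts the default handler with conflict_handler='resolve' on a single parser. The dest names first and second are hypothetical and exist only to make the override visible.

```python
import argparse

# Default handler ('error'): the second definition of -g is rejected.
strict = argparse.ArgumentParser()
strict.add_argument('-g', dest='first')
strict_error = None
try:
    strict.add_argument('-g', dest='second')
except argparse.ArgumentError as exc:
    strict_error = exc

# 'resolve': the second definition quietly replaces the first one.
lenient = argparse.ArgumentParser(conflict_handler='resolve')
lenient.add_argument('-g', dest='first')
lenient.add_argument('-g', dest='second')
opts = lenient.parse_args(['-g', 'value'])

print(strict_error)
print(opts)
```

Neither built-in option offers a middle ground: 'error' rejects every duplicate, while 'resolve' accepts every duplicate without any warning.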

Other suggestions to achieve these goals (keeping both scripts separate, enabling use of one script that wraps both, avoiding code duplication and raising on unexpected duplicates) are welcome.

Example Code

from argparse import ArgumentParser


class CustomParser(ArgumentParser):
    def _handle_conflict_custom(self, action, conflicting_actions):
        # Option strings that are allowed to be overridden silently.
        registered = ['-h', '--help', '-f']

        # conflicting_actions is a list of (option_string, action) pairs.
        # The loop must not rebind the `action` argument, which is still
        # needed when delegating to the built-in handlers below.
        use_error = any(
            option_string not in registered
            for option_string, _ in conflicting_actions
        )

        if use_error:
            self._handle_conflict_error(action, conflicting_actions)
        else:
            self._handle_conflict_resolve(action, conflicting_actions)


if __name__ == '__main__':
    ap1 = ArgumentParser()
    ap2 = ArgumentParser()

    ap1.add_argument('-f')  # registered, so should be resolved
    ap2.add_argument('-f')

    ap1.add_argument('-g')  # not registered, so should raise
    ap2.add_argument('-g')

    # this raises before ever resolving anything, for the stated reasons
    ap3 = CustomParser(parents=[ap1, ap2], conflict_handler='custom')


Other questions

I am aware of these similar questions:

But even though some of them provide interesting insights into argparse usage and conflicts, they seem to address issues that are not related to mine.


Solution

  • For various reasons, notably the needs of testing, I have adopted the habit of always defining argparse configuration in the form of a data structure, typically a sequence of dicts. The actual creation of the ArgumentParser is done in a reusable function that simply builds the parser from the dicts. This approach has many benefits, especially for more complex projects.

    If each of your scripts were to shift to that model, I would think that you might be able to detect any configuration conflicts in that function and raise accordingly, thus avoiding the need to inherit from ArgumentParser and mess around with understanding its internals.

    I'm not certain I understand your conflict-handling needs very well, so the demo below simply hunts for duplicate options and raises if it sees one, but I think you should be able to understand the approach and assess whether it might work for your case. The basic idea is to solve your problem in the realm of ordinary data structures rather than in the byzantine world of argparse.

    import sys
    import argparse
    from collections import Counter
    
    OPTS_CONFIG1 = (
        {
            'names': 'path',
            'metavar': 'PATH',
        },
        {
            'names': '--nums',
            'nargs': '+',
            'type': int,
        },
        {
            'names': '--dryrun',
            'action': 'store_true',
        },
    )
    
    OPTS_CONFIG2 = (
        {
            'names': '--foo',
            'metavar': 'FOO',
        },
        {
            'names': '--bar',
            'metavar': 'BAR',
        },
        {
            'names': '--dryrun',
            'action': 'store_true',
        },
    )
    
    def main(args):
        ap = define_parser(OPTS_CONFIG1, OPTS_CONFIG2)
        opts = ap.parse_args(args)
        print(opts)
    
    def define_parser(*configs):
        # Validation: adjust as needed.
        tally = Counter(
            nm
            for config in configs
            for d in config
            for nm in d['names'].split()
        )
        for k, n in tally.items():
            if n > 1:
                raise Exception(f'Duplicate argument configurations: {k}')
    
        # Define and return parser.
        ap = argparse.ArgumentParser()
        for config in configs:
            for d in config:
                kws = dict(d)
                xs = kws.pop('names').split()
                ap.add_argument(*xs, **kws)
        return ap
    
    if __name__ == '__main__':
        main(sys.argv[1:])
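
The demo above raises on every duplicate. As a hedged sketch of how the same data-driven idea might be extended to the question's requirement (tolerate a registered set of duplicates, raise on all others), the variant below adds an allowed_duplicates parameter. That parameter, the SCRIPT1_OPTS and SCRIPT2_OPTS configs, and the "first definition wins" rule are assumptions made for illustration, not part of the original answer.

```python
import argparse
from collections import Counter

# Hypothetical configs in the same "sequence of dicts" shape as the demo.
SCRIPT1_OPTS = (
    {'names': '--workdir', 'metavar': 'DIR'},
    {'names': '--nums', 'nargs': '+', 'type': int},
)
SCRIPT2_OPTS = (
    {'names': '--workdir', 'metavar': 'DIR'},
    {'names': '--foo', 'metavar': 'FOO'},
)


def define_parser(*configs, allowed_duplicates=()):
    allowed = set(allowed_duplicates)

    # Validation: count every name across all configs and reject
    # duplicates that were not explicitly registered.
    tally = Counter(
        nm
        for config in configs
        for d in config
        for nm in d['names'].split()
    )
    unexpected = sorted(
        nm for nm, n in tally.items() if n > 1 and nm not in allowed
    )
    if unexpected:
        raise ValueError(f'Unregistered duplicate arguments: {unexpected}')

    # Build the parser; for registered duplicates the first definition wins.
    ap = argparse.ArgumentParser()
    seen = set()
    for config in configs:
        for d in config:
            kws = dict(d)
            names = [nm for nm in kws.pop('names').split() if nm not in seen]
            if not names:
                continue  # every name was already defined earlier
            seen.update(names)
            ap.add_argument(*names, **kws)
    return ap


ap = define_parser(SCRIPT1_OPTS, SCRIPT2_OPTS, allowed_duplicates=['--workdir'])
opts = ap.parse_args(['--workdir', '/tmp', '--nums', '1', '2', '--foo', 'x'])
print(opts)
```

Calling define_parser(SCRIPT1_OPTS, SCRIPT2_OPTS) without the allow list raises ValueError for --workdir, so an unexpected duplicate fails loudly instead of being overridden. Because the check runs on plain data before any parser exists, it does not depend on argparse internals.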