
Python - When can you pass a positional argument by name, and when can't you?


The Python 2.7.5 collections.defaultdict only seems to work when you pass default_factory as a positional argument -- it breaks when you pass it as a named parameter.

If you run the following code you'll see that default_dict_success() runs fine, but default_dict_failure() throws a KeyError.

from collections import defaultdict

test_data = [
    ('clay', 'happy'),
    ('jason', 'happy'),
    ('aj', 'sad'),
    ('eric', 'happy'),
    ('sophie', 'sad')
]

def default_dict_success():
    results = defaultdict(list)
    for person, mood in test_data:
        results[mood].append(person)
    print results


def default_dict_failure():
    results = defaultdict(default_factory=list)
    for person, mood in test_data:
        results[mood].append(person)
    print results


default_dict_success()
default_dict_failure()

The output is

# First function succeeds
defaultdict(<type 'list'>, {'sad': ['aj', 'sophie'], 'happy': ['clay', 'jason', 'eric']})

# Second function fails
Traceback (most recent call last):
  File "test_default_dict.py", line 26, in <module>
    default_dict_failure()
  File "test_default_dict.py", line 21, in default_dict_failure
    results[mood].append(person)
KeyError: 'happy'

Anyone know what's going on?

EDIT: Originally I thought I was looking at some Python source that would've suggested what I was trying to do was possible, but the commenters pointed out that I was mistaken, since this object is implemented in C and therefore there is no Python source for it. So it's not quite as mysterious as I thought.

That having been said, this is the first time I've come across a positional argument in Python that couldn't also be passed by name. Does this type of thing happen anywhere else? Is there a way to implement a function in pure Python (as opposed to a C extension) that enforces this type of behavior?


Solution

  • I think the docs try to say this is what will happen, although they aren't particularly clear:

    The first argument provides the initial value for the default_factory attribute; it defaults to None. All remaining arguments are treated the same as if they were passed to the dict constructor, including keyword arguments.

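    The key phrase is "first argument": a keyword argument has no position, so it can never be the "first argument". A default_factory=list keyword is instead treated like any other keyword passed to the dict constructor, so it becomes an ordinary key named 'default_factory' and the factory stays None. That is why the lookup raises KeyError. That said, filing a documentation bug wouldn't be a bad idea.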
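
To make the behaviour described in the docs visible, here is a minimal sketch that inspects both constructions. The variable names are made up for illustration; the calls themselves are the ones from the question.

```python
from collections import defaultdict

# Keyword form: the keyword is forwarded to the dict constructor,
# so it ends up as a key and the factory is left unset.
by_keyword = defaultdict(default_factory=list)
print(by_keyword.default_factory)   # None
print(list(by_keyword))             # ['default_factory']

# Positional form: the first argument really does set the factory.
by_position = defaultdict(list)
by_position['happy'].append('clay')
print(by_position['happy'])         # ['clay']
```

With no factory set, a missing key in by_keyword raises KeyError just as it would for a plain dict.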

    That having been said, this is the first time I've come across a positional argument in Python that couldn't also be passed by name. Does this type of thing happen anywhere else? Is there a way to implement a function in pure Python (as opposed to a C extension) that enforces this type of behavior?

    This is actually so common there's a whole PEP about it. Consider range as a simple example.
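
As a quick check of the range example, this sketch tries to name one of its arguments. The variable names are illustrative only.

```python
# range() takes its arguments by position only; naming one raises TypeError.
try:
    range(stop=5)
    keyword_accepted = True
except TypeError:
    keyword_accepted = False

# The positional form works as usual.
positional = list(range(5))
```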

    With regard to doing this yourself, the PEP says:

    Functions implemented in modern Python can accept an arbitrary number of positional-only arguments, via the variadic *args parameter. However, there is no Python syntax to specify accepting a specific number of positional-only parameters. Put another way, there are many builtin functions whose signatures are simply not expressable with Python syntax.

    It is possible to do something like this:

    def foo(*args):
        # Unpacking raises ValueError unless exactly three arguments are passed
        a, b, c = args
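
A slightly fuller sketch of the same idea, assuming you want a clearer error than the unpacking failure. The function make_point and its message are hypothetical, not taken from any library. Because the signature has no **kwargs, Python itself rejects any keyword argument with a TypeError.

```python
def make_point(*args):
    """Hypothetical function that takes exactly two positional-only arguments."""
    if len(args) != 2:
        raise TypeError(
            "make_point() takes exactly 2 positional arguments (%d given)" % len(args)
        )
    x, y = args
    return (x, y)


point = make_point(1, 2)

# Passing the values by name fails, since no parameter is called x or y.
try:
    make_point(x=1, y=2)
    keyword_accepted = True
except TypeError:
    keyword_accepted = False
```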
    

    This is mentioned in the PEP:

    Obviously one can simulate any of these in pure Python code by accepting (*args, **kwargs) and parsing the arguments by hand. But this results in a disconnect between the Python function's signature and what it actually accepts, not to mention the work of implementing said argument parsing.
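
On Python 3.8 and later (not the 2.7.5 used in the question), a / in the parameter list marks the parameters before it as positional-only, so the signature and the accepted calls agree without hand-written parsing. A minimal sketch, with a hypothetical function name:

```python
# Requires Python 3.8+; this is a syntax error on earlier versions.
def make_pair(first, second, /):
    """Hypothetical function whose two parameters cannot be passed by name."""
    return (first, second)


pair = make_pair(1, 2)

try:
    make_pair(first=1, second=2)
    keyword_accepted = True
except TypeError:
    keyword_accepted = False
```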