python performance python-3.x namedtuple

namedtuple field names: single string or sequence?

Python lets you provide field_names in the declaration of a namedtuple either as a sequence of strings or as a single string with the names separated by whitespace and/or commas.
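For illustration, here is a minimal sketch of the spellings in question. The class name Point and the variable names are placeholders chosen for this example; each form should yield the same fields.

```python
from collections import namedtuple

# Three ways to declare the same two fields.
PointSeq = namedtuple('Point', ['x', 'y'])   # sequence of strings
PointSpace = namedtuple('Point', 'x y')      # single string, whitespace-separated
PointComma = namedtuple('Point', 'x, y')     # single string, comma-separated

# _fields is the documented attribute listing the field names.
print(PointSeq._fields)
print(PointSpace._fields)
print(PointComma._fields)
```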

According to the official documentation, it seems that the preferred way in Python 2 was to provide the names as a sequence:

field_names are a sequence of strings such as ['x', 'y']. Alternatively, field_names can be a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'.

while in Python 3 the preference changed to the single string version:

field_names are a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'. Alternatively, field_names can be a sequence of strings such as ['x', 'y'].

Is there a reason behind that?

At first sight, I would say the single-string version is less efficient, since the input has to be split. The sequence also seems more readable to me. Which one is more efficient?

Solution

Yes, passing a str means an extra .replace and .split before the names are mapped to strs; see the source:

if isinstance(field_names, str):
    field_names = field_names.replace(',', ' ').split()
field_names = list(map(str, field_names))

This obviously takes a bit more time than supplying a list. It should never be a performance bottleneck, though: the code runs only once, during the call to namedtuple that generates the class. Subsequent uses of that class, such as creating instances, never touch field_names again. In short, don't worry about performance here.
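If you want to check this on your own machine, a rough timeit sketch along these lines should do. The helper names, the field list, and the run count are arbitrary choices for this example, and the absolute numbers will vary by Python version and hardware.

```python
import timeit
from collections import namedtuple

FIELDS_LIST = ['x', 'y', 'z']
FIELDS_STR = 'x y z'

def make_from_list():
    # Class creation with a sequence of names.
    return namedtuple('Point', FIELDS_LIST)

def make_from_str():
    # Class creation with a single string (adds one replace + split).
    return namedtuple('Point', FIELDS_STR)

runs = 200  # arbitrary; class creation is comparatively slow
t_list = timeit.timeit(make_from_list, number=runs)
t_str = timeit.timeit(make_from_str, number=runs)

# Instantiating the generated class does not involve field_names at all.
Point = make_from_str()
t_inst = timeit.timeit(lambda: Point(1, 2, 3), number=runs)

print(f"create from list: {t_list / runs * 1e6:.1f} us per call")
print(f"create from str:  {t_str / runs * 1e6:.1f} us per call")
print(f"instantiate:      {t_inst / runs * 1e6:.1f} us per call")
```

Any difference between the two creation timings is likely to be small next to the cost of building the class itself, which is why readability is the better criterion for choosing between the two forms.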