python performance cpython python-internals

Isn't CPython's str.join() a little inefficient?

This answer and its comments provide some insight into the inner working's of CPython's str.join():

If the argument is not already a list or a tuple, a new list is created with the same contents.
The argument is iterated over once, to sum the lengths of the strings it holds.
Memory is allocated for a new string.
Finally, the argument is iterated over a second time, and the strings are copied into the memory for the new string.

This seems questionable to me. For starters, why reject all sequence types but two? Wouldn't just iterating over any sequence twice instead of copying it be much faster? And why make a list, particularly if you can't know the length of the iterable you're making it from? You don't need random access, just repeated iteration, and using a list means you might have to reallocate and copy several times during its generation. Wouldn't it make more sense to use a linked list or a deque?

Can anyone provide some insights into these design decisions?

Solution

For starters, why reject all sequence types but two? Wouldn't just iterating over any sequence twice instead of copying it be much faster?

The argument of join need not be a sequence. It be any iterable, and some iterables cannot be iterated over more than once. It could, for instance, be a generator expression, which would be exhausted after iterating over it once.

As to your second question, I don't know specifically, although I'd guess that using lists and tuples internally simplifies the implementation at the C level. I think the broader answer to your question is that no one was really intending to make every possible optimization to str.join. I would guess that the vast majority of use cases are calling it on a list or tuple anyway.