javascript python performance string-concatenation

Why is 'join' faster than normal concatenation?


I've seen several examples in different languages that clearly show that joining the elements of a list (array) is many times faster than concatenating the strings one at a time. Why?

What algorithm underlies each operation, and why is one faster than the other?

Here is a Python example of what I mean:

# This is slow
x = 'a'
x += 'b'
...
x += 'z'

# This is fast
x = ['a', 'b', ... 'z']
x = ''.join(x)

Solution

  • A join function knows up front every string it is being asked to concatenate and how long each one is, so it can calculate the length of the final string before it begins the operation.

    It therefore needs to allocate memory for the final string only once, and can then copy each source string (and delimiter) into its correct place in that memory.

    On the other hand, a single += on a string has no choice but to allocate enough memory for its own result, which is the concatenation of just two strings. Each subsequent += must do the same, allocating memory that the next += will discard. Each time, the ever-growing string is copied from one place in memory to another, so the total amount of copying grows roughly with the square of the final length rather than in proportion to it.
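
The difference can be made concrete with a small model. This is only a sketch, not how any particular runtime implements either operation. The names copies_with_concat and join_two_pass are made up for illustration, and the model assumes that every += allocates a fresh string and copies everything into it. Some implementations (CPython among them) can sometimes extend a string in place, which narrows the gap in practice.

```python
def copies_with_concat(pieces):
    """Count characters written if every += builds a brand-new string (naive model)."""
    copied = 0
    length = 0
    for piece in pieces:
        length += len(piece)  # the new string holds the old contents plus the piece
        copied += length      # all of it is written into the fresh allocation
    return copied


def join_two_pass(pieces, sep=""):
    """Hypothetical join: measure first, allocate once, copy each piece once."""
    if not pieces:
        return "", 0
    parts = [p.encode("utf-8") for p in pieces]
    sep_b = sep.encode("utf-8")
    # Pass 1: the final size is known before anything is copied.
    total = sum(len(p) for p in parts) + len(sep_b) * (len(parts) - 1)
    buf = bytearray(total)    # the single allocation
    pos = 0
    copied = 0
    # Pass 2: write every piece (and separator) straight into its final position.
    for i, p in enumerate(parts):
        if i and sep_b:
            buf[pos:pos + len(sep_b)] = sep_b
            pos += len(sep_b)
            copied += len(sep_b)
        buf[pos:pos + len(p)] = p
        pos += len(p)
        copied += len(p)
    return buf.decode("utf-8"), copied


letters = [chr(c) for c in range(ord("a"), ord("z") + 1)]
joined, join_copies = join_two_pass(letters)
concat_copies = copies_with_concat(letters)
print(joined == "".join(letters), join_copies, concat_copies)
```

For the 26 letters in the question, the join-style routine writes 26 characters in total, while the naive += model writes 1 + 2 + ... + 26 = 351. The gap widens quickly as the number of pieces grows.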