
What is the most efficient string concatenation method in Python?


Is there an efficient mass string concatenation method in Python (like StringBuilder in C# or StringBuffer in Java)?

I found the following methods here:

  • Simple concatenation using +
  • Using a string list and the join method
  • Using MutableString from the UserString module
  • Using a character array and the array module
  • Using StringIO from the cStringIO module

What should be used and why?

(A related question is here.)


Solution

  • If you know all the components beforehand, use literal string interpolation, also known as f-strings or formatted string literals, introduced in Python 3.6.

    Given the test case from mkoistinen's answer, having strings

    domain = 'some_really_long_example.com'
    lang = 'en'
    path = 'some/really/long/path/'
    

    The contenders, with their execution times on my computer (Python 3.6 on Linux, timed with IPython and the timeit module), are:

    • f'http://{domain}/{lang}/{path}' - 0.151 µs

    • 'http://%s/%s/%s' % (domain, lang, path) - 0.321 µs

    • 'http://' + domain + '/' + lang + '/' + path - 0.356 µs

    • ''.join(('http://', domain, '/', lang, '/', path)) - 0.249 µs (notice that building a constant-length tuple is slightly faster than building a constant-length list).

    Thus the shortest and most beautiful code possible is also the fastest.
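
The comparison above can be reproduced roughly as follows. This is only a minimal sketch using the standard timeit module: the helper name time_variants and the repetition counts are my own assumptions, and the absolute numbers will differ by machine and Python version.

```python
import timeit

# Same test strings as in the comparison above.
domain = 'some_really_long_example.com'
lang = 'en'
path = 'some/really/long/path/'

# The four contenders, kept as source snippets so timeit can compile them.
VARIANTS = {
    'f-string': "f'http://{domain}/{lang}/{path}'",
    '% formatting': "'http://%s/%s/%s' % (domain, lang, path)",
    '+ concatenation': "'http://' + domain + '/' + lang + '/' + path",
    'join on tuple': "''.join(('http://', domain, '/', lang, '/', path))",
}


def time_variants(number=200_000, repeat=5):
    """Return {label: best time per call in microseconds} (hypothetical helper)."""
    names = {'domain': domain, 'lang': lang, 'path': path}
    results = {}
    for label, stmt in VARIANTS.items():
        # Take the minimum over several runs to reduce noise from other processes.
        best = min(timeit.repeat(stmt, globals=names, number=number, repeat=repeat))
        results[label] = best / number * 1e6
    return results


if __name__ == '__main__':
    for label, usec in time_variants().items():
        print(f'{label:>16}: {usec:.3f} µs')
```

Taking the minimum of several repeats, rather than the mean, is the usual way to read timeit results, since the slower runs mostly measure interference from the rest of the system.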


    For comparison, the fastest method in Python 2 on my computer is + concatenation, which takes 0.203 µs with 8-bit strings and 0.259 µs if the strings are all Unicode.

    (In alpha versions of Python 3.6 the implementation of f'' strings was the slowest possible: the generated bytecode was pretty much equivalent to the ''.join() case, with unnecessary calls to str.__format__, which without arguments would just return self unchanged. These inefficiencies were addressed before 3.6 final.)
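
To see what bytecode your own interpreter generates for an f-string, the standard dis module can be used. The sketch below assumes CPython; build_url and opcode_names are hypothetical helpers introduced only for illustration. On 3.6 final and later I would expect a dedicated BUILD_STRING instruction rather than a call to ''.join(), but the exact opcode names vary between versions.

```python
import dis


def build_url(domain, lang, path):
    # Hypothetical wrapper, so that there is a function to disassemble.
    return f'http://{domain}/{lang}/{path}'


def opcode_names(func):
    """Return the names of the bytecode instructions of func, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == '__main__':
    # Print the full disassembly, then check for the string-building opcode.
    dis.dis(build_url)
    print('BUILD_STRING' in opcode_names(build_url))
```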