Can you describe the difference between two ways of string concatenation: the simple + operator (__add__) and %s patterns?
I investigated this a bit and found %s (in the form without parentheses) to be a little faster.
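For anyone who wants to reproduce a comparison like this, here is a minimal sketch using the standard timeit module. The operand names a and b, the SETUP string, and the time_variants helper are made up for illustration and are not from the original measurement. The operands come from variables on purpose: with pure literals the compiler may compute the result ahead of time, so the timer would measure little more than loading a constant.

```python
import timeit

# Hypothetical setup: operands live in variables so the expression
# cannot be pre-computed at compile time.
SETUP = "a = 'hell'; b = 'o'"

# The three variants from the question, rewritten to use variables.
STATEMENTS = [
    "a + b",
    "'hell%s' % b",
    "'hell%s' % (b,)",
]


def time_variants(number=100000):
    """Return {statement: seconds} for `number` runs of each variant."""
    return {
        stmt: timeit.timeit(stmt, setup=SETUP, number=number)
        for stmt in STATEMENTS
    }


if __name__ == "__main__":
    for stmt, seconds in sorted(time_variants().items()):
        print("%-18s %.4f s" % (stmt, seconds))
```

The absolute numbers depend heavily on the interpreter version and the machine, so treat any difference as indicative only.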
Another question also came up: why does the result of 'hell%s' % 'o' refer to a different memory region than 'hell%s' % ('o',)?
Here is a code example:
l = ['hello', 'hell' + 'o', 'hell%s' % 'o', 'hell%s' % ('o',)]
print [id(s) for s in l]
Result:
[34375618400, 34375618400, 34375618400, 34375626256]
P.S. I know about string interning :)
Here is a small exercise:
>>> import dis
>>> def f1():
        'hello'

>>> def f2():
        'hel' 'lo'

>>> def f3():
        'hel' + 'lo'

>>> def f4():
        'hel%s' % 'lo'

>>> def f5():
        'hel%s' % ('lo',)

>>> for f in (f1, f2, f3, f4, f5):
        print(f.__name__)
        dis.dis(f)

f1
  1           0 LOAD_CONST               1 (None)
              3 RETURN_VALUE
f2
  1           0 LOAD_CONST               1 (None)
              3 RETURN_VALUE
f3
  2           0 LOAD_CONST               3 ('hello')
              3 POP_TOP
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE
f4
  2           0 LOAD_CONST               3 ('hello')
              3 POP_TOP
              4 LOAD_CONST               0 (None)
              7 RETURN_VALUE
f5
  2           0 LOAD_CONST               1 ('hel%s')
              3 LOAD_CONST               3 (('lo',))
              6 BINARY_MODULO
              7 POP_TOP
              8 LOAD_CONST               0 (None)
             11 RETURN_VALUE
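As you can see, all the simple concatenations and formatting operations are done by the compiler (constant folding): the bytecode for f3 and f4 just loads the ready-made 'hello'. The last function requires more complex formatting and therefore, I guess, is actually executed at run time, as the BINARY_MODULO instruction shows. Since all those other objects are created at compile time, they all have the same id.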
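The same check can be made without reading disassembly by looking at the constants stored in the compiled code object. This is only a sketch: the helper name folded_at_compile_time is made up for illustration, and which expressions get folded is a CPython implementation detail that differs between versions (the 'hel%s' % ... cases in particular may give a different answer than the output above).

```python
def folded_at_compile_time(expression, expected):
    """Return True if `expected` is already stored as a constant
    in the code object compiled from `expression`."""
    code = compile(expression, "<example>", "eval")
    return expected in code.co_consts


if __name__ == "__main__":
    for expr in ("'hello'",
                 "'hel' + 'lo'",
                 "'hel%s' % 'lo'",
                 "'hel%s' % ('lo',)"):
        # No expected output is listed here because the result for the
        # % cases depends on the interpreter version.
        print("%-20s %s" % (expr, folded_at_compile_time(expr, "hello")))
```

An expression that involves a variable, such as a + 'lo', cannot be folded, so the helper returns False for it regardless of what a would hold at run time.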