python string-comparison

Why is string comparison NOT slower than integer comparison in Python?


The difference in C++ is huge, but not in Python. I ran similar code in C++ and the result was very different: integer comparison is 20-30 times faster than string comparison.

Here is my example code:

import random, time
rand_nums = []
rand_strs = []
total_num = 1000000
for i in range(total_num):
    randint = random.randint(0,total_num*10)
    randstr = str(randint)
    rand_nums.append(randint)
    rand_strs.append(randstr)

start = time.time()
for i in range(total_num-1):
    b = rand_nums[i+1]>rand_nums[i]
end = time.time()
print("integer compare:",end-start)     # 0.14269232749938965 seconds

start = time.time()
for i in range(total_num-1):
    b = rand_strs[i+1]>rand_strs[i]
end = time.time()
print("string compare:",end-start)      # 0.15730643272399902 seconds

Solution

  • I can't explain why it's so slow in C++, but in Python the reason is clear from your test code: random strings usually differ in the first character, so the comparison can stop after looking at one character and takes about as long as an integer comparison.

    Also, note that much of your overhead will be in the loop control and list accesses. You'd get a much more accurate measure if you remove those factors by zipping the lists:

    for s1, s2 in zip(rand_strs, rand_strs[1:]):
        b = s1 > s2
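
To check both points on your own machine, here is a rough sketch that counts how many adjacent string pairs already differ in their first character and then times the zipped comparison for integers and for strings. The fixed seed, the smaller `total_num`, and the helper names `first_char_differs_fraction` and `compare_adjacent` are assumptions made for illustration and are not part of the question's code; the timings will vary by machine and Python version.

```python
import random
import timeit

random.seed(0)       # fixed seed for a repeatable run (assumption, not in the question)
total_num = 100_000  # smaller than the question's 1,000,000 to keep the run short

rand_nums = [random.randint(0, total_num * 10) for _ in range(total_num)]
rand_strs = [str(n) for n in rand_nums]


def first_char_differs_fraction(strs):
    """Return the share of adjacent pairs whose first characters differ."""
    pairs = list(zip(strs, strs[1:]))
    differing = sum(1 for s1, s2 in pairs if s1[0] != s2[0])
    return differing / len(pairs)


def compare_adjacent(values):
    """Compare every element with its successor, without index lookups."""
    for v1, v2 in zip(values, values[1:]):
        b = v1 > v2


diff_fraction = first_char_differs_fraction(rand_strs)
print("pairs differing in first character:", diff_fraction)

# Take the best of several repeats to reduce noise from other processes.
int_time = min(timeit.repeat(lambda: compare_adjacent(rand_nums), number=5, repeat=3))
str_time = min(timeit.repeat(lambda: compare_adjacent(rand_strs), number=5, repeat=3))
print("integer compare:", int_time)
print("string compare:", str_time)
```

If most pairs differ in the first character, the string comparison rarely has to look any further, which is consistent with the two timings being close.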