python-3.x string-comparison

Why does Python string fail to compare correctly?


I am comparing two strings in Python, but the comparison does not recognize that the values of the variables firmware and hardware are equal to the strings "firmware" and "hardware".

from ctypes import create_string_buffer

gfirmware = create_string_buffer(str.encode("firmware"), 100)
ghardware = create_string_buffer(str.encode("hardware"), 100)
firmware = str(gfirmware,'utf-8')
hardware = str(ghardware,'utf-8')

print('firmware var = ' + firmware)
print('hardware var = ' + hardware)
print("\n")
print('firmware type = ' + str(type(firmware)))
print('hardware type = ' + str(type(hardware)))
print('"firmware" type = ' + str(type("firmware")))
print('"hardware" type = ' + str(type("hardware")))

print("Is it true? " + str(firmware != "firmware" and hardware != "hardware"))

Output:

firmware var = firmware
hardware var = hardware

firmware type = <class 'str'>
hardware type = <class 'str'>
"firmware" type = <class 'str'>
"hardware" type = <class 'str'>
Is it true? True

The values and the types of the variables and the strings are the same, as can be seen in the output.

So why does the comparison firmware != "firmware" and hardware != "hardware" return True when it should return False?

Note: I am intentionally using create_string_buffer() because I am passing gfirmware and ghardware into a C function. But this issue occurs even when I am not passing the variables into a C function.

I have looked at the following posts and others, but in those cases the problem was that the programmer used the keyword is where == was needed.

Why does comparing strings using either '==' or 'is' sometimes produce a different result?

python fails to compare strings

Strange behavior when comparing unicode objects with string objects


Solution

  • Your gfirmware and ghardware objects are 100-byte character buffers. When you convert them to strings with str(gfirmware,'utf-8'), the whole buffer is decoded, so you get 100-character strings:

    >>> len(str(gfirmware, 'utf-8'))
    100
    

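    because the trailing NUL ('\x00') padding is still there. NUL characters are usually invisible when printed, which is why the printed values look identical.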

    You can use the value attribute of the buffers, which returns the bytes up to the first NUL terminator, before converting to a string:

    >>> firmware = str(gfirmware.value,'utf-8')
    >>> hardware = str(ghardware.value,'utf-8')
    >>> firmware != "firmware", hardware != "hardware"
    (False, False)
    (False, False)
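
To see the padding for yourself, a minimal sketch along these lines may help. It rebuilds the same 100-byte buffer as in the question and inspects it with repr() and len(). The variable names padded, clean and stripped are only illustrative, and rstrip("\x00") is shown as one possible alternative to .value, not as the required fix.

```python
from ctypes import create_string_buffer

# Same setup as in the question: a 100-byte buffer holding b"firmware".
gfirmware = create_string_buffer(b"firmware", 100)

# Decoding the whole buffer keeps the NUL padding. print() usually hides
# these characters, but repr() and len() make them visible.
padded = str(gfirmware, "utf-8")
padded_len = len(padded)
padding_count = padded.count("\x00")

# .value stops at the first NUL byte, so the result compares equal.
clean = gfirmware.value.decode("utf-8")

# .raw exposes every byte of the buffer, padding included.
raw_len = len(gfirmware.raw)

# Alternative (assumes the real data never ends in NUL characters):
# strip the trailing padding after decoding.
stripped = padded.rstrip("\x00")

print(repr(padded[:12]))      # 'firmware\x00\x00\x00\x00'
print(padded_len, raw_len)    # 100 100
print(padded == "firmware")   # False
print(clean == "firmware")    # True
print(stripped == "firmware") # True
```

When a string looks right in print() but fails a comparison, checking repr() and len() is usually a quick way to spot hidden characters like these.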