Tags: python, pytorch

Why does PyTorch tensor.item() give an imprecise output for most real-number inputs, but a precise output for numbers ending in .0 or .5?


Input:

x = torch.tensor([3.5])

print(x)
print(x.item())
print(float(x))

Output: (tensor([3.5000]), 3.5, 3.5)

But if I change the torch.tensor parameter to any value other than some_value.5, it gives an imprecise output:

Input:

x = torch.tensor([3.1])

print(x)
print(x.item())
print(float(x))

Output: (tensor([3.1000]), 3.0999999046325684, 3.0999999046325684)


Solution

  • What you are experiencing has little to do with PyTorch itself; rather, it is about the representation of real numbers as floating point values on a modern computer. The underlying situation is the following:

    1. Floating point numbers are internally represented in base 2 (in other words, as binary numbers) with a fixed number of bits.
    2. Just like the fraction 1/3, which cannot be exactly represented with a fixed number of digits in base 10 (it is 0.333…₁₀ with an infinite number of repetitions of the digit 3, where the subscript indicates the base of the number), the fractional part of 3.1, which is 1/10, cannot be exactly represented in base 2 (it is 0.0001100110011…₂ with an infinite number of repetitions of the sequence 0011). As you experienced, this is in contrast to the fractional part of 3.5, which is 1/2, and which can be exactly represented in base 2 (it is 0.1₂).
    3. Since only a fixed number of bits is available (see 1), the fraction 1/10 can only be stored as an approximated floating point value on the computer.
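    To make point 3 concrete, you can inspect the exact rational value that a Python float (a 64-bit floating point value) actually stores in memory, using the standard library's fractions module — a quick sketch:

```python
from fractions import Fraction

# Fraction(x) for a float x recovers the exact value stored in memory.
# 3.5 is stored exactly as the binary fraction 7/2 ...
print(Fraction(3.5))  # 7/2

# ... whereas 3.1 is stored as the nearest representable binary fraction,
# whose denominator is a power of two -- it is NOT 31/10:
print(Fraction(3.1))
print(Fraction(3.1) == Fraction(31, 10))  # False
```

    The denominator of the second fraction is a (large) power of two, which is exactly what "base-2 representation with a fixed number of bits" means in practice.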

    It is true that the problem appears to vanish once you switch from 32-bit floating point values to 64-bit floating point values, as suggested in Karl's answer, since a 64-bit value has far more bits for the significand (52 instead of 23). However, this is not because the underlying situation has been resolved, but because Python, by default, prints a float as the shortest decimal string that round-trips back to the same stored value — in a way, it hides the situation from you.

    Try, for example, printing f"{torch.tensor([3.1], dtype=torch.float64).item():.30f}" or, in fact, f"{3.1:.30f}", thereby forcing Python to show more digits than it would by default:

    import torch
    
    print(f"{torch.tensor([3.1], dtype=torch.float64).item():.30f}")
    # >>> 3.100000000000000088817841970013
    print(f"{3.1:.30f}")
    # >>> 3.100000000000000088817841970013
    

    You will see that, even in the 64-bit case, the floating point representation of the number 3.1 is only an approximation of the true value.

    Likewise, you can try the same with 3.5 and you will find that the value is stored exactly:

    import torch
    
    print(f"{torch.tensor([3.5], dtype=torch.float64).item():.30f}")
    # >>> 3.500000000000000000000000000000
    print(f"{3.5:.30f}")
    # >>> 3.500000000000000000000000000000
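    The 32-bit case from your question can also be reproduced without PyTorch at all, by round-tripping 3.1 through an IEEE 754 single-precision float with the standard library's struct module (a sketch of the same effect, just without tensors):

```python
import struct

# Pack 3.1 into 4 bytes as a 32-bit single-precision float,
# then unpack it back into a Python (64-bit) float:
f32 = struct.unpack("f", struct.pack("f", 3.1))[0]
print(f32)  # 3.0999999046325684 -- the same value that item() gave you
```

    This shows that the value 3.0999999046325684 comes from the 32-bit representation itself, not from anything PyTorch does.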
    

    The next is probably a bit of a bold claim and, depending on what data you work with, a huge oversimplification or even plain wrong, but still: in your daily life as a programmer, you rarely have to deal with this knowledge and can, in most cases, simply ignore the fact that floating point values are only approximations. There are situations where exact decimal arithmetic matters, however, and solutions such as the decimal module and its Decimal class have been designed for that. In other situations, such as the comparison of floating point numbers, it may be appropriate to allow for some tolerance, e.g. by using PyTorch's allclose() function instead of an equality comparison (torch.allclose(t1, t2) instead of (t1 == t2).all()).
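    As a plain-Python illustration of such a tolerance-based comparison (using the standard library's math.isclose, which plays the same role for scalars that torch.allclose plays for tensors):

```python
import math

# Exact equality fails, because both sides are binary approximations:
print(0.1 + 0.2 == 0.3)              # False
print(0.1 + 0.2)                     # 0.30000000000000004

# Comparing with a tolerance succeeds:
print(math.isclose(0.1 + 0.2, 0.3))  # True
```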

    For further general information on the properties of floating-point arithmetic, you might want to have a look at Floating-Point Arithmetic: Issues and Limitations from the Python documentation, or at the somewhat standard article on the subject, What Every Computer Scientist Should Know About Floating-Point Arithmetic.