
show returning wrong value when used with unsafeCoerced value


I was experimenting with unsafeCoerce with Int8 and Word8, and I found some surprising behaviour (for me anyway).

Word8 is an unsigned 8-bit number that ranges over 0..255. Int8 is a signed 8-bit number that ranges over -128..127.

Since they are both 8-bit numbers, I assumed that coercing one to the other would be safe, and would simply reinterpret the same 8 bits as signed or unsigned.

For example, unsafeCoerce (-1 :: Int8) :: Word8 I would expect to result in a Word8 value of 255 (since the bit representation of -1 in a signed int is the same as 255 in an unsigned int).

However, when I actually perform the coercion, the resulting Word8 behaves strangely:

> GHCi, version 7.4.1: http://www.haskell.org/ghc/  :? for help
> import Data.Int
> import Data.Word
> import Unsafe.Coerce
> class ShowType a where typeName :: a -> String
> instance ShowType Int8 where typeName _ = "Int8"
> instance ShowType Word8 where typeName _ = "Word8"

> let x = unsafeCoerce (-1 :: Int8) :: Word8
> show x
"-1"
> typeName x
"Word8"
> show (x + 0)
"255"
> :t x
x :: Word8
> :t (x + 0)
(x + 0) :: Word8

I don't understand how show x is returning "-1" here. If you look at map show [minBound..maxBound :: Word8], no possible value for Word8 results in "-1". Also, how does adding 0 to the number change the behaviour, even if the type isn't changed? Strangely, it also appears it is only the Show class that is affected - my ShowType class returns the correct value.

Finally, the code fromIntegral (-1 :: Int8) :: Word8 works as expected, and returns 255, and works correctly with show. Is/can this code be reduced to a no-op by the compiler?
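For reference, the well-behaved conversion mentioned above can be sketched as follows. The names toWord8 and toInt8 are illustrative only and do not come from any library; the sketch shows the wrap-around behaviour of fromIntegral and says nothing about what the compiler turns it into.

```haskell
import Data.Int (Int8)
import Data.Word (Word8)

-- Hypothetical helper: convert via fromIntegral, which wraps modulo 2^8.
toWord8 :: Int8 -> Word8
toWord8 = fromIntegral

-- Hypothetical helper: the inverse direction, also wrapping modulo 2^8.
toInt8 :: Word8 -> Int8
toInt8 = fromIntegral
```

In GHCi, show (toWord8 (-1)) gives "255" and toInt8 255 gives -1, so the round trip preserves the bit pattern without involving unsafeCoerce.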

Note that this question is just out of curiosity about how types are represented in ghc at a low level. I'm not actually using unsafeCoerce in my code.


Solution

  • As @kosmikus said, both Int8 and Int16 are implemented using an Int#, which is 32 bits wide on 32-bit architectures (and Word8 and Word16 are a Word# under the hood). This comment in GHC.Prim explains this in more detail.

    So let's find out why this implementation choice results in the behaviour you see:

    > let x = unsafeCoerce (-1 :: Int8) :: Word8
    > show x
    "-1"
    

    The Show instance for Word8 is defined as

    instance Show Word8 where
        showsPrec p x = showsPrec p (fromIntegral x :: Int)
    

    and fromIntegral is just fromInteger . toInteger. The definition of toInteger for Word8 is

    toInteger (W8# x#)            = smallInteger (word2Int# x#)
    

    where smallInteger (defined in integer-gmp) is

    smallInteger :: Int# -> Integer
    smallInteger i = S# i
    

    and word2Int# is a primop with type Word# -> Int#, an analog of reinterpret_cast<int> in C++. So that explains why you see -1 in the first example: the coerced value still carries the full-width bit pattern of -1, which is simply reinterpreted as a signed integer and printed.

    Now, why would adding 0 to x give you 255? Looking at the Num instance for Word8 we see this:

    (W8# x#) + (W8# y#)    = W8# (narrow8Word# (x# `plusWord#` y#))
    

    So it looks like the narrow8Word# primop is the culprit. Let's check:

    > import GHC.Word
    > import GHC.Prim
    > case x of (W8# w) -> (W8# (narrow8Word# w))
    255
    

    Indeed it is. That explains why adding 0 is not a no-op: Word8 addition narrows its result to the intended range by keeping only the low 8 bits.
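The whole effect can be reproduced without unsafeCoerce or primops by a small pure model. This is only a sketch under the assumption, taken from the explanation above, that a Word8 is stored in a full machine word; W8Model, showModel, narrow8, addModel and coerced are hypothetical names, and the real representation may differ between GHC versions.

```haskell
import Data.Bits ((.&.))

-- Hypothetical model: a "Word8" whose payload is a full machine word.
newtype W8Model = W8Model Word

-- Models the Show path: the word is reinterpreted as a signed Int
-- (the role word2Int# plays above) and then printed.
showModel :: W8Model -> String
showModel (W8Model w) = show (fromIntegral w :: Int)

-- Models narrow8Word#: keep only the low 8 bits.
narrow8 :: Word -> Word
narrow8 w = w .&. 0xFF

-- Models Word8 addition: add the full words, then narrow the result.
addModel :: W8Model -> W8Model -> W8Model
addModel (W8Model a) (W8Model b) = W8Model (narrow8 (a + b))

-- What coercing (-1 :: Int8) leaves behind in this model: all bits set.
coerced :: W8Model
coerced = W8Model maxBound
```

Here showModel coerced yields "-1", while showModel (addModel coerced (W8Model 0)) yields "255", mirroring the show x and show (x + 0) results in the question.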