I am trying two multiply to matrices in C and I cant understand why I get these results...
I want to do : Btranspose * B
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <math.h>
#define LOW_WORD(x) (((x) << 16) >> 16)
#define HIGH_WORD(x) ((x) >> 16)
#define ABS(x) (((x) >= 0) ? (x) : -(x))
#define SIGN(x) (((x) >= 0) ? 1 : -1)
#define UNSIGNED_MULT(a, b) \
(((LOW_WORD(a) * LOW_WORD(b)) << 0) + \
(((int64_t)((LOW_WORD((a)) * HIGH_WORD((b))) + (HIGH_WORD((a)) * LOW_WORD((b))))) << 16) + \
((int64_t)(HIGH_WORD((a)) * HIGH_WORD((b))) << 32))
#define MULT(a, b) (UNSIGNED_MULT(ABS((a)), ABS((b))) * SIGN((a)) * SIGN((b)))
int main()
{
int c,d,k;
int64_t multmatrix[3][3];
int64_t sum64 = 0;
int32_t Btranspose[3][3] = {{15643, 24466, 58751},
{54056, 26823, -25563},
{-33591, 54561, -13777}};
int32_t B[3][3] = {{15643, 54056, -33591},
{24466, 26823, 54561},
{58751, -25563, -13777}};
for ( c = 0 ; c < 3 ; c++ ){
for ( d = 0 ; d < 3 ; d++ ){
for ( k = 0 ; k < 3 ; k++ ){
sum64 = sum64 + MULT(Btranspose[c][k], B[k][d]);
printf("\n the MULT for k = %d is: %ld \n", k, MULT(Btranspose[c][k], B[k][d]));
printf("\n the sum for k = %d is: %ld \n", k, sum64);
}
multmatrix[c][d] = sum64;
sum64 = 0;
}
}
printf("\n\n multmatrix \n");
for( c = 0 ; c < 3; c++ ){
printf("\n");
for( d = 0 ; d < 3 ; d++ ){
printf(" %ld ", multmatrix[c][d]);
}
}
return 0;
}
My output is below put that is wrong and I notice that the mistake is when is multiplying the 3rd element (58751 * 58751) for k=2. I think is not overflowing because 58751^2 needs 32bits.
the MULT for k = 0 is: 244703449 the sum for k = 0 is: 244703449 the MULT for k = 1 is: 598585156 the sum for k = 1 is: 843288605 the MULT for k = 2 is: 46036225 // this is WRONG!!! the sum for k = 2 is: 889324830 . . . . the MULT for k = 2 is: 189805729 the sum for k = 2 is: 1330739379 multmatrix 889324830 650114833 324678230 650114833 1504730698 -308929574 324678230 -308929574 1330739379
Correct result should be
multmatrix - correct
4.2950e+09 -2.2870e+03 1.2886e+04
-2.2870e+03 4.2950e+09 -1.2394e+05
1.2886e+04 -1.2394e+05 4.2951e+09
Why is the multiplication of the matrix wrong?? What should I change the above code so that the multiplication of two matrices will be overflow-proof??
(I am trying write a program that multiplies two 32 bits numbers to be imported on a system that has only 32 bit registers)
So according to the answer below this actually works.
#define LOW_WORD(x) ((uint32_t)(x) & 0xffff)
#define HIGH_WORD(x) ((uint32_t)(x) >> 16)
#define ABS(x) (((x) >= 0) ? (x) : -(x))
#define SIGN(x) (((x) >= 0) ? 1 : -1)
#define UNSIGNED_MULT(a, b) \
(((LOW_WORD(a) * LOW_WORD(b)) << 0) + \
((int64_t)(LOW_WORD(a) * HIGH_WORD(b) + HIGH_WORD(a) * LOW_WORD(b)) << 16) + \
((int64_t)(HIGH_WORD((a)) * HIGH_WORD((b))) << 32))
#define MULT(a, b) (UNSIGNED_MULT(ABS((a)), ABS((b))) * SIGN((a)) * SIGN((b)))
Thank you for helping me understand some things! I'll try turning the whole thing to functions and posting it back.
This
(((x) << 16) >> 16)
doesn't produce unsigned 16-bit number, as you might expect. The type of this expression is the same as the type of x
, which is int32_t
(signed integer). Indeed, if using any sensible (two's complement) C implementation, for x=58751
:
x = 00000000000000001110010101111111
(x) << 16 = 11100101011111110000000000000000 (negative number)
(((x) << 16) >> 16) = 11111111111111111110010101111111 (negative number)
To extract the low 16 bits properly, use unsigned arithmetic:
((uint32_t)(x) & 0xffff)
or (preserving your style)
((uint32_t)(x) << 16 >> 16)
To get the high word, you have to use unsigned arithmetic too:
((uint32_t)(x) >> 16)
Also, the compiler might need help determining the range of this expression (to do optimizations):
(uint16_t)((uint32_t)(x) & 0xffff)
Some (all?) compilers are smart enough to do that by themselves though.
Also, as noted by doynax, the product of low word and high word is a 32-bit number (or 31-bit, but it doesn't matter). To shift it left by 16 bits, you have to cast it to a 64-bit type, just like you do it with the high words:
((int64_t)(LOW_WORD(a) * HIGH_WORD(b) + HIGH_WORD(a) * LOW_WORD(b)) << 16)