I am writing a program that, given two input matrices (int8_t and float respectively), computes their product.
For memory reasons, I do not want the entire int8 matrix to be converted to a floating type (or any type that occupies more than 8 bits in memory). I believe that in C, when multiplying an int by a float, an implicit conversion from int to float is performed before the operation.
My question is: if I cast my int8 input to float and then do the computation, what actually happens in memory? Does the cast overwrite some other, unused memory, or does it take additional space, as if I had created a float array and copied my data into it?
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < m; ++j) {
            float sum = 0;
            for (int l = 0; l < k; ++l) {
                sum += (float) input_a[i*k + l] * input_b[l*m + j];
            }
            output[i*m + j] = sum;
        }
    }
    sum += (float) input_a[i*k + l] * input_b[l*m + j];

specifies a computation to be performed. It overtly says to fetch element i*k + l of input_a, convert it to float, fetch element l*m + j of input_b, multiply these (including an implicit conversion of the second operand to float, if it is not one already), add the product to sum, and store the result in sum.
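As a rough illustration of the steps just listed, the statement can be unrolled into one operation per line. This is only a sketch: the helper name accumulate_step and the local names a_raw, a_conv, b_val and prod are hypothetical and do not appear in the question's code. Each local is a single scalar that a compiler is free to keep in a register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one inner-loop step of the question's code, with
 * each implied operation written as its own statement. */
float accumulate_step(float sum, const int8_t *input_a,
                      const float *input_b, size_t ia, size_t ib)
{
    assert(input_a != NULL && input_b != NULL);

    int8_t a_raw  = input_a[ia];     /* fetch one 1-byte element            */
    float  a_conv = (float) a_raw;   /* convert the value, not the array    */
    float  b_val  = input_b[ib];     /* fetch one float element             */
    float  prod   = a_conv * b_val;  /* multiply in float                   */

    return sum + prod;               /* the caller stores this back in sum  */
}
```

Calling sum = accumulate_step(sum, input_a, input_b, i*k + l, l*m + j); inside the innermost loop would compute the same thing as the original statement. The only objects involved besides the two input arrays are a handful of scalars.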
Nothing in this says to store anything into any memory other than sum. The C standard allows a compiler to implement this computation in any way that does not alter the observed behavior of the program, which consists of its output, its input/output interactions, and accesses to volatile objects. With most compilers and most processors, the compiler will generate code to perform this operation entirely in processor registers:
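- The element of input_a is converted to float in processor registers.
- The result is either stored to sum in memory, or compiler optimization keeps sum in a processor register until the final output[i*m + j] = sum; is performed.

In somewhat unusual, yet not extraordinary, circumstances, the compiler may use additional memory, for example a few bytes of stack for a temporary value when registers run short.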
In no ordinary C implementation would the compiler generate an entire array of float elements to hold the various values that this code converts to float.
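One way to convince yourself of this is to wrap the loop in a function and check that the int8_t array is byte-for-byte identical before and after the call. The sketch below assumes row-major storage, as the indexing in the question implies. The names matmul_i8_f32 and input_unchanged_after_matmul are hypothetical, and the loop counters use size_t instead of the question's int.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Multiply an n-by-k int8_t matrix by a k-by-m float matrix (both
 * row-major), writing the n-by-m float result to output.
 * The function allocates nothing; its only locals are scalars. */
void matmul_i8_f32(const int8_t *input_a, const float *input_b,
                   float *output, size_t n, size_t k, size_t m)
{
    assert(input_a != NULL && input_b != NULL && output != NULL);

    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < m; ++j) {
            float sum = 0.0f;
            for (size_t l = 0; l < k; ++l) {
                /* The cast yields a temporary float value;
                 * input_a keeps its 1-byte elements. */
                sum += (float) input_a[i*k + l] * input_b[l*m + j];
            }
            output[i*m + j] = sum;
        }
    }
}

/* Hypothetical check: returns 1 if the int8_t input still occupies
 * 6 bytes and holds the same bytes after the multiplication. */
int input_unchanged_after_matmul(void)
{
    int8_t a[2 * 3] = { 1, -2, 3,   4, 5, -6 };
    float  b[3 * 2] = { 1.0f, 0.5f,   2.0f, 0.0f,   -1.0f, 4.0f };
    float  out[2 * 2];
    int8_t snapshot[sizeof a];

    memcpy(snapshot, a, sizeof a);
    matmul_i8_f32(a, b, out, 2, 3, 2);

    return sizeof a == 6 && memcmp(snapshot, a, sizeof a) == 0;
}
```

The float copy the question worries about would need n*k*sizeof(float) bytes and an explicit allocation, neither of which appears anywhere in this code.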