I am trying to make a code for this problem:
(Source: https://www.codewars.com/kata/insane-coloured-triangles/train/c)
A coloured triangle is created from a row of colours, each of which is red, green or blue. Successive rows, each containing one fewer colour than the last, are generated by considering the two touching colours in the previous row. If these colours are identical, the same colour is used in the new row. If they are different, the missing colour is used in the new row. This is continued until the final row, with only a single colour, is generated.
For example, different possibilities are:
Colour here: G G B G R G B R Becomes colour here: G R B G With a bigger example: R R G B R G B B R B R G B R B G G B R G G G R G B G B B R R B G R R B G
You will be given the first row of the triangle as a string and its your job to return the final colour which would appear in the bottom row as a string. In the case of the example above, you would be given
'RRGBRGBB'
, and you should return'G'
.Constraints:
1 <= length(row) <= 10 ** 5
The input string will only contain the uppercase letters '
B', 'G' or 'R'
.The exact number of test cases will be as follows:
100 tests of 100 <= length(row) <= 1000 100 tests of 1000 <= length(row) <= 10000 100 tests of 10000 <= length(row) <= 100000
Examples:
triangle('B') == 'B' triangle('GB') == 'R' triangle('RRR') == 'R' triangle('RGBG') == 'B' triangle('RBRGBRB') == 'G' triangle('RBRGBRBGGRRRBGBBBGG') == 'G'
So I have made this code and it works for all the sample tastes but when it comes for length of row > 25
it fails due my factorial function,and the length in some cases is up to 100,000
, so any suggestion to fix this problem at least any mathematic formula that can solve the problem or a small hint.
by the way I have made this code after I found a mathematic way to solve the problem in this site :
https://math.stackexchange.com/questions/2257158/three-color-triangle-challenge
long long factorial(long long num)
{
long long fact = num;
if (fact == 0)
fact = 1;
while (num > 1)
{
num--;
fact *= num;
}
return (fact);
}
long long combination(long long n, long long k, long long fact_n)
{
long long fact_k = factorial(k);
long long comb = ( fact_n / (fact_k * factorial(n - k)) );
return (comb);
}
char triangle(const char *row)
{
long long len = strlen(row);
long long n = len - 1;
long long k = 0;
int sign = pow(-1,n);
long long sum = 0;
char *result = "RGB";
int a;
long long fact_n = factorial(n);
while (k < len) //This while calculate Segma (∑).
{
if (row[k] == 'R')
a = 0;
else if (row[k] == 'G')
a = 1;
else if (row[k] == 'B')
a = 2;
if (a != 0)
sum = sum + (combination(n,k,fact_n) * a);
k++;
}
sum = sign * (sum % 3);
if (sum == -1 || sum == -2)
sum += 3;
return (result[sum]);
}
I will assume that the formula in the link you provided is correct:
In order to avoid integer overflow, we will need to apply these modulo arithmetic rules:
(a * b) mod c = ((a mod c) * (b mod c)) mod c
(a ± b) mod c = ((a mod c) ± (b mod c)) mod c
Applying them to the formula:
But how do we calculate
... without directly calculating the coefficient itself (which can cause overflow)?
Since 3 is a prime number, this can be accomplished with Lucas's theorem:
... where n_i, m_i
are the i
-th digits of n, m
in base-3.
Conversion to base-3 is easy with integer division:
// convert a number to base 3
// and returns the number of digits
unsigned conv_base_3(unsigned n, unsigned max, unsigned* out)
{
unsigned i = 0;
while (i < max && n > 0)
{
out[i] = n % 3;
n /= 3;
i++;
}
return i;
}
Note that since n_i, m_i
are always in the range [0, 2]
(because they are base-3 digits), C(n_i, m_i)
are very easy to calculate:
// calculate the binomial coefficient for n < 3
unsigned binom_max_2(unsigned n, unsigned k)
{
if (n < k)
return 0;
switch (n)
{
case 0:
case 1:
return 1;
case 2:
return 1 + (k == 1);
// shouldn't happen
default:
return 0;
}
}
And now the theorem itself:
// Lucas's theorem for p = 3
unsigned lucas_3(unsigned len_n, const unsigned * dig_n,
unsigned len_k, const unsigned * dig_k)
{
// use modulo product rule:
// prod[i] % 3 = ((prod[i - 1] % 3) * value[i])
unsigned prod = 1;
for (unsigned i = 0; i < len_n; i++) {
unsigned n_i = dig_n[i];
unsigned k_i = (i < len_k) ? dig_k[i] : 0;
prod = (prod * binom_max_2(n_i, k_i)) % 3;
}
return prod % 3;
}
Character conversion:
// convert from 012 to RGB
char int_2_char(int i)
{
switch (i) {
case 0: return 'R';
case 1: return 'G';
case 2: return 'B';
// shouldn't happen
default:
return '\0';
}
}
// convert from RGB to 012
unsigned char_2_int(char c)
{
switch (c) {
case 'R': return 0;
case 'G': return 1;
case 'B': return 2;
// shouldn't happen
default:
return 3;
}
}
Finally, the triangle algorithm:
// the problem constraints state that n <= 10 ** 5
// max number of base-3 digits
#define MAX_N_LOG_3 11
// main algorithm function
char triangle(const char * input)
{
unsigned sum = 0;
const int n = strlen(input);
// calculate digits of n - 1
unsigned dig_n[MAX_N_LOG_3];
unsigned len_n = conv_base_3(n - 1, MAX_N_LOG_3, dig_n);
for (unsigned km1 = 0; km1 < n; km1++)
{
// calculate digits of k - 1
unsigned dig_k[MAX_N_LOG_3];
unsigned len_k = conv_base_3(km1, MAX_N_LOG_3, dig_k);
// calculate C(n - 1, k - 1) mod 3
unsigned Cnk_mod3 = lucas_3(len_n, dig_n, len_k, dig_k);
// add using the modulo rule
sum = (sum + Cnk_mod3 * char_2_int(input[km1])) % 3;
}
// value of (-1) ** (n - 1)
// no need for pow; just need to know if n is odd or even
int sign = (n % 2) * 2 - 1;
// for negative numbers, must resolve the difference
// between C's % operator and mathematical mod
int sum_mod3 = (3 + (sign * (int)(sum % 3))) % 3;
return int_2_char(sum_mod3);
}
The above code passes all tests. Note that it was written in favor of clarity, not performance.
So why was this code able to pass all tests in the allocated time, whereas the simple table-based approach wasn't? Because of its time complexity:
The table-based approach processes all levels of the triangle, which is O(n^2)
(see Triangle Numbers).
Of course, using Lucas's algorithm, only the top level has to be processed; however the algorithm itself is O(log n)
, because it loops through every digit of n
(regardless of the base). The overall complexity is O(n log n)
, which still represents a significant improvement.