I would like to reduce the complexity of the following algorithm. It takes a word as input and counts the number of unique letters in it (the "entropy" of the word). My current solution uses three nested for loops, which gives a complexity of O(n^3). Since this code is part of a bigger project (we built a solver for the game Boggle), I am hoping to reduce the complexity of the algorithm in order to cut its execution time. Thanks in advance!
    int wordEntropy(string word)
    {
        int length = word.length();
        int uniquewords = length;
        string compare = word;
        char save[17];
        int cond=0;
        for (int ii=0; ii < length; ii++)
        {
            for (int jj=ii+1; jj < length; jj++)
            {
                for (int kk=0; kk<= ii; kk++)
                {
                    if (save[kk] == word[ii]) {cond++;}
                }
                if (word[ii] == word[jj])
                {
                    if (cond>0) {break;}
                    uniquewords--;
                }
            }
            save[ii] = word[ii];
            cond = 0;
        }
        return uniquewords;
    }
If this is really about performance, depending on the range of valid characters something like this may be faster:
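    #include <algorithm>  // std::count_if
    #include <cstddef>    // std::size_t
    #include <string>

    std::size_t wordEntropy( const std::string & word )
    {
        // One counter per possible byte value; a counter wraps to 0 after 256 occurrences.
        unsigned char seen[256] = { 0 };
        for( unsigned char c : word )
        {
            ++seen[ c ];
        }
        // Count the byte values that occurred at least once.
        return std::count_if( & seen[0], & seen[ 0 ] + 256,
                              []( unsigned char c ) { return c != 0; } );
    }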
Obviously, this is a little harder to maintain. On the other hand, it has a guaranteed complexity of O(n) in the length of the word (plus a fixed pass over the 256-entry table), and it makes no dynamic memory allocations.
An alternative version that has no problem when a character occurs more than 255 times (where an unsigned char counter would wrap around):
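    #include <algorithm>  // std::count_if
    #include <cstddef>    // std::size_t
    #include <string>

    std::size_t wordEntropy( const std::string & word )
    {
        // One flag per possible byte value; a flag cannot overflow.
        bool seen[256] = { false };
        for( unsigned char c : word )
        {
            seen[ c ] = true;
        }
        // Count the byte values that were flagged.
        return std::count_if( & seen[0], & seen[ 0 ] + 256,
                              []( bool t ) { return t; } );
    }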
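If the range of valid characters is known to be small, the table can shrink further. The following is only a sketch under an assumption that the question does not confirm: that Boggle words consist solely of the ASCII letters a-z (upper or lower case). The name `wordEntropyLetters` is made up for illustration; it keeps one bit per letter in a `std::bitset<26>` and ignores any other character.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical variant: counts distinct letters case-insensitively.
// Assumption: only ASCII 'a'-'z' / 'A'-'Z' matter; other bytes are ignored.
std::size_t wordEntropyLetters( const std::string & word )
{
    std::bitset<26> seen;  // one bit per letter, all bits start at 0
    for( unsigned char c : word )
    {
        // Fold upper case onto lower case.
        if( c >= 'A' && c <= 'Z' )
        {
            c = static_cast<unsigned char>( c - 'A' + 'a' );
        }
        // Mark the letter as seen; anything outside a-z is skipped.
        if( c >= 'a' && c <= 'z' )
        {
            seen.set( static_cast<std::size_t>( c - 'a' ) );
        }
    }
    return seen.count();  // number of bits set == number of distinct letters
}
```

For example, `wordEntropyLetters( "hello" )` returns 4. It is still O(n) with no dynamic allocation. Whether it is actually faster than the 256-entry table would need to be measured.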