I realized a recursive function to solve this problem:
My function's idea is to do a recursive call for every letter to every possible direction and then concatenate the string that it is going to be formed. When the lenght of this string will be n, it stops. To prevent the returning back to a letter already used, I created and array of pointers to letter's class which compares every new possible letter with all the letters used. For doing this I created a special class for every letter.
This is the function: (sorry, it's very long)
dirSu means Up - dirGiu Down - dirDx right - dirSx left - and all the Others the diagonal positions
void decisione(lettera *start, lettera *next, int count, string seq, vector <lettera*> visitate, int quanti) {
if (next == nullptr) return ;
if (count == quanti) {
bool test = false;
for (int k = 0; k < start->sequenze.size(); k++) { //To prevent same letter to be used
if (start->sequenze[k] == seq) {
test = true;
break;
}
}
if (!test) start->sequenze.push_back(seq);
return ;
}
seq = seq + next->chiave; //String concatenating every call
count++; //Counter to decide when stop
visitate.push_back(next); //Array of used letters
int i;
bool test;
//Decide direction
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirSu) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirSu, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirSu) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirSu, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirGiu) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirGiu, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirSx) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirSx, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirDx) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirDx, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirAdx) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirAdx, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirAsx) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirAsx, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirBdx) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirBdx, count, seq, visitate, quanti);
test = false;
for(i = 0; i < visitate.size(); i++) {
if (visitate[i] == next->dirBsx) {
test = true;
break;
}
}
if (!test) decisione(start, next->dirBsx, count, seq, visitate, quanti);
}
Well, when I activate the function for words of 3 letters, it is really fast, 4 letters: 0.028s of calculation time per letter in the matrix, 5 letters: 0.225s, 6 letters: 1.891s, 7 letters: 14.914s.
Is there a way to reduce calculation time by optimizing my function? (or removing my stupid logical errors?) Thanks a lot!
The Matrix I used for my tests is:
C A S A
C O S A
B A C I
O T A T
I used this for every speed test (without calculating it by frequency or randomly) and with this (with my recursive function improved since my question) I get in about 90 seconds all the words with 5, 6, 7 letters for every starting letter. There are a lot of words, although it is only a 4x4 Matrix! If I can send it to you privately it could be interesting to see also your solution to the same problem. (I don't know how to contact you here)
Change your algorithm from recursive to iterative (with a loop). It's usually faster. After that, the other optimizations are related to the algorithm and the data structures, in order to reduce the complexity. It's much more difficult to find such optimizations (but is possible).
EDIT: some thoughts on the code, followed by a version of the word-search I've done today:
So, here is the code. It's a standalone, you just have to compile it, execute with the english (or italian or french) words file (text, one word per line):
#include <string>
#include <iostream>
#include <stdlib.h>
#include <vector>
#include <time.h>
#include <map>
#include <algorithm>
// **************************************************** //
// //
// Data structure for the words : dictionnary //
// //
// **************************************************** //
struct DicNode
{
const char l;
bool isEndOfWord;
DicNode* children[26]; // a - z, case insensitive
DicNode* parent;
unsigned int min, max, level;
DicNode(char c, DicNode* p = 0, bool end = false) : l(c),isEndOfWord(end),parent(p),min((unsigned int)(-1)),max(0), level(parent ? parent->level + 1 : 0)
{
for(unsigned int t = 0; t < 26; t++)
children[t] = 0;
}
~DicNode()
{
for(unsigned int t = 0; t < 26; t++)
if(children[t])
delete children[t];
}
DicNode*& at(char l)
{
if(l == 'à' || l == 'ä' || l == 'â')
l = 'a';
if(l == 'é' || l == 'è' || l == 'ê' || l == 'ë')
l = 'e';
if(l == 'ï' || l == 'î')
l = 'i';
if(l == 'ö' || l == 'ô')
l = 'o';
if(l == 'ü' || l == 'û' || l == 'ù')
l = 'u';
if(l == 'ç')
l = 'c';
if(l <= 'Z')
l = 'a' + l - 'A';
// Note: undefined behavior if l > 'Z' or l < 'A'.
return children[(unsigned int)(l - 'a')];
}
bool inRange(unsigned int n) const
{
return n <= max && n >= min;
}
void print(std::string str = "") const
{
// Here recursive algorithm is way better because of the object-oriented structure
if(isEndOfWord)
std::cout << (str + l) << "\n";
for(unsigned int t = 0; t < 26; t++)
if(children[t])
children[t]->print(str + l);
}
std::string toString() const
{
std::string str(level, '0');
const DicNode* node = this;
while(node && node->level > 0)
{
str[node->level - 1] = node->l;
node = node->parent;
}
return str;
}
};
void addWord(DicNode* root, const std::string& str)
{
if(!root || str == "") return;
DicNode* node = root;
unsigned int i = 0;
while(i != str.size())
{
if(str.size() - i > node->max) node->max = str.size() - i;
if(str.size() - i < node->min) node->min = str.size() - i;
char c = tolower(str[i]);
DicNode*& n = node->at(c);
if(!n) n = new DicNode(c,node);
node = n;
i++;
}
node->isEndOfWord = true;
}
// **************************************************** //
// //
// Data structures for the algorithm //
// //
// **************************************************** //
enum Direction
{
INVALID,
NORTH,
NORTH_EAST,
EAST,
SOUTH_EAST,
SOUTH,
SOUTH_WEST,
WEST,
NORTH_WEST,
FINISHED
};
Direction incDir(Direction d)
{
switch(d)
{
case NORTH: return NORTH_EAST;
case NORTH_EAST: return EAST;
case EAST: return SOUTH_EAST;
case SOUTH_EAST: return SOUTH;
case SOUTH: return SOUTH_WEST;
case SOUTH_WEST: return WEST;
case WEST: return NORTH_WEST;
case NORTH_WEST: return NORTH;
default: return INVALID;
}
}
Direction operator-(Direction d)
{
switch(d)
{
case NORTH: return SOUTH;
case NORTH_EAST: return SOUTH_WEST;
case EAST: return WEST;
case SOUTH_EAST: return NORTH_WEST;
case SOUTH: return NORTH;
case SOUTH_WEST: return NORTH_EAST;
case WEST: return EAST;
case NORTH_WEST: return SOUTH_EAST;
default: return INVALID;
}
}
std::string toString(Direction d)
{
switch(d)
{
case NORTH: return "NORTH";
case NORTH_EAST: return "NORTH_EAST";
case EAST: return "EAST";
case SOUTH_EAST: return "SOUTH_EAST";
case SOUTH: return "SOUTH";
case SOUTH_WEST: return "SOUTH_WEST";
case WEST: return "WEST";
case NORTH_WEST: return "NORTH_WEST";
default: return "INVALID";
}
}
struct Cell
{
char l;
DicNode* node;
Direction dir, dirParent;
Cell(char l_ = 'A') : l(l_),node(0),dir(INVALID),dirParent(INVALID) {}
};
struct ProbLetter
{
char letter;
float proba;
};
class Map
{
public:
Map(unsigned int w, unsigned int h) : width(w), height(h)
{
data = new Cell[width * height];
for(unsigned int t = 0; t < width * height; t++)
data[t] = randomEnglishLetter();
}
static char randomEnglishLetter()
{
// Frequency of english letters
static ProbLetter probas[26] =
{
{ 'Z', 0.074f },
{ 'Q', 0.095f },
{ 'X', 0.150f },
{ 'J', 0.153f },
{ 'K', 0.772f },
{ 'V', 0.978f },
{ 'B', 1.492f },
{ 'P', 1.929f },
{ 'Y', 1.974f },
{ 'G', 2.015f },
{ 'F', 2.228f },
{ 'W', 2.360f },
{ 'M', 2.406f },
{ 'U', 2.758f },
{ 'C', 2.782f },
{ 'L', 4.025f },
{ 'D', 4.253f },
{ 'R', 5.987f },
{ 'H', 6.094f },
{ 'S', 6.327f },
{ 'N', 6.749f },
{ 'I', 6.966f },
{ 'O', 7.507f },
{ 'A', 8.167f },
{ 'T', 9.056f },
{ 'E', 12.702f }
};
// Random number between 0 and 1
float p = 100.0f * float(rand() % 10000) / 9999.0f;
float sum = 0.0f;
for(unsigned int t = 0; t < 26; t++)
{
sum += probas[t].proba;
if(p < sum) return probas[t].letter;
}
return probas[25].letter;
}
Cell& operator()(unsigned int x, unsigned int y)
{
return data[x + y * width];
}
bool inRange(int x, int y)
{
return x >= 0 && x < (int)width && y >= 0 && y < (int)height;
}
void print()
{
for(unsigned int y = 0; y < height; y++)
{
for(unsigned int x = 0; x < width; x++)
std::cout << " " << data[x + y * width].l;
std::cout << "\n";
}
}
unsigned int getWidth() const { return width; }
unsigned int getHeight() const { return height; }
private:
unsigned int width, height;
Cell* data;
};
// **************************************************** //
// //
// Word-search algorithm //
// //
// **************************************************** //
inline void advance(int& x, int& y, Direction d)
{
switch(d)
{
case NORTH:
y--;
return;
case NORTH_EAST:
x++;
y--;
return;
case EAST:
x++;
return;
case SOUTH_EAST:
x++;
y++;
return;
case SOUTH:
y++;
return;
case SOUTH_WEST:
x--;
y++;
return;
case WEST:
x--;
return;
case NORTH_WEST:
x--;
y--;
return;
default:
return;
}
}
struct Data
{
Map& map;
int x;
int y;
};
bool descend(Data& dat, unsigned int n)
{
DicNode* parent = dat.map(dat.x, dat.y).node;
if(n == parent->level) return false;
Direction dir = dat.map(dat.x, dat.y).dir;
if(dir == FINISHED) return false;
else if(dir == INVALID) dir = NORTH;
int pX = dat.x, pY = dat.y;
bool firstIteration = true;
for(; firstIteration || dir != NORTH; dir = incDir(dir))
{
firstIteration = false;
pX = dat.x;
pY = dat.y;
advance(pX, pY, dir);
if(dat.map.inRange(pX, pY) // (pX, pY) is not outside the map
&& dat.map(pX, pY).node == 0 // The cell is not already used
&& parent->at(dat.map(pX, pY).l) != 0) // An entry in the dictionary exists
{
// We found the next node
dat.map(dat.x, dat.y).dir = dir;
dat.map(pX, pY).node = parent->at(dat.map(pX, pY).l);
dat.map(pX, pY).dirParent = -dir;
dat.x = pX;
dat.y = pY;
return true;
}
}
return false;
}
bool ascend(Data& dat)
{
// Go back on the previous node
Direction dp = dat.map(dat.x, dat.y).dirParent;
if(dp == INVALID) return false;
dat.map(dat.x, dat.y).dir = INVALID;
dat.map(dat.x, dat.y).dirParent = INVALID;
dat.map(dat.x, dat.y).node = 0;
advance(dat.x, dat.y, dp);
dat.map(dat.x, dat.y).dir = incDir(dat.map(dat.x, dat.y).dir);
if(dat.map(dat.x, dat.y).dir == NORTH)
dat.map(dat.x, dat.y).dir = FINISHED;
return true;
}
void findWordsFromPosition(Map& map, DicNode* parent, unsigned int x, unsigned int y, unsigned int n, std::vector<std::string>& output)
{
if(!parent) return;
bool ok = true;
// Setup first node
map(x, y).node = parent;
map(x, y).dir = NORTH;
// while we can find next node with direction
// If marked node has right order and is end of word, or if condition on n is not verified:
// add it to the output (if condition on n is verified)
// no need to go further (because of n), so advance direction of parent cell, reset current cell, and go up for the current position
Data dat = { map, x, y };
while(ok)
{
if(map(dat.x,dat.y).node->toString() == "ane")
std::cout << "ok\n";
if(!descend(dat, n))
ok = ascend(dat);
else
{
DicNode* node = dat.map(dat.x, dat.y).node;
if(node->level == n && node->isEndOfWord)
{
std::string str = node->toString();
if(std::find(output.begin(), output.end(), str) == output.end())
output.push_back(str);
ok = ascend(dat);
}
else if(!node->inRange(n - node->level))
ok = ascend(dat);
}
}
// Clean first node
map(x, y).dir = INVALID;
map(x, y).dirParent = INVALID;
map(x, y).node = 0;
}
void getWords(Map& map, DicNode& dic, unsigned int n, std::vector<std::string>& output)
{
if(n > dic.max || n < dic.min) return;
// Search words for every position
for(unsigned int y = 0; y < map.getHeight(); y++)
for(unsigned int x = 0; x < map.getWidth(); x++)
findWordsFromPosition(map,dic.at(map(x,y).l),x,y,n,output);
}
#include <fstream>
int main()
{
srand((unsigned int)time(0));
// Prepare the words, for example, load them from a real dictionary
DicNode dictionary(0);
unsigned int maxSize = 0;
// Note: the following code make sense only if you load words from a file
{
std::ifstream dico("english.txt");
std::string str;
int count = 0;
while(dico >> str)
{
if(str.size() > maxSize) maxSize = str.size();
addWord(&dictionary,str);
count++;
}
std::cout << "Read " << count << " words from dictionary, longest with " << maxSize << " letters.\n";
}
// Prepare the matrix
Map mat(4,4);
/*
mat(0,0) = 'A';
mat(1,0) = 'N';
mat(2,0) = 'F';
mat(3,0) = 'N';
mat(0,1) = 'S';
mat(1,1) = 'L';
mat(2,1) = 'E';
mat(3,1) = 'R';
mat(0,2) = 'G';
mat(1,2) = 'O';
mat(2,2) = 'R';
mat(3,2) = 'R';
mat(0,3) = 'S';
mat(1,3) = 'I';
mat(2,3) = 'U';
mat(3,3) = 'I';
*/
std::cout << "\nMatrix:\n";
mat.print();
// Search words
std::vector<std::string> words;
for(unsigned int t = 5; t <= 7; t++)
getWords(mat,dictionary,t,words);
std::cout << "\nWords:\n";
if(words.size() == 0)
std::cout << " (no words)\n";
else
for(unsigned int t = 0; t < words.size(); t++)
std::cout << " " << words[t] << "\n";
}
Note: the previous code was a code-search resolver. It now removed from the answer, but you can still get it by searching in the edit revisions of this answer.
EDIT2: Some verbal explanations
The dictionary. It's a kind of a radix-tree, but each of the node have only one letter to store. It can have up to 26 children (more if you include accents and/or digits), one for each letter of the considered alphabet. The basic layout is the following:
Word tree http://img571.imageshack.us/img571/2281/wtree.png
Example:
An example of a dictionary http://img706.imageshack.us/img706/1244/wordsr.png
Each node also contains a boolean which indicates whether it's the end of a word. Additionnally, each node store the minimum and maximum size of its children branchs, its parent, and its level from the root (= size of the prefix). It allows to stop the search of words when we see that no children branch can give a word with a specific length.
The matrix. The matrix is a double array of letters. But, for more efficiency, I added some data: its corresponding node in the dictionary (see algorithm), the direction to its children (see algorithm) and to its parent (see algorithm).
The algorithm. The idea is to 'circle' in the map, by 'descending' (increasing the path representing the current found string). When descending is not possible, we have to 'ascend':
FIND WORD:
----------
while(we can continue)
try descend
if(failed to descend)
ascend
else
node = node of current cell
if node is end of word
add it to the found words, then ascend
else if node is out of range (ie, n = 5, but with node you can only have words with 3 characters)
ascend
DESCEND:
--------
we're from cell (x,y), with node D
for each direction (from cell.direction to north-west, clockwise)
advance position (x,y) with direction
if the position is valid
and we haven't already visited the cell for this word
and the node got with D.children[cell.letter] exists
set the current position of the algorithm to the result of the advance
set the direction of the old cell to the direction used to advance
set the direction of the new cell to the opposite (pointing to the old cell)
set the node of the new cell by the found one
return
ASCEND:
-------
we're at (x,y)
cell.node = 0 (the cell is not considered as visited anymore)
advance the parent direction by one clockwise (if this direction is north-west, we use a special direction that say that we cannot rotate anymore: FINISHED)
cell.dir = INVALID
advance (x,y) following cell.direction_to_parent
cell.direction_to_parent = INVALID
set the current position to the result of advance
An image explanation:
Algorithm unrolled http://img822.imageshack.us/img822/1374/algow.png
Steps (number indicated in the losanges):