Search code examples
c++boggle

Optimal way to introduce data in a Trie? (C++)


We are working on a program to solve a Boggle game, and the whole program must be executed in less than 0.1s. We insert a dictionary into our code by standard input (which lasts 0.05s, the half of our maximum time). We use the following functions to add the words into the dictionary, which last 0.1 seconds each:

void Node::addWord(const char* word){
  const char idx = *word - 'a';
  if(!mChildren[idx]) mChildren[idx] = new Node(*word);
  if(strlen(word) > 1) mChildren[idx]->addWord(word + 1);
  else mChildren[idx]->setMarker(true);
}

void Trie::addDictionary(const vector<string>* aux){
  auto length = aux->size();
  Node* current = root;
  for (size_t i = 0; i < length; i++) {
    int len = aux->at(i).length();
    if (len >= 3 && len <= DIM*DIM + 1) {
        current->addWord(aux->at(i).c_str());
    }
  }
}

In this case, DIM = 4 (is the dimension of the NxN board of the Boggle) and aux is a vector where all the data from the standard input is dumped. We impose the condition of len >= 3 because we only want words with 3 letters and more.

Any ideas to improve the speed of these functions?


Solution

  • The best way to improve the speed of these function is to run a profiler on them. I would not consider optimizing code like this until you run a profiler against it.

    That being said, my prediction will be that, if you run the profiler, you will find that a large portion of your runtime (like 90%) will be in new Node(*word). Heap allocations are slow compared to the other operations you are doing. How slow? It depends on your platform, which is why you have to profile to find hotspots. What consumes large amounts of time on one platform may consume very little time on others.

    Run a profiler, determine if my claim is correct. If it is correct, then I recommend pool allocating Nodes to decrease the number of allocations that have to occur.