UPD: I moved the original question to https://codereview.stackexchange.com/questions/127055/building-tree-graph-from-dictionary-performance-issues
Here is a short version, without code.
I'm trying to build a prefix tree from a dictionary. So, using the following dictionary: 'and', 'anna', 'ape', 'apple', the graph should look like this:
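The intended shape can be sketched as a minimal node-based prefix tree. This is written in C++ rather than PHP purely for illustration, and the names `TrieNode`, `trie_insert`, `trie_render` and `trie_to_string` are hypothetical, not taken from the original code:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>

// One node per prefix; each edge is labelled with a single character.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool is_word = false;  // true if the path from the root spells a dictionary word
};

// Walks down from the root, creating missing nodes along the way.
void trie_insert(TrieNode& root, const std::string& word) {
    TrieNode* cur = &root;
    for (char c : word) {
        std::unique_ptr<TrieNode>& child = cur->children[c];
        if (!child) child.reset(new TrieNode());
        cur = child.get();
    }
    cur->is_word = true;
}

// Writes one edge per line, indented by depth; '*' marks the end of a word.
// For 'and', 'anna', 'ape', 'apple' this renders:
//   a
//     n
//       d*
//       n
//         a*
//     p
//       e*
//       p
//         l
//           e*
void trie_render(const TrieNode& node, int depth, std::ostream& out) {
    for (const auto& kv : node.children) {
        assert(kv.second != nullptr);  // trie_insert never leaves an empty slot
        out << std::string(2 * depth, ' ') << kv.first
            << (kv.second->is_word ? "*" : "") << '\n';
        trie_render(*kv.second, depth + 1, out);
    }
}

// Convenience wrapper that returns the rendering as a string.
std::string trie_to_string(const TrieNode& root) {
    std::ostringstream out;
    trie_render(root, 0, out);
    return out.str();
}
```

Shared prefixes ('a', 'an', 'ap') are stored once, which is the whole point of the structure. Calling `trie_insert` for each of the four words and then `trie_render(root, 0, std::cout)` (with `<iostream>` included) prints the tree shown in the comment.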
I've tried two approaches: using an associative array and using self-written tree/node classes.
Note: the original dictionary is about 8 MB and contains more than 600,000 words.
Question: is there a good (fast/efficient) way to do it?
What I've tried so far:
PHP associative arrays (they are not very flexible for future work with this graph).
Self-written Tree/Node classes (performance issues: execution time rises by up to 7x and memory usage by 2x, even with nothing implemented except the insert function).
Sample code is available on Code Review (the very first link in the question).
Since I've switched to C++ and got a good answer on Code Review, I'll just answer my own question here.
There is one more way to do it that is much more time-efficient, at the cost of increased memory usage (the increase is not really big compared to the "array of arrays of arrays..." approach). The approach is called a "double-array trie"; you can read about it here, and see the aforementioned answer on Code Review for an example implementation.
It's more time-efficient, yet it allows less flexibility/convenience for future use of the trie (compared to the OOP approach).
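To give a rough idea of what a double-array trie looks like, here is a minimal C++ sketch. It is not the implementation from the Code Review answer: the names (`BuildNode`, `build_insert`, `DoubleArrayTrie`) are hypothetical, word ends are tracked in a separate `terminal_` vector for simplicity, and the base search is a naive linear scan that would need a smarter free-slot strategy for a 600,000-word dictionary.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Ordinary pointer-based trie, used here only as input for the build step.
struct BuildNode {
    std::map<unsigned char, std::unique_ptr<BuildNode>> next;
    bool terminal = false;
};

void build_insert(BuildNode& root, const std::string& word) {
    BuildNode* cur = &root;
    for (unsigned char c : word) {
        std::unique_ptr<BuildNode>& child = cur->next[c];
        if (!child) child.reset(new BuildNode());
        cur = child.get();
    }
    cur->terminal = true;
}

const int kFree = -1;  // slot not used by any state
const int kRoot = -2;  // marker for the root slot (it has no parent)

class DoubleArrayTrie {
public:
    explicit DoubleArrayTrie(const BuildNode& root) {
        grow(1);
        check_[0] = kRoot;  // slot 0 is the root state
        place(root, 0);
    }

    // Each step is two array reads: no hashing, no pointer chasing.
    bool contains(const std::string& word) const {
        int s = 0;
        for (unsigned char c : word) {
            int t = base_[s] + c;  // candidate slot for the child of s on c
            if (t >= static_cast<int>(check_.size()) || check_[t] != s) return false;
            s = t;
        }
        return terminal_[s];
    }

private:
    std::vector<int> base_;       // base_[s] + c is the slot of s's child on c
    std::vector<int> check_;      // check_[t] == s confirms slot t belongs to parent s
    std::vector<bool> terminal_;  // terminal_[s] is true if state s ends a word

    void grow(std::size_t n) {
        if (n <= check_.size()) return;
        base_.resize(n, 0);
        check_.resize(n, kFree);
        terminal_.resize(n, false);
    }

    // True if every child of `node` would land on a free slot with base b.
    bool fits(const BuildNode& node, int b) const {
        for (const auto& kv : node.next) {
            std::size_t t = static_cast<std::size_t>(b) + kv.first;
            if (t < check_.size() && check_[t] != kFree) return false;
        }
        return true;
    }

    void place(const BuildNode& node, int s) {
        assert(s >= 0 && static_cast<std::size_t>(s) < check_.size());
        terminal_[s] = node.terminal;
        if (node.next.empty()) return;
        int b = 1;  // bases start at 1 so slot 0 stays reserved for the root
        while (!fits(node, b)) ++b;
        base_[s] = b;
        // Claim all child slots before recursing so descendants cannot take them.
        for (const auto& kv : node.next) {
            grow(static_cast<std::size_t>(b) + kv.first + 1);
            check_[b + kv.first] = s;
        }
        for (const auto& kv : node.next) place(*kv.second, b + kv.first);
    }
};
```

The trade-off mentioned above is visible here: lookups touch only flat arrays, but adding a word later may force existing states to be relocated, which is why this sketch builds the arrays once from a finished trie.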
So the final answer to this question, for me, is: "PHP is not the best tool for working with really big tries".