Search code examples
phpsuffix-tree

Can someone explain when and how to extend a suffix tree?


I'm working on a php script which has to find the longest repeated substring. I found this Suffix-Tree thing. I'm trying to implement Ukkonnen's algorithm, but I can't get when and how to extend the tree.

It's okay if i have new charachter which is not in the tree yet I have to create a new node and egde from the root for it. But how should I know if I have to split up an edge?

I found a C++ implementation of it (link) and I tried to translate it to php but I think I've got a typeo in it because it gives an almost good result, the problem is I can't fix it until I do not understand it fully...

I read a dozen of descriptions of Suffix-Trees but some of them do not go too deep in it, others give me headache after the second sentece.

Here is the code what I have now: Suffix-tree.php (Sorry but this editor couldn't take it) I used this site to check the result.

So any advise would be appreciated...

EDIT: I rewrote it from the JavaScript stuff found on the mentioned site. Here is the link to the source: Suffix-Tree v0.1


Solution

  • A good explanation is given by Matt Mahoney, a data compression expert. But I didn't understand the implementation too, it's quite difficult. FYI I've managed to run a suffix-tree php extension. You can find my code at sourceforge if it helps. I would love to see your final code though!