I'm looking to write an efficient n-order Markov chain method to generate random text strings given a set of example text. I currently have a Java implementation that uses several layers of Maps, but it's clunky. A suffix array would be perfect for my needs, but I'm not clear if that can be efficiently implemented in Java.
In C I might do something like:
char exampleText[MAX];
char *suffixArray[MAX];
...
while(n<MAX && suffixArray[n++] = &exampleText[n]);
sort(suffixArray);
This gets gnarly in Java since I'd have to take substrings of exampleText
, or turn suffixArray
into an array of indices, or something else.
Any suggestions for a good approach to this in Java?
String
will [typically] do that for you. (Typical implementations share backing arrays when created with substring
, although that is subject to change at any time.)