c# ruby indexing stemming

Index stemming to process text in C# or Ruby


Given this text:

"Friends are friendlier friendlies that are friendly and classify the friendly classification class. Flowery flowers flow through following the flower flows"

I need to apply stemming to the text to achieve the following outcome:

frequency("following")                = 1
frequency("flow")                     = 2
frequency("classification")           = 1
frequency("class")                    = 1
frequency("flower")                   = 3
frequency("friend")                   = 4
frequency("friendly")                 = 4
frequency("classes")                  = 1

We interface with the FAST search engine, which indexes content to provide relevant search results for a query. One aspect of indexing is stemming, and we need to use either C# or Ruby to solve this.

I would appreciate anyone's views on the best approach.


Solution

  •     public StemmingProcessorResults ProcessText(string text)
        {
                return new StemmingProcessorResults(
                        new []{
                            new StemmingProcessorResultItem("following", 1),
                            new StemmingProcessorResultItem("flow", 2),
                            new StemmingProcessorResultItem("classification", 1),
                            new StemmingProcessorResultItem("class", 1),
                            new StemmingProcessorResultItem("flower", 3),
                            new StemmingProcessorResultItem("friend", 4),
                            new StemmingProcessorResultItem("friendly", 4),
                            new StemmingProcessorResultItem("classes", 1)
                        }
                    );
        }
    

    There you go, that should be perfect for your copy-paste needs.
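
For something less hard-coded, here is a minimal Ruby sketch of the general idea: strip a few common suffixes, then count how often each resulting stem occurs. The suffix list and the names `naive_stem` and `stem_frequencies` are assumptions made up for illustration. This is not the Porter algorithm and not whatever FAST does internally, and it is not expected to reproduce the exact counts listed in the question, since those do not follow from any single obvious stemming rule.

```ruby
# Naive suffix stripping. The suffix list is an illustrative assumption,
# not Porter stemming and not FAST's own stemmer.
SUFFIXES = %w[ing ly es s].freeze

def naive_stem(word)
  w = word.downcase
  # Leave words such as "class" alone so the final "s" is not stripped.
  return w if w.end_with?("ss")
  # Take the first listed suffix that matches and leaves at least 3 characters.
  suffix = SUFFIXES.find { |s| w.end_with?(s) && w.length - s.length >= 3 }
  suffix ? w[0...-suffix.length] : w
end

def stem_frequencies(text)
  # Split on runs of ASCII letters, stem each word, and count the stems.
  text.scan(/[A-Za-z]+/).each_with_object(Hash.new(0)) do |word, counts|
    counts[naive_stem(word)] += 1
  end
end
```

Calling `stem_frequencies(text)["flow"]` then returns the number of words in `text` that reduce to the stem "flow" under these rules. For real indexing work, an established stemmer (Porter or Snowball, for example) would be a safer choice than a hand-rolled suffix list like this one.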