Tags: ios, iphone, objective-c, linguistics

Word Stemming in iOS - Not working for single word


I am using NSLinguisticTagger for word stemming. I can get the stems of the words in a sentence, but I cannot get the stem of a single word on its own.

This is the code I am using:

    NSString *stmnt = @"i waited";
    NSLinguisticTaggerOptions options = NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation | NSLinguisticTaggerJoinNames;

    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeLemma] options:options];
    tagger.string = stmnt;
    [tagger enumerateTagsInRange:NSMakeRange(0, [stmnt length]) scheme:NSLinguisticTagSchemeLemma options:options usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        NSString *token = [stmnt substringWithRange:tokenRange];
        NSLog(@"%@: %@", token, tag);
    }];

For this input I get the correct output:

i: i
waited: wait

But the same code fails to identify the stem when the input is a single word, i.e. stmnt = @"waited";

Any help is greatly appreciated.


Solution

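  • The following code worked for me. It passes an explicit orthography (English, Latin script) instead of relying on the tagger to work out the language: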

    NSString *stmt = @"waited";
    NSRange stringRange = NSMakeRange(0, stmt.length);
    // Explicitly declare the text as English written in Latin script.
    NSDictionary *languageMap = @{@"Latn" : @[@"en"]};
    NSOrthography *orthography = [NSOrthography orthographyWithDominantScript:@"Latn"
                                                                  languageMap:languageMap];
    [stmt enumerateLinguisticTagsInRange:stringRange
                                  scheme:NSLinguisticTagSchemeLemma
                                 options:NSLinguisticTaggerOmitWhitespace
                             orthography:orthography
                              usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        // Log each token, its lemma, and its range for debugging purposes.
        // NSRange fields are NSUInteger, so cast and print them with %lu rather than %d.
        NSString *currentEntity = [stmt substringWithRange:tokenRange];
        NSLog(@"%@ has lemma %@, tokenRange (%lu, %lu)", currentEntity, tag,
              (unsigned long)tokenRange.location, (unsigned long)tokenRange.length);
    }];
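
If you would rather keep the NSLinguisticTagger instance from the question, the same idea can probably be applied there through setOrthography:range:. The following is only a sketch: the helper name LemmaForEnglishWord is made up for illustration, and it assumes the single-word failure comes from the tagger being unable to identify the language of such a short input.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original post): returns the lemma of a
// single English word, or nil if the tagger does not produce one.
static NSString *LemmaForEnglishWord(NSString *word) {
    NSLinguisticTagger *tagger =
        [[NSLinguisticTagger alloc] initWithTagSchemes:@[NSLinguisticTagSchemeLemma]
                                               options:0];
    tagger.string = word;

    // Assumption: one word is too little text for automatic language
    // identification, so declare it as English in Latin script up front.
    NSRange fullRange = NSMakeRange(0, word.length);
    NSOrthography *english =
        [NSOrthography orthographyWithDominantScript:@"Latn"
                                         languageMap:@{@"Latn" : @[@"en"]}];
    [tagger setOrthography:english range:fullRange];

    __block NSString *lemma = nil;
    NSLinguisticTaggerOptions options =
        NSLinguisticTaggerOmitWhitespace | NSLinguisticTaggerOmitPunctuation;
    [tagger enumerateTagsInRange:fullRange
                          scheme:NSLinguisticTagSchemeLemma
                         options:options
                      usingBlock:^(NSString *tag, NSRange tokenRange, NSRange sentenceRange, BOOL *stop) {
        lemma = tag;   // may be nil when no lemma is available for the token
        *stop = YES;   // only the first token matters for single-word input
    }];
    return lemma;
}
```

Calling NSLog(@"%@", LemmaForEnglishWord(@"waited")); should then log the lemma, or (null) if the tagger still cannot resolve one. Hard-coding "en" is a deliberate simplification; text in other languages would need a different language map.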