Tags: objective-c, nsstring, nsarray, string-comparison

Comparing string arrays


Sorry in advance if this is a stupid question. I'm working on a simple program that compares two arrays of strings. One is a list of 1309 proper names; the other is a list of 235,877 English words. The point of the program is to compare the lists and add any word that appears on both to a mutable array. The program then enumerates the mutable array and prints the words that are on both lists. Here is my code:

    #import <Foundation/Foundation.h>

    int main(int argc, const char * argv[]) {
        @autoreleasepool {

            NSString *nameString = [NSString stringWithContentsOfFile:@"/usr/share/dict/propernames"
                                                             encoding:NSUTF8StringEncoding
                                                                error:NULL];
            NSString *wordString = [NSString stringWithContentsOfFile:@"/usr/share/dict/words"
                                                             encoding:NSUTF8StringEncoding
                                                                error:NULL];

            NSArray *names = [nameString componentsSeparatedByString:@"\n"];
            NSArray *words = [wordString componentsSeparatedByString:@"\n"];

            NSMutableArray *namesAndWords = [[NSMutableArray alloc] init];

            for (NSString *w in words) {
                for (NSString *n in names) {
                    if ([[n lowercaseString] compare:w] == NSEqualToComparison) {
                        [namesAndWords addObject:w];
                    }
                }
            }

            for (NSString *item in namesAndWords) {
                NSLog(@"%@", item);
            }

            NSLog(@"There are %lu items in the array", [namesAndWords count]);
            NSLog(@"%lu", [names count]);
            NSLog(@"%lu", [words count]);
        }
        return 0;
    }

As of right now, this program works exactly as it should (showing 294 matches). My real question is about my earlier attempts. When I first tried comparing the strings, I did it like this:

    for (NSString *w in words) {
        for (NSString *n in names) {
            if ([n caseInsensitiveCompare:w] == NSEqualToComparison) {
                [namesAndWords addObject:w];
            }
        }
    }

and like this:

    for (NSString *w in words) {
        for (NSString *n in names) {
            if ([n compare:w options:NSCaseInsensitiveSearch] == NSOrderedSame) {
                [namesAndWords addObject:w];
            }
        }
    }

Both of these gave me 1602 matches, and for some reason they seem to add items from both arrays to the mutable array namesAndWords. For example, in the console I see both Woody and woody printed out.

The other way I tried was this:

    for (NSString *w in words) {
        for (NSString *n in names) {
            if ([n compare:w] == NSOrderedSame) {
                [namesAndWords addObject:w];
            }
        }
    }

Doing it this way added all 1309 strings from the names array. Before running it, I actually expected no matches at all, since I hadn't made the comparison case insensitive.

I'm trying to figure out why these methods, which seem so similar, give such different results. I'm also trying to find out why if ([[n lowercaseString] compare:w] == NSEqualToComparison) is the right way to go. Any help here is greatly appreciated.


Solution

  • The line below lowercases only the string from one array (n) and leaves the string from the other array (w) untouched. It therefore matches only when w is already lowercase, like m->m, and it still counts duplicates.

    [[n lowercaseString] compare:w] == NSEqualToComparison
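
    To make the difference between the three comparisons from the question concrete, here is a minimal sketch that applies each of them to a single pair of strings. The helper name GRMatchFlags is hypothetical (it is not part of the original post or of Foundation), and the sample strings are the Woody/woody pair mentioned in the question.

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper (not from the original post): returns three '1'/'0'
    // flags showing which comparisons treat `name` and `word` as equal, in
    // this order:
    //   1. [[name lowercaseString] compare:word]
    //   2. [name caseInsensitiveCompare:word]
    //   3. [name compare:word]
    static NSString *GRMatchFlags(NSString *name, NSString *word) {
        BOOL lowercasedName = [[name lowercaseString] compare:word] == NSOrderedSame;
        BOOL caseInsensitive = [name caseInsensitiveCompare:word] == NSOrderedSame;
        BOOL caseSensitive = [name compare:word] == NSOrderedSame;
        return [NSString stringWithFormat:@"%d%d%d",
                (int)lowercasedName, (int)caseInsensitive, (int)caseSensitive];
    }
    ```

    GRMatchFlags(@"Woody", @"woody") yields "110" and GRMatchFlags(@"Woody", @"Woody") yields "011". Only the case-insensitive comparison accepts both spellings, which is consistent with it reporting more matches than the other two when a word list holds both forms.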
    

    Below is my worked example for your problem.

    // Two sample arrays; each contains repeated entries.
    NSMutableArray *actualarray1 = [@[@"Apple",@"Litchi",@"Plum",@"Litchi",@"Pineapple",@"mango",@"Apple",@"berry",@"Pineapple",@"berry",@"mango",@"Apple"] mutableCopy];
    NSMutableArray *actualarray2 = [@[@"guava",@"Orange",@"Litchi",@"Pineapples",@"mangoes",@"Orange",@"Strawberry",@"Pineapple",@"berry",@"mango",@"Apple"] mutableCopy];

    // 1. Lowercase n only: matches only when w is already lowercase.
    // NSOrderedSame is the NSComparisonResult constant; NSEqualToComparison
    // belongs to a different enum and merely shares the value 0.
    NSMutableArray *namesAndWords = [[NSMutableArray alloc] init];
    for (NSString *w in actualarray1) {
        for (NSString *n in actualarray2) {
            if ([[n lowercaseString] compare:w] == NSOrderedSame) {
                [namesAndWords addObject:w];
            }
        }
    }
    NSLog(@"Lowercase compare %d", (int)[namesAndWords count]); // logs 4

    // 2. compare:options: with NSCaseInsensitiveSearch.
    namesAndWords = [[NSMutableArray alloc] init];
    for (NSString *w in actualarray1) {
        for (NSString *n in actualarray2) {
            if ([n compare:w options:NSCaseInsensitiveSearch] == NSOrderedSame) {
                [namesAndWords addObject:w];
            }
        }
    }
    NSLog(@"Array with duplicates %d", (int)[namesAndWords count]); // logs 11

    // 3. caseInsensitiveCompare: gives the same result as 2.
    namesAndWords = [[NSMutableArray alloc] init];
    for (NSString *w in actualarray1) {
        for (NSString *n in actualarray2) {
            if ([n caseInsensitiveCompare:w] == NSOrderedSame) {
                [namesAndWords addObject:w];
            }
        }
    }
    NSLog(@"Array with duplicates %d", (int)[namesAndWords count]); // logs 11
    

    In the above code, array 1 contains duplicates, and so does array 2. If you step through a few iterations by hand, you will see that the last two comparisons end up with a one-to-many mapping: the nested for-each loops check every value of one array against every value of the other, so each repeated entry is added again. That is why the last two methods produce duplicates in your case. What happens if you remove the duplicates from the arrays before comparing? Let's have a look at the code below.

    // Remove duplicates from both arrays while keeping the original order.
    NSOrderedSet *orderedSet = [NSOrderedSet orderedSetWithArray:actualarray1];
    NSArray *arrayWithoutDuplicates = [orderedSet array];
    actualarray1 = [arrayWithoutDuplicates mutableCopy];
    orderedSet = [NSOrderedSet orderedSetWithArray:actualarray2];
    arrayWithoutDuplicates = [orderedSet array];
    actualarray2 = [arrayWithoutDuplicates mutableCopy];
    NSLog(@"%@ %@", actualarray1, actualarray2);

    namesAndWords = [[NSMutableArray alloc] init];
    for (NSString *w in actualarray1) {
        for (NSString *n in actualarray2) {
            if ([n caseInsensitiveCompare:w] == NSOrderedSame) {
                [namesAndWords addObject:w];
            }
        }
    }
    // Your code works like a charm!
    NSLog(@"After removing duplicates %d", (int)[namesAndWords count]); // logs 5

    // My version with a single explicit loop. Note: -containsObject: uses
    // -isEqual:, so this match is case sensitive, and on an NSArray it is
    // still a linear scan.
    namesAndWords = [[NSMutableArray alloc] init];
    for (NSString *s in actualarray1) {
        if ([actualarray2 containsObject:s]) {
            [namesAndWords addObject:s];
        }
    }
    NSLog(@"Count after unique %d", (int)[namesAndWords count]); // logs 5
    

    I'd suggest not using a comparison like [[n lowercaseString] compare:w] == NSEqualToComparison, because its logic is incorrect: it converts only one of the two strings to lowercase, so in the code above it picks up only the entries that are already lowercase. Instead, use [n caseInsensitiveCompare:w] == NSOrderedSame, and remove duplicates before comparing if you need unique values. Also, nesting two fast-enumeration loops is not advisable in this scenario, since every name is compared with every word and performance degrades when the arrays are large.
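
    As one way to avoid comparing every pair when the arrays are large, here is a minimal sketch that lowercases the names once into an NSSet and then tests each word with a single lookup. The function name GRWordsMatchingNames is hypothetical (not from the original post), and the sketch assumes that lowercasing both sides is an acceptable stand-in for caseInsensitiveCompare:, which holds for plain ASCII word lists.

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper (not from the original post): returns the entries of
    // `words` that match some entry of `names`, ignoring case, in their
    // original order. Each word is tested once, so there is no inner loop.
    static NSArray *GRWordsMatchingNames(NSArray *names, NSArray *words) {
        // Lowercase every name once; the set also collapses duplicate names.
        NSMutableSet *lowercasedNames = [NSMutableSet setWithCapacity:[names count]];
        for (NSString *name in names) {
            [lowercasedNames addObject:[name lowercaseString]];
        }

        NSMutableArray *matches = [NSMutableArray array];
        for (NSString *word in words) {
            // -containsObject: on a set is a hash lookup, not a linear scan.
            if ([lowercasedNames containsObject:[word lowercaseString]]) {
                [matches addObject:word];
            }
        }
        return matches;
    }
    ```

    For example, with names @[@"Woody", @"Alan"] and words @[@"woody", @"Woody", @"wood"], the function returns @[@"woody", @"Woody"]. Duplicate entries in words are still reported once per occurrence, so remove them first if you need unique values.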

    Hope this clears up your doubt!