objective-c xml nsstring nsscanner

Find numeric node names in an NSString and add an underscore before them to fix invalid XML


I have an XML string that gets returned to me. Sometimes it is invalid because it uses numbers as node names, such as <2>. I would like to scan the entire NSString that holds the XML and search for the following:

<numeric value // e.g. <1  or <2

</numeric value // e.g. </1 or </2

I would then like to place an underscore before the number, turning the invalid XML into valid XML like the following:

<_2>
</_2>

I am wondering if NSScanner would do the job, but I am unsure how to attack this problem. Right now I am just using stringByReplacingOccurrencesOfString:withString:, but I have to hardcode the number values to replace, which I don't think is a good idea.

UPDATE:

I gave it a try using NSRange. Here is what I came up with. It works about 95% of the time, but on large XML strings it misses the last few </ > tags, and I am not sure why. Any comments or help on improving this?

// Changeable string
NSMutableString *editable = [[[NSMutableString alloc] initWithString:str] autorelease];

// Number Formatter
NSLocale *l_en = [[[NSLocale alloc] initWithLocaleIdentifier: @"en_US"] autorelease];
NSNumberFormatter *f = [[[NSNumberFormatter alloc] init] autorelease];
[f setLocale: l_en];

// Make our first loop
NSUInteger count = 0, length = [str length];
NSRange range = NSMakeRange(0, length); 
while(range.location != NSNotFound) {

    // Find first character
    range = [str rangeOfString: @"<" options:0 range:range];

    // Make sure we have not gone too far
    if (range.location+1 <= length) {

        // Check the digit after this
        NSString *after = [NSString stringWithFormat:@"%c", [str characterAtIndex:range.location+1]];

        // Check if we return the number or not
        if ([f numberFromString:after]) {

            // Update the string
            [editable insertString:@"_" atIndex:(range.location+1)+count];
            count++;

        }//end

    }//end

    // Check our range
    if(range.location != NSNotFound) {
        range = NSMakeRange(range.location + range.length, length - (range.location + range.length));
    }//end

}//end

// Our second part
NSUInteger slashLength = [editable length];
NSRange slashRange = NSMakeRange(0, slashLength); 
while(slashRange.location != NSNotFound) {

    // Find first character
    slashRange = [editable rangeOfString: @"</" options:0 range:slashRange];

    // Make sure we have not gone too far
    if (slashRange.location+2 <= slashLength) {

        // Check the digit after this
        NSString *afterSlash = [NSString stringWithFormat:@"%c", [editable characterAtIndex:slashRange.location+2]];

        // Check if we return the number or not
        if ([f numberFromString:afterSlash]) {

            // Update the string
            [editable insertString:@"_" atIndex:(slashRange.location+2)];

        }//end

    }//end

    // Check our range
    if(slashRange.location != NSNotFound) {
        slashRange = NSMakeRange(slashRange.location + slashRange.length, slashLength - ((slashRange.location+2) + slashRange.length));
    }//end

}//end

NSLog(@"%@", editable);

Solution

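  • I ended up figuring out a solution. The second loop in my update searched editable while inserting into it, but its search range was still computed from the length measured before the insertions (minus a further 2), so the range ended short of the end of the string and the last closing tags were never checked. The method below searches the unmodified input and tracks the insertion offset separately: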

    - (NSString *)insertUnderscoreInString:(NSString *)fullString afterString:(NSString *)afterString {

        // Mutable copy that receives the underscores; fullString itself is only read
        NSMutableString *editable = [NSMutableString stringWithString:fullString];

        NSUInteger length = [fullString length];
        NSUInteger count = 0; // underscores inserted so far; shifts later insert positions
        NSRange range = NSMakeRange(0, length);

        while (YES) {

            // Find the next occurrence of the marker ("<" or "</")
            range = [fullString rangeOfString:afterString options:0 range:range];
            if (range.location == NSNotFound) {
                break;
            }

            // Index of the character right after the marker; stop if the marker ends the string
            NSUInteger next = range.location + range.length;
            if (next >= length) {
                break;
            }

            // Insert an underscore when the node name starts with an ASCII digit
            unichar c = [fullString characterAtIndex:next];
            if (c >= '0' && c <= '9') {
                [editable insertString:@"_" atIndex:next + count];
                count++;
            }

            // Continue searching after this match
            range = NSMakeRange(next, length - next);

        }//end

        return editable;

    }//end
    

    So, then you could test it using:

    NSString *val = [self insertUnderscoreInString:str afterString:@"<"];
    NSString *val2 = [self insertUnderscoreInString:val afterString:@"</"];
    NSLog(@"%@", val2);
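
As an alternative to the manual range loop, the same replacement can probably be done in one pass with NSRegularExpression. This is only a sketch, not the method from the post: it assumes NSRegularExpression is available (iOS 4 / OS X 10.7 or later), and the helper name PrefixNumericTagNames is made up for illustration. The pattern matches "<" with an optional "/" and uses a lookahead so that only tags whose name starts with an ASCII digit are touched.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): puts an underscore in
// front of every tag name that starts with an ASCII digit, for both opening
// ("<2>") and closing ("</2>") tags.
static NSString *PrefixNumericTagNames(NSString *xml) {
    NSError *error = nil;

    // "<", then an optional "/" captured as group 1, followed by a digit.
    // The digit is only looked at (lookahead), so it is not consumed.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<(/?)(?=[0-9])"
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        // Pattern failed to compile; return the input unchanged.
        return xml;
    }

    // Re-emit "<" plus the optional "/", then add the underscore.
    return [regex stringByReplacingMatchesInString:xml
                                           options:0
                                             range:NSMakeRange(0, [xml length])
                                      withTemplate:@"<$1_"];
}
```

Usage would look like NSLog(@"%@", PrefixNumericTagNames(str)); and it handles multi-digit names such as <10> as well. Like the loop above, it works on the raw text, so a "<" followed by a digit inside a comment or CDATA section would also get an underscore.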