
NSXMLParser stops parsing after encountering special character


I am reading an XML file from the Google Weather API and parsing it with NSXMLParser. The city in question is Paris. Here is a brief excerpt of the XML I get:

    <?xml version="1.0"?>
    <xml_api_reply version="1">
    <weather module_id="0" tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0" ><forecast_information>
    <city data="Paris, Île-de-France"/>
    <postal_code data="Paris"/>
    <latitude_e6 data=""/>
    <longitude_e6 data=""/> 
...
...

Now the code I used to parse this XML is:

    NSString *address = @"http://www.google.com/ig/api?weather=Paris";
    NSURL *URL = [NSURL URLWithString:address];

    NSXMLParser *parser = [[NSXMLParser alloc] initWithContentsOfURL:URL];
    [parser setDelegate:self];
    [parser parse];
...

    - (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
    {
        NSLog(@"XML Parser 1 ... elementName ... %@", elementName);
    }

This is the output I get for the above XML:

XML Parser 1 ... elementName ... xml_api_reply
XML Parser 1 ... elementName ... weather
XML Parser 1 ... elementName ... forecast_information

The problem is that it parses all the tags until it reaches the city element, whose data attribute contains a non-ASCII character (the Î in "Paris, Île-de-France"), and then it just stops. It doesn't process the tags that follow, such as postal_code, latitude_e6, longitude_e6, etc.

So my question is: is there a way to remove all non-ASCII characters from the XML string returned by the URL?
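
When the parser "just stops" like this, it can help to ask it why. A minimal sketch, assuming ARC and a delegate that implements the `parser:parseErrorOccurred:` callback from NSXMLParserDelegate; the class `ParseErrorLogger` and the helper `FirstParseError` are hypothetical names used only for illustration:

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate for illustration: records the first fatal error
// that NSXMLParser reports instead of letting parsing stop silently.
@interface ParseErrorLogger : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSError *firstError;
@end

@implementation ParseErrorLogger
- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError
{
    if (self.firstError == nil) {
        self.firstError = parseError;
    }
    // lineNumber and columnNumber point at where the parser gave up.
    NSLog(@"Parse error %ld at line %ld, column %ld: %@",
          (long)parseError.code, (long)parser.lineNumber,
          (long)parser.columnNumber, parseError.localizedDescription);
}
@end

// Hypothetical helper: parses the data and returns the first error, or nil
// if the whole document parsed cleanly.
static NSError *FirstParseError(NSData *data)
{
    ParseErrorLogger *logger = [[ParseErrorLogger alloc] init];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    parser.delegate = logger;
    if ([parser parse]) {
        return nil;
    }
    return logger.firstError ?: parser.parserError;
}
```

If the logged error points at the city line, the likely cause is that the bytes are not valid in the encoding the parser assumed, rather than the character being non-ASCII as such.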


Solution

  • Ok. I have solved this problem. This is how I got it to work.

    First I fetch the XML from the URL, special characters and all. Then I strip every special character out of the XML string, convert the string to NSData, and pass that NSData object to my NSXMLParser. Since there are no special characters left, NSXMLParser is happy.

    Here's the code for anyone who may run across this in the future. Big thank you to everyone who contributed to this post!

        NSString *address = @"http://www.google.com/ig/api?weather=Paris";
        NSURL *URL = [NSURL URLWithString:address];
        NSError *error = nil;
        // Fetch the raw response as a string.
        NSString *XML = [NSString stringWithContentsOfURL:URL encoding:NSASCIIStringEncoding error:&error];
        if (XML == nil) {
            NSLog(@"Could not load XML: %@", error);
            return; // assumes this code lives in a void method
        }

        // REMOVE ALL NON-ASCII CHARACTERS
        // Build a string of the printable ASCII characters (32 to 126).
        NSMutableString *asciiCharacters = [NSMutableString string];
        for (int i = 32; i < 127; i++)
        {
            [asciiCharacters appendFormat:@"%c", i]; // %c expects an int
        }

        // Everything outside that set is dropped, including control characters such as newlines.
        NSCharacterSet *nonAsciiCharacterSet = [[NSCharacterSet characterSetWithCharactersInString:asciiCharacters] invertedSet];
        XML = [[XML componentsSeparatedByCharactersInSet:nonAsciiCharacterSet] componentsJoinedByString:@""];

        // Hand the cleaned string to the parser as UTF-8 data.
        NSData *data = [XML dataUsingEncoding:NSUTF8StringEncoding];
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
        [parser setDelegate:self];
        [parser parse];
    

    EDIT:

    NSXMLParser is a horrible tool. I have successfully used RaptureXML in all my apps. It's super easy to use and avoids all this nonsense with non-ASCII characters. https://github.com/ZaBlanc/RaptureXML
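
Stripping characters, as in the solution above, turns "Île-de-France" into "le-de-France". If the feed is really ISO-8859-1 (Latin-1) text sent without an encoding declaration, which is an assumption here and not something the post confirms, an alternative is to decode the bytes as Latin-1 and re-encode them as UTF-8 before parsing. A minimal sketch; `UTF8DataFromLatin1Data` is a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reinterprets raw bytes as ISO-8859-1 and returns the
// same text encoded as UTF-8. Assumes the server sends Latin-1 without
// declaring it in the XML prolog; it is not correct for other encodings.
static NSData *UTF8DataFromLatin1Data(NSData *raw)
{
    if (raw == nil) {
        return nil;
    }
    NSString *text = [[NSString alloc] initWithData:raw
                                           encoding:NSISOLatin1StringEncoding];
    return [text dataUsingEncoding:NSUTF8StringEncoding];
}
```

The returned data could then be passed to `initWithData:` in place of the stripped string, so the city name reaches the delegate intact. If the prolog did declare a non-UTF-8 encoding, the declaration and the bytes would disagree after this conversion, so this sketch only fits a response like the one shown in the question.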