objective-c, ios, xml-parsing, html-parsing, tbxml

TBXML textForElement issue


I'm trying to parse HTML with the TBXML library for iOS, and I want to obtain the "more text" value from this piece of HTML:

<div> 
 <a href="/url/1">
   <strong>value</strong>
 </a>
 more text
</div>

I have used this code, but it does not seem to work:

//Assume that "div" is a TBXMLElement* for this div 
NSString* content = [TBXML textForElement:div];
//Returns @"" when the value @"more text" is expected...

What's wrong with my code?


Solution

  • OK, I modified the TBXML library and that resolved the problem. If anyone has the same issue, try this:

    1) In TBXML.h, add a field named afterText to TBXMLElement. Step 3 assigns a malloc'ed C string to it, so its type should be char *.

    2) Find this code in the file TBXML.m and comment it out:

    // if parent element has children clear text
    if (parentXMLElement && parentXMLElement->firstChild)
        parentXMLElement->text = 0;
    

    3) Insert this code just before the code you commented out in step 2:

    if (parentXMLElement && parentXMLElement->firstChild) {

        // Build the parent's closing tag, e.g. "</div>"
        const char *parentNameTag = [[TBXML elementName:parentXMLElement] UTF8String];
        // strlen (not sizeof) measures the name; +1 for the terminating NUL
        char *finalTag = (char *)malloc(strlen("</") + strlen(parentNameTag) + strlen(">") + 1);
        strcpy(finalTag, "</");
        strcat(finalTag, parentNameTag);
        strcat(finalTag, ">");

        // The trailing text runs from elementStart up to the closing tag
        char *elementTextStart = elementStart;
        char *elementTextEnd = strstr(elementTextStart, finalTag);
        free(finalTag);

        if (elementTextEnd != NULL) {
            long textLength = elementTextEnd - elementTextStart;
            if (textLength > 0) {
                // Copy the text into its own NUL-terminated buffer.
                // This buffer is malloc'ed here, so your code must free it.
                char *afterTextStart = (char *)malloc(textLength + 1);
                memcpy(afterTextStart, elementTextStart, textLength);
                afterTextStart[textLength] = '\0';
                parentXMLElement->afterText = afterTextStart;
            }
        }

    }
    

    Now the afterText attribute contains "more text".

    It's not an orthodox solution, but it works for me.
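
For anyone who wants to check the buffer handling from step 3 outside of TBXML, here is a minimal sketch of the same idea in plain C (it should also compile as Objective-C). The function name copy_text_before_closing_tag is hypothetical and not part of TBXML. It assumes the input is a NUL-terminated string that starts right after the last child element, and it simply looks for the first matching closing tag, so a nested element with the same name as the parent would end the text early.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of TBXML.
   Returns a newly allocated, NUL-terminated copy of the text between
   `start` and the first "</tag_name>", or NULL if the closing tag is
   missing, the text is empty, or allocation fails.
   The caller owns the result and must free() it. */
char *copy_text_before_closing_tag(const char *start, const char *tag_name)
{
    assert(start != NULL && tag_name != NULL);

    /* "</" + name + ">" is strlen(name) + 3 characters, plus 1 for the NUL. */
    size_t tag_len = strlen(tag_name) + 3;
    char *closing_tag = malloc(tag_len + 1);
    if (closing_tag == NULL)
        return NULL;
    strcpy(closing_tag, "</");
    strcat(closing_tag, tag_name);
    strcat(closing_tag, ">");

    /* Locate the closing tag; the trailing text ends where it begins. */
    const char *end = strstr(start, closing_tag);
    free(closing_tag);
    if (end == NULL || end == start)
        return NULL;

    /* Copy the text and terminate it explicitly (memcpy does not). */
    size_t text_len = (size_t)(end - start);
    char *text = malloc(text_len + 1);
    if (text == NULL)
        return NULL;
    memcpy(text, start, text_len);
    text[text_len] = '\0';
    return text;
}
```

The returned string keeps any surrounding whitespace and newlines from the markup, so you would probably want to trim it before using it.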