
Extracting information from HTML using PHP


I have a page containing the markup below. I'm looking for a way to extract the value that belongs to the 'exchange with' label. As you can see, it sits inside a section tag without any attributes, and the value I need, 'xyzem', has no unique attribute either. I use DOMDocument and loadHTMLFile. I would be glad for any guidance.

<div class="divPrev">
<section>
    <label>price </label>
    <div class="divPrevTxt">50000$</div>
</section>    
    <section>
        <label>information</label>
        <div class="divPrevTxt">a,b,c</div>
    </section>    
    <section>
        <label>supors</label>
        <div class="divPrevTxt">som info</div>
    </section>
    <section>
        <label>documents</label>
        <div class="divPrevTxt">x,y,z</div>
    </section>                                               
    <section>

//*************************NOTICE       
    <section>                
        <label>
            exchange with
        </label>
        <div class="divPrevTxt">
            xyzem //I NEED THIS PIECE 
        </div>
    </section>

//*************************END NOTICE       


    <section>
        <label>address</label>
        <div id="divAddress" class="divPrevTxt">mon-mphho-33000</div>
    </section>
    <section>
        <label>contact</label>
        <div class="divPrevTxt" id="contactInfo">
            <span style="color:#DE9C26"></span>
                88-8888-999,9987-9989-88a
        </div>
    </section>           


Solution

  • I would use a regular expression for this:

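    // $content is the page HTML as a string, e.g. from file_get_contents().
    // Flag "s" lets "." span newlines; the lazy (.*?) stops at the first </div>.
    if (preg_match('#<label>\s*exchange with\s*</label>\s*<div class="divPrevTxt">\s*(.*?)\s*</div>#is', $content, $matches)) {
        echo $matches[1];
    }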
    

    For your content, this returns:

    xyzem //I NEED THIS PIECE
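
Since the question already uses DOMDocument, the same value can probably also be reached with DOMXPath instead of a regular expression. The following is only a minimal sketch under some assumptions: the sample markup is a trimmed copy of the snippet from the question (without the inline `//` notes), the variable names are illustrative, and the label text is matched case-sensitively, unlike the `i` flag in the regex above. For the real page, `loadHTMLFile()` would take the place of `loadHTML()`.

```php
<?php
// Trimmed sample of the markup from the question (an assumption for the demo).
$html = <<<'MARKUP'
<div class="divPrev">
    <section>
        <label>price </label>
        <div class="divPrevTxt">50000$</div>
    </section>
    <section>
        <label>
            exchange with
        </label>
        <div class="divPrevTxt">
            xyzem
        </div>
    </section>
</div>
MARKUP;

$doc = new DOMDocument();
// The parser warns about HTML5 tags such as <section>;
// collect those warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Find the <label> whose whitespace-normalized text is "exchange with",
// then take the first following sibling <div class="divPrevTxt">.
$query = '//section/label[normalize-space(.) = "exchange with"]'
       . '/following-sibling::div[@class="divPrevTxt"][1]';
$nodes = $xpath->query($query);

// trim() removes the indentation and line breaks around the value.
$value = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
echo $value, PHP_EOL;
```

Selecting by the label text rather than by position means the query should keep working if sections are added or reordered, as long as the label wording stays the same.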