Find elements with a class and their plain text using PHP Simple HTML DOM Parser


I want to find the elements with class ft00 that sit between "Work Experience" and "EDUCATION AND TRAINING", and extract their text (the dates) from the following HTML:

<p class = "ft00">Introduction</p>
<p class = "ft00">John Smith</p>
<p class = "ft02">Email:</p>
<p class = "ft00">[email protected]</p>
<p class = "ft00">Work Experience</p>
<p class = "ft00">27 July 2017</p>
<p class = "ft02">ABC Company</p>
<p class = "ft00">19 May 2018</p>
<p class ="ft02">XYZ Company</p>
<p class = "ft00">EDUCATION AND TRAINING</p>

So far I can extract all the data between "Work Experience" and "EDUCATION AND TRAINING", including the ft02 entries I do not want. That part works, using the code below:

// Start from the paragraph whose text begins with "Work Experience".
$fexp = $html->find('p[plaintext^=Work Experience]');
$items = array();
foreach ($fexp as $keye) {
    // Walk the following siblings until the next section heading.
    while ($keye->nextSibling()) {
        $keye = $keye->nextSibling();
        $varce = $keye->plaintext;
        if (trim($varce) == "EDUCATION AND TRAINING") {
            break;
        }
        $items[] = $varce;
    }
}
var_dump($items);

I am close but can't work out how to keep only the ft00 entries. Any help would be appreciated, thanks :-)


Solution

  • Select only the p.ft00 elements, start collecting at "Work Experience", and stop at "EDUCATION AND TRAINING":

    $test = array();
    $matching = false;
    // Only the ft00 paragraphs carry the headings and the dates.
    $collection = $html->find('p.ft00');
    foreach ($collection as $tkey) {
        $text = trim($tkey->plaintext);
        // Compare case-insensitively: the markup has "Work Experience".
        if (strcasecmp($text, "Work Experience") == 0 || $matching) {
            $test[] = $text;
            $matching = true;
        }
        if ($text == "EDUCATION AND TRAINING") {
            break;
        }
    }
    print_r($test);
    

    Output:

    Array
    (
        [0] => Work Experience
        [1] => 27 July 2017
        [2] => 19 May 2018
        [3] => EDUCATION AND TRAINING
    )
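
The result above still contains the two headings. If only the dates are wanted, one option is to filter the collected text afterwards. The following is a minimal sketch rather than a definitive implementation: `datesBetween` is a hypothetical helper (not part of Simple HTML DOM), it works on a plain array of strings so it can run without the library, and it assumes the dates always look like "27 July 2017" (day, month name, four-digit year).

```php
<?php
// Hypothetical helper, not part of Simple HTML DOM.
// Returns the entries found after $start and before $end that look like
// "day month-name year" dates (assumed format, e.g. "27 July 2017").
function datesBetween(array $texts, string $start, string $end): array
{
    $dates = [];
    $inside = false;
    foreach ($texts as $text) {
        $text = trim($text);
        if (!$inside) {
            // Skip everything up to and including the opening heading.
            $inside = strcasecmp($text, $start) === 0;
            continue;
        }
        if (strcasecmp($text, $end) === 0) {
            break; // Closing heading reached, stop collecting.
        }
        if (preg_match('/^\d{1,2}\s+[A-Za-z]+\s+\d{4}$/', $text) === 1) {
            $dates[] = $text;
        }
    }
    return $dates;
}

// Sample input: the plain text of the p.ft00 elements from the question.
$texts = [
    'Introduction',
    'John Smith',
    '[email protected]',
    'Work Experience',
    '27 July 2017',
    '19 May 2018',
    'EDUCATION AND TRAINING',
];
print_r(datesBetween($texts, 'Work Experience', 'EDUCATION AND TRAINING'));
```

With the sample input this prints an array holding only "27 July 2017" and "19 May 2018". With Simple HTML DOM, the `$texts` array would be built from the `plaintext` of each element returned by `$html->find('p.ft00')`. The regular expression is deliberately strict, so it would need adjusting if the dates appear in another format.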