How to get text after tag


I am using the PHP Simple HTML DOM Parser.

I don't understand how to get the text that comes after a tag, e.g. ( <b></b> Text ).

Please see the following image. I am requesting this website and getting this HTML:

Image Link

I want to build an array() like this from the details in the image above:

array(

     'release_year'=> 2009,
     'genre'       => 'Drama,Fantasy,Horror',
     'description' => 'etc etc etc',
     'imdb'        => 'link of imdb',
     'total_episode'=> '28 episode',
     'latest_episode_title'=> 'title',
     'latest_episode_link' => 'link',
     'latest_episode_with_link_title'=> 'title',
     'latest_episode_with_link_link' => 'link',
);

I can successfully get the text inside the <b></b> tags, but I don't know how to get the text that comes after the <b> tags in the HTML. Please review the HTML, my PHP code, and its output below. Thank you in advance.

Here is the HTML for the image above:

<div class="show-summary">
    <table border="0" style="padding:3px">
        <tbody>
            <tr>
                <td style="padding:3px">
                    <a href="/serie/the_vampire_diaries">
                        <img src="http://static1.watchseries.ag/90/1/The_Vampire_Diaries-18597.JPEG" alt="Watch Series - The Vampire Diaries" title="Watch Series - The Vampire Diaries" height="120px" width="85px">
                    </a>
                </td>

                <td valign="top" style="padding:3px">
                    <p>
                        <b>Release Year: </b>
                        2009<br>

                        <b>Genre: <a href="/genres/Drama">Drama</a>, <a href="/genres/Fantasy">Fantasy</a>, <a href="/genres/Horror">Horror</a></b>

                        <br>

                        <b>External Links: </b>
                        <a href="http://www.imdb.com/title/tt1405406/" target="_blank">IMDB</a>

                        <br>

                        <b>No. of episodes: </b> 
                        128 episodes <br>

                        <b>Latest Episode: </b> 
                        <a title="Watch The Vampire Diaries Latest Episode (The Vampire Diaries Season 6 Episode 16)" href="/episode/the_vampire_diaries_s6_e16.html">Season 6 Episode 16 The Downward Spiral (26/02/2015)</a>

                        <br>

                        <b>Latest Episode With Links: </b> 
                        <a title="Watch The Vampire Diaries Latest Episode (The Vampire Diaries Season 6 Episode 11)" href="/episode/the_vampire_diaries_s6_e11.html">Season 6 Episode 11 Woke Up With a Monster (22/01/2015)</a>

                        <br>

                    </p>

                    <div style="float: left; height: 30px; overflow: hidden; width: 100px;">
                        <div class="fb-like fb_iframe_widget" data-href="http://watchseries.ag/serie/the_vampire_diaries" data-send="false" data-layout="button_count" data-show-faces="false" fb-xfbml-state="rendered" fb-iframe-plugin-query="app_id=434603673340441&amp;href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries&amp;layout=button_count&amp;locale=en_US&amp;sdk=joey&amp;send=false&amp;show_faces=false">
                            <span style="vertical-align: bottom; width: 79px; height: 20px;">
                                <iframe name="fbc5b3f58" width="1000px" height="1000px" frameborder="0" allowtransparency="true" scrolling="no" title="fb:like Facebook Social Plugin" src="http://www.facebook.com/plugins/like.php?app_id=434603673340441&amp;channel=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter%2F7r8gQb8MIqE.js%3Fversion%3D41%23cb%3Df314058a5%26domain%3Dwatchseries.ag%26origin%3Dhttp%253A%252F%252Fwatchseries.ag%252Ff5fff1c%26relation%3Dparent.parent&amp;href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries&amp;layout=button_count&amp;locale=en_US&amp;sdk=joey&amp;send=false&amp;show_faces=false" style="border: none; visibility: visible; width: 79px; height: 20px;" class="" __idm_id__="824321"></iframe>
                            </span>
                        </div>
                    </div>
                    <iframe id="twitter-widget-1" scrolling="no" frameborder="0" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.b68aed79dd9ad79554bcd8c9141c94c8.en.html#_=1422079075304&amp;count=horizontal&amp;dnt=false&amp;id=twitter-widget-1&amp;lang=en&amp;original_referer=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries&amp;size=m&amp;text=Watch%20The%20Vampire%20Diaries%20Serie%20Online%20-%20Watch%20Series&amp;url=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries" class="twitter-share-button twitter-tweet-button twitter-share-button twitter-count-horizontal" title="Twitter Tweet Button" data-twttr-rendered="true" style="width: 107px; height: 20px;"></iframe>
                <script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script>

                    <br clear="all">

                    <b>Description :</b> 
                    The vampire brothers Damon and Stefan Salvatore, eternal adolescents, having been leading 'normal' lives, hiding their bloodthirsty condition, for centuries, moving on before their non-aging is noticed.
                    <span id="plot_mored"> They are back in the Virginia town where they became vampires. Stefan is noble, denying himself blood to avoid killing, and tries to control his evil brother Damon. Stefan falls in love with schoolgirl Elena, whose best friend is a witch, like her grandma.</span>
                    <a onclick="return showMoreContent('plot_mored');" class="small dark" href="#" id="more" style="display: none;">[+]more</a>

                    <br>

                    <p></p>
                </td>
            </tr>
        </tbody>
    </table>
</div>

Here is my PHP code:

$html = new simple_html_dom();

$html->load_file("LINK");    

foreach($html->find('div.show-summary table tbody tr') as $rowz){

     foreach($rowz->find('p') as $p){

        foreach($p->find('b') as $b){

            echo $b->innertext.'<br/>';
        }

    }
}

Running the code above gives the following output:

Release Year:

Genre: Drama, Fantasy, Horror

External Links:

No. of episodes:

Latest Episode:

Latest Episode With Links:

Description :

I want to create an array from the details in the image above.


Solution

  • Hello everyone, I now have a complete solution. After a lot of research I wrote this function, which does what I wanted. Please check it out:

     <?php
    
    function do_html_array($td,$dlm='<br>'){
        if(empty($td)) return false;
        $td = html_entity_decode($td);
        // Drop inline <script> blocks so their source does not leak into the text.
        $td = preg_replace('/<script\b[^>]*>(.*?)<\/script>/is', "", $td);
        $html_key_array = array();
        // Each "<b>Label: </b> value" pair ends with the delimiter.
        foreach(explode($dlm,$td) as $segment){
            // Split on the first colon only, so colons inside the value survive.
            $parts = explode(':',trim(strip_tags($segment)),2);
            $label = strtolower(trim($parts[0]));
            if($label=='') continue;
            // A segment without a colon has no value part.
            $text = isset($parts[1]) ? $parts[1] : '';
            if($label=='description') $text = str_ireplace('[+]more','',$text);
            $html_key_array[$label] = trim($text);
            // Keep the raw <a> tags for fields whose links matter.
            switch($label){
                case 'external links':
                    preg_match_all('~<a\s+.*?</a>~is',$segment,$html_key_array['imdb_link']);
                    break;
                case 'genre':
                    preg_match_all('~<a\s+.*?</a>~is',$segment,$html_key_array['genre_link']);
                    break;
                // further define here...
            }
        }
        return $html_key_array;
    }
    
    $td = '<td valign="top" style="padding:3px"><p><b>Release Year: </b>2007<br><b>Genre: <a href="/genres/Comedy">Comedy</a></b><br><b>External Links: </b> <a target="_blank" href="http://www.imdb.com/title/tt0898266/">IMDB</a>  <br><b>No. of episodes: </b> 178 episodes <br><b>Latest Episode: </b> <a href="/episode/big_bang_theory_s8_e16.html" title="Watch The Big Bang Theory Latest Episode (The Big Bang Theory Season 8 Episode 16)">Season 8 Episode 16 The Intimacy Acceleration (01/01/1970)</a><br><b>Latest Episode With Links: </b> <a href="/episode/big_bang_theory_s8_e13.html" title="Watch The Big Bang Theory Latest Episode (The Big Bang Theory Season 8 Episode 13)">Season 8 Episode 13 The Anxiety Optimization (15/01/2015)</a><br></p><div style="float: left; height: 30px; overflow: hidden; width: 100px;"><div data-show-faces="false" data-layout="button_count" data-send="false" data-href="http://watchseries.ag/serie/big_bang_theory" class="fb-like fb_iframe_widget" fb-xfbml-state="rendered" fb-iframe-plugin-query="app_id=434603673340441&amp;href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory&amp;layout=button_count&amp;locale=en_US&amp;sdk=joey&amp;send=false&amp;show_faces=false"><span style="vertical-align: bottom; width: 80px; height: 20px;"><iframe width="1000px" height="1000px" frameborder="0" name="f225e71df2e6d02" allowtransparency="true" scrolling="no" title="fb:like Facebook Social Plugin" style="border: medium none; visibility: visible; width: 80px; height: 20px;" src="http://www.facebook.com/plugins/like.php?app_id=434603673340441&amp;channel=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter%2FDU1Ia251o0y.js%3Fversion%3D41%23cb%3Df1f47ad29892336%26domain%3Dwatchseries.ag%26origin%3Dhttp%253A%252F%252Fwatchseries.ag%252Ff18c568fa0d51e4%26relation%3Dparent.parent&amp;href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory&amp;layout=button_count&amp;locale=en_US&amp;sdk=joey&amp;send=false&amp;show_faces=false" 
class=""></iframe></span></div></div><iframe frameborder="0" id="twitter-widget-1" scrolling="no" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.67ae45a68af44ab435dd5797206058d3.en.html#_=1422780550826&amp;count=horizontal&amp;dnt=false&amp;id=twitter-widget-1&amp;lang=en&amp;original_referer=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory&amp;size=m&amp;text=Watch%20The%20Big%20Bang%20Theory%20Serie%20Online%20-%20Watch%20Series&amp;url=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory" class="twitter-share-button twitter-tweet-button twitter-share-button twitter-count-horizontal" title="Twitter Tweet Button" data-twttr-rendered="true" style="width: 109px; height: 20px;"></iframe><script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?\'http\':\'https\';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+\'://platform.twitter.com/widgets.js\';fjs.parentNode.insertBefore(js,fjs);}}(document, \'script\', \'twitter-wjs\');</script><br clear="all"><b>Description :</b> A woman who moves into an apartment across the hall from two brilliant but socially awkward physicists shows them how little they know about life outside of the laboratory.<br><p></p></td>';
    
    $html_array = do_html_array($td);
    
    if($html_array){
        foreach($html_array as $key=>$value){
            if(is_array($value)){
                echo "<strong>$key</strong>:";
                foreach($value[0] as $link){
                    echo "$link , ";
                }
                echo "<br>--------------------------------<br>";
            }else{
                echo "<strong>$key</strong>: $value";
                echo "<br>--------------------------------<br>";
            }
        }
    }
    
    ?>
    

    The function above collects all of the text and stores it in an array as key/value pairs :)
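
    For comparison, here is a minimal sketch of another way to read the text after each <b> tag. It does not use Simple HTML DOM. It assumes PHP's built-in DOM extension (DOMDocument and DOMXPath) is available and walks the sibling nodes that follow each <b> until the next <br>. The function name extract_labelled_fields and the shortened sample markup are made up for illustration, and the sample is plain ASCII, so no charset handling is shown.

```php
<?php
// Hypothetical helper: pairs each <b> label with the text (and first link)
// that follows it, stopping at the next <br>.
function extract_labelled_fields(string $fragment): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($fragment);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $fields = [];

    foreach ($xpath->query('//b') as $b) {
        // The label is everything before the first colon inside the <b>.
        // Anything after the colon (e.g. the genre links) is already part of the value.
        $parts = explode(':', $b->textContent, 2);
        $label = strtolower(trim($parts[0]));
        if ($label === '') {
            continue;
        }
        $value = $parts[1] ?? '';
        $link = null;

        // Walk the nodes that come after the <b> until the line ends at a <br>.
        for ($node = $b->nextSibling; $node !== null; $node = $node->nextSibling) {
            $name = strtolower($node->nodeName);
            if ($name === 'br') {
                break;
            }
            if ($name === 'a' && $link === null) {
                $link = $node->getAttribute('href');
            }
            $value .= $node->textContent;
        }

        $fields[$label] = [
            'text' => trim(preg_replace('/\s+/', ' ', $value)),
            'link' => $link,
        ];
    }

    return $fields;
}

// Shortened sample based on the markup in the question.
$sample = <<<'HTML'
<p>
  <b>Release Year: </b>
  2009<br>
  <b>Genre: <a href="/genres/Drama">Drama</a>, <a href="/genres/Fantasy">Fantasy</a></b>
  <br>
  <b>External Links: </b>
  <a href="http://www.imdb.com/title/tt1405406/" target="_blank">IMDB</a>
  <br>
  <b>No. of episodes: </b>
  128 episodes <br>
</p>
HTML;

$fields = extract_labelled_fields($sample);
print_r($fields);
```

    Each key is the lower-cased label and each value holds the trimmed text plus the first link found on that line (or null). On the full page, the description line would still include the "[+]more" link text, so that would need the same cleanup as in the function above.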