I am trying to extract data from a website using cURL and simple_html_dom. The data contains a timetable with lectures and teachers. The code normally works, but it gives an internal error 500.
function parse($curl){
    $html=new simple_html_dom();
    $html->load($curl);
    $legend=$html->find('div.mainpage',0)->children(6); // legend
    $table=$html->find('div.mainpage',0)->children(3); // table body
    echo $table->outertext;
    echo $legend->outertext;
    echo "<p>";
    foreach ($html->find('td.rozvrh-pred') as $subject){
        $subjecttextname=$subject->children(0)->children(2)->innertext;
        $subjecttextlecture=$subject->children(0)->children(5)->children(0)->innertext; // the internal error points to this line, at the call to children()
        echo $subjecttextname." : ".$subjecttextlecture."<br>";
    }
    echo "</p>";
}
Is there any way to fix this?
[UPDATE]
The data I am trying to access looks like this:
<td class="" align="left"><small></small></td><td width="18" colspan="2" align="center" class="rozvrh-pred">
<small>
<a href="../mistnosti/?zobrazit_mistnost=922;zpet=../katalog/rozvrhy_view.pl?rozvrh_student=79992,zobraz=1;lang=en">ab300 (BA-MD-FEI A-B)</a><br/>
<a href="../katalog/syllabus.pl?predmet=313986;zpet=../katalog/rozvrhy_view.pl?rozvrh_student=79992,zobraz=1;lang=en">Algebraic structures</a>
<sup>(1)</sup><br />
<i><a href="../lide/clovek.pl?id=733;zpet=../katalog/rozvrhy_view.pl?rozvrh_student=79992,zobraz=1;lang=en">TEACHER</a></i>
</small>
</td>
But how can I access the text values, for example "Algebraic structures" or "TEACHER"?
Test everything you get from simple_html_dom with is_object(). Example:
$html = str_get_html($str_html);
if(!is_object($html)) {
//Log error or return error
return false;
}
$mainpage=$html->find('div.mainpage',0);
if(!is_object($mainpage)) {
//Log error or return error
return false;
}
$legend=$mainpage->children(6);
if(!is_object($legend)) {
//Log error or return error
return false;
}
If a value is not an object and you keep calling simple_html_dom methods on it, PHP raises a fatal error every time, which is what typically shows up as the internal error 500.
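The same check can be applied to the loop from the question. What follows is only a rough sketch, assuming simple_html_dom is already loaded and that children($index) returns null when there is no child at that index. The names child_path() and print_subjects() are made up for this example and are not part of the library; the index paths are taken from the sample cell shown in the question.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): follows a chain of
// children() indexes and returns null as soon as one step is missing.
function child_path($node, array $indexes) {
    foreach ($indexes as $index) {
        if (!is_object($node)) {
            return null;
        }
        $node = $node->children($index);
    }
    return is_object($node) ? $node : null;
}

// Same loop as in the question, but cells that do not have the
// expected structure are skipped instead of causing a fatal error.
function print_subjects($html) {
    foreach ($html->find('td.rozvrh-pred') as $subject) {
        $name = child_path($subject, array(0, 2));       // <small> -> second <a> (subject name)
        $teacher = child_path($subject, array(0, 5, 0)); // <small> -> <i> -> <a> (teacher)
        if ($name === null || $teacher === null) {
            continue; // e.g. a cell without a teacher link
        }
        echo $name->innertext . " : " . $teacher->innertext . "<br>";
    }
}
```

Inside parse() you would call print_subjects($html) in place of the foreach. Skipping incomplete cells is just one option; you could also print the subject name alone or log the cell for inspection.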