I'm scraping data using Simple HTML DOM, but I'm seeing some odd behaviour: the loop prints "test" 6 times, yet only 3 records are added. Why does it loop 6 times but insert only 3 rows?
include('simple_html_dom.php');
$html = file_get_html("http://www.dailydot.com/tags/counter-strike/");
foreach($html->find("//li[@class='span4']") as $element) {
    echo "test";
    $title = strip_tags($element->find("//a[@class='article-title']/h3", 0));
    $img = $element->find("//div[@class='picfx']/a/img[@class='lzy-ld']", 0)->getAttribute('data-original');
    $link = $element->find("//a[@class='article-title']", 0)->href;
    $date = $element->find("//p[@class='byline']/time", 0)->datetime;
    mysqli_query($con, "INSERT INTO news (`title`, `url`, `image_url`, `news_text`, `referer_img`) VALUES ('$title', '$link', '$img', '$full_text_strip', 'test')");
}
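Probably because the insert fails 3 times :D Those inserts aren't safe against SQL injection. You should escape each value with mysqli_real_escape_string. If you don't, the query will fail whenever one of your variables contains a single quote (and it also lets an attacker inject SQL commands). Printing mysqli_error($con) after each query will show you why an insert failed.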
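To see why a single quote breaks the insert, here is a minimal sketch that builds the query string the same way the question does, reduced to one column. The headline text is a made-up example, not something taken from the scraped page.

```php
<?php
// Hypothetical scraped headline containing an apostrophe.
$title = "Valve's new update";

// Same interpolation pattern as in the question, reduced to one column.
$sql = "INSERT INTO news (`title`) VALUES ('$title')";

// Print the statement that would be sent to MySQL.
echo $sql, PHP_EOL;

// The apostrophe in the headline ends the string literal early, so the
// statement contains an odd number of quotes and MySQL rejects it.
```

Escaping the value, as in the code that follows, turns the quote into \' so the string literal stays intact.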
include('simple_html_dom.php');
$html = file_get_html("http://www.dailydot.com/tags/counter-strike/");
foreach($html->find("//li[@class='span4']") as $element) {
    $title = mysqli_real_escape_string($con, strip_tags($element->find("//a[@class='article-title']/h3", 0)));
    $img = mysqli_real_escape_string($con, $element->find("//div[@class='picfx']/a/img[@class='lzy-ld']", 0)->getAttribute('data-original'));
    $link = mysqli_real_escape_string($con, $element->find("//a[@class='article-title']", 0)->href);
    $date = mysqli_real_escape_string($con, $element->find("//p[@class='byline']/time", 0)->datetime);
    mysqli_query($con, "INSERT INTO news (`title`, `url`, `image_url`, `news_text`, `referer_img`) VALUES ('$title', '$link', '$img', '$full_text_strip', 'test')");
    echo "test ".mysqli_error($con);
}
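As an alternative to escaping every value by hand, a prepared statement keeps the scraped values out of the SQL text altogether. The sketch below is only an illustration: insertNews() is a hypothetical helper name, and it assumes $con is an open mysqli connection and that the news table has the columns used in the question. The $full_text_strip variable is not defined in the snippets shown, so it is assumed to be set elsewhere and is passed in here as $text.

```php
<?php
// Sketch only: insertNews() is a hypothetical helper, not part of the original code.
// Assumes $con is an open mysqli connection and the `news` table from the question.
function insertNews(mysqli $con, string $title, string $link, string $img, string $text): bool
{
    // The ? placeholders are filled in by the driver, so quotes in the
    // scraped values cannot break or alter the statement.
    $stmt = mysqli_prepare(
        $con,
        "INSERT INTO news (`title`, `url`, `image_url`, `news_text`, `referer_img`)
         VALUES (?, ?, ?, ?, 'test')"
    );
    if ($stmt === false) {
        // Reached only when mysqli is not configured to throw exceptions;
        // mysqli_error($con) then holds the reason.
        return false;
    }

    // 'ssss' declares four string parameters, bound in placeholder order.
    mysqli_stmt_bind_param($stmt, 'ssss', $title, $link, $img, $text);
    $ok = mysqli_stmt_execute($stmt);
    mysqli_stmt_close($stmt);

    return $ok;
}
```

Inside the foreach loop you would call insertNews($con, $title, $link, $img, $full_text_strip) with the unescaped values and echo mysqli_error($con) whenever it returns false.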