Search code examples
phpsqlparsingrssfeed

parsing rss feed in php and dump to sql


found this floating script to dump given feed into sql. however unable to save feed description and images separately in table. below is the part of code which is currently saving some contents but with empty columns.

any help or advise would be appreciated.

    $db = mysql_connect($db_hostname,$db_username,$db_password);
if (!$db)
{
    die("Could not connect: " . mysql_error());
}
.....snip.............

EDIT#1:

studying simplepie script and its caching. will update should get it resolved tonight.

EDIT#2: SQL query which was modified to include description and screengrab for the table contents.

CREATE TABLE rss ( ............snip........................

enter image description here

EDIT #3: thx robert for correcting me yes it was a quick copy paste from the original script and havent put correct query init. i had updated sql query and get things fixed however now the moded script needs more juice by retrieving a image link along with the post. i am using xpath to get each post image link however not able to incorporate the respective image with each row dump. both scripts are running perfectly at their individual grounds but i need more help in merging them together.

        require_once("./config.php");

    .......snip................

EDIT#4: well it didnt work i mange to updated existing table with another script i upvoted your answer but havent accepted it as correct answer due to this reason. might i failed to explain properly but my script is working like a charm and just missing some CSS. thanks though for your help

NOTE: i snip original scripts from the post/


Solution

  • From your screenshot you're missing info inside "item_enclosure" and "item_status" columns, the rest is filled.

    Problem is that your query doesn't try to insert those missing entries

            $item_insert_sql = "INSERT INTO rssingest(item_id, feed_url, item_title, item_date, item_description, item_url, fetch_date) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $item_title . "', '" . $item_date . "', '" . $item_description . "', '" . $item_url . "', '" . $fetch_date . "')";
    

    ^ Will never insert "item_enclosure" and "item_status" since the 2 columns are not part of the insert query...

    You'd need to extract the 2 missing items and modify the insert query to:

            $item_insert_sql = "INSERT INTO rssingest(item_id, feed_url, item_title, item_date, item_description, item_url, fetch_date, item_enclosure, item_status) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $item_title . "', '" . $item_date . "', '" . $item_description . "', '" . $item_url . "', '" . $fetch_date . "', '" . $item_enclosure . "', '" . $item_status . "')";
    

    You'll need to first define:

    • $item_enclosure
    • $item_status

    Btw the query you pasted in your updated reply is wrong, it should start with "INSERT INTO" not "CREATE TABLE", you're not echoing the right query.

    UPDATE:

    In your original code there's this line

    $has_image = preg_match('/<img.+src=[\'"](?P<src>.+?)[\'"].*>/i', $RSSitem, $image);
    

    So right now if your code finds an image it should be putting it in an array called $image

    Try print_r of $image to see in which array item it's stored, I think it will be $image[0][0] or $image[0]

    So do the following to define the image URL (I'm guessing it's $image[0][0], but double check just in case):

    $img_url = (isset($image)) ? $image[0][0] : "";
    

    Then you'll need to update the insert query (i don't know what column name is used for the image so it's just an example):

        $item_insert_sql = "INSERT INTO rssingest(item_id, feed_url, item_title, item_date, item_description, item_url, fetch_date, item_enclosure, item_status, image_column) VALUES ('" . $item_id . "', '" . $feed_url . "', '" . $item_title . "', '" . $item_date . "', '" . $item_description . "', '" . $item_url . "', '" . $fetch_date . "', '" . $item_enclosure . "', '" . $item_status . "', '" . $img_url . "')";