Search code examples
phphtmlmysqlweb-scraping

How to add data off HTML page into MySQL DB


How can I add data from a HTML page, into a MySQL Database based on the attributes?

It's already scraped data, but I would like to import links into a particular field in a table and remove some things from them (ill work that out) and another from into another field in a table.

I have PHP/MySQL and Linux. Should I use curl, and if so how do I actually add data into a MySQL DB?


Solution

  • Some PHP example to Insert and update data:

    //***************************************   
    // Connect to database
    //
    mysql_connect('127.0.0.1','MyUserName','MyPassword',false,MYSQL_CLIENT_SSL|MYSQL_CLIENT_COMPRESS);
    mysql_select_db('MyDatabase');
    
    // If you work with UTF-8 it would be a good idea to set the character set as well to be sure.
    //mysql_set_charset('utf8_general_ci');
    
    //***************************************   
    // Insert new data
    //
    $MyURL = mysql_real_escape_string("http://www.exampledomain.com/product/");
    $Result = mysql_query("INSERT INTO ProductTable (URLField) VALUES ('".$MyURL."')");
    if($Result)
     print mysql_affected_rows();
    
    //***************************************   
    // Update existing data
    //
    $MyURL = mysql_real_escape_string("http://www.exampledomain.com/newproduct/");
    $RecordID = 123;
    $Result = mysql_query("UPDATE ProductTable SET URLField='".$MyURL."' WHERE ID=".$RecordID);
    if($Result)
     print mysql_affected_rows();
    

    Connect to the MySQL Server with mysql_connect() and use mysql_select_db() to select the database. I'm not native English and use UTF-8 to get all the special character correct. If you have no need of this you can ignore this line.

    All data that goes into a SQL server should be be sanitized, meaning escaping control characters such as quotes. The Variable $MyURL is sanitized with mysql_real_escape_string() before it is used in the SQL statement.

    The SQL statement is executed with mysql_query() and returns true or false (for INSERT and UPDATE statements). With mysql_affected_rows() you can see how many rows that was affected by the SQL statement, a way to see if it worked as expected.

    Next comes an UPDATE example to change data in a single column and/or row. The $RecordID variable is the record ID you want to update (you need to know what record you want to update). This example is pinpointing a single record. By changing the WHERE clausule you can update a whole bunch of rows at the same time. For example

    UPDATE ProductTable SET URLField='".$MyURL."' WHERE URLField='http://www.exampledomain.com/oldproduct/'
    

    ...will update all rows that have 'http://www.exampledomain.com/oldproduct/' in the field URLField.

    I think this will get you going for a while...