php, mysql, debugging, repair

Find "problematic" rows in a mysql table that will fail to export


I wanted to back up my database with PHP.

I tested the linked script, but it never finished. I tried running REPAIR TABLE on each table before the query, but that didn't help.

Then I figured out that if I just skip two tables (you can see which ones in the code), it works fine:

<?

error_reporting(E_ALL);
ini_set('display_errors',1);
require('../basedatos.php');

echo 'included<br>';
/* backup the db OR just a table */
function backup_tables($host,$user,$pass,$name,$tables = '*')
{


    echo '1<br>';
    //get all of the tables
    if($tables == '*')
    {
        $tables = array();
        $result = mysql_query('SHOW TABLES') or die(mysql_error());
        while($row = mysql_fetch_row($result))
        {
            $tables[] = $row[0];
        }
    }
    else
    {
        $tables = is_array($tables) ? $tables : explode(',',$tables);
    }
    echo '2<br>';
    //cycle through
    foreach($tables as $table)
    {
        if($table == 'etiquetas' || $table == 'links') continue;
        $repair = mysql_query("REPAIR table $table") or die(mysql_error());
        echo '3- '.$table.'<br>';
        $result = mysql_query('SELECT * FROM '.$table) or die(mysql_error());
        $num_fields = mysql_num_fields($result);

        $return.= 'DROP TABLE '.$table.';';
        $row2 = mysql_fetch_row(mysql_query('SHOW CREATE TABLE '.$table))  or die(mysql_error());
        $return.= "\n\n".$row2[1].";\n\n";

        for ($i = 0; $i < $num_fields; $i++) 
        {
            while($row = mysql_fetch_row($result))
            {
                $return.= 'INSERT INTO '.$table.' VALUES(';
                for($j=0; $j<$num_fields; $j++) 
                {
                    $row[$j] = addslashes($row[$j]);
                    $row[$j] = ereg_replace("\n","\\n",$row[$j]);
                    if (isset($row[$j])) { $return.= '"'.$row[$j].'"' ; } else { $return.= '""'; }
                    if ($j<($num_fields-1)) { $return.= ','; }
                }
                $return.= ");\n";
            }
        }
        $return.="\n\n\n";

    }
    echo '4<br>';
    //save file
    $handle = fopen('db-backup-'.time().'-'.(md5(implode(',',$tables))).'.sql','w+');
    fwrite($handle,$return);
    fclose($handle);
}
backup_tables('localhost','username','password','*');
?>

Is there any way to find the rows that are giving me a problem so I can edit/delete them?

-PS-

Also, I don't get any errors if I don't skip them; the script just never gets to the end (that's why I added some ugly logs). Any idea why?

-EDIT-

Also, if I try to export the database via, for example, sqlBuddy I also get errors:

[image: sqlBuddy error message]


Solution

  • As many have said, this script (and the whole "MySQL dump via PHP" approach) is far from optimal, but it is still better than no backup at all.

    Since you can only use PHP to access the database, you should use it to find out what is going wrong.

    Here is an adaptation of your script that dumps a single table to a file. It's a debug script, not a production export tool (though do what you want with it), which is why it prints a debug line after saving every single row of the table.

    As suggested by Amit Kriplani, data is appended to the destination file at each iteration. I don't think PHP memory is your problem, though: if you ran out of memory you should get a PHP error, or at least the server should return an HTTP 500, rather than the script running forever.

    function progress_export( $file, $table, $idField, $offset = 0, $limit = 0 )
    {
    
        debug("Starting export of table $table to file $file");
    
        // empty the output file
        file_put_contents( $file, '' );
        $return = '';
    
        
        debug("Dumping schema");
    
        $return.= 'DROP TABLE '.$table.';';
        $row2 = mysql_fetch_row(mysql_query("SHOW CREATE TABLE $table"));
        $return.= "\n\n".$row2[1].";\n\n";
    
        
        file_put_contents( $file, $return, FILE_APPEND );
    
        debug("Schema saved to $file");
    
    
    
    
        $return = '';
    
        debug( "Querying database for records" );
    
        $query = "SELECT * FROM $table ORDER BY $idField";
    
        // offset/limit are optional, for further debugging.
        // Only $limit is tested: an offset of 0 is valid and must not disable the LIMIT clause.
        if ( $limit )
        {
            $query .= " LIMIT ".(int) $offset.", ".(int) $limit;
        }
    
        // stop with the MySQL error message if the query itself fails
        $result = mysql_query($query) or die(mysql_error());
        
        $i = 0;
        while( $data = mysql_fetch_assoc( $result ) )
        {
            // Be verbose, so that we can at least see where something goes wrong
            debug( "Exporting row #".$data[$idField].", rows offset is $i...");
    
            $return = "INSERT INTO $table (`". implode('`, `', array_keys( $data ) )."`) VALUES (";
            $coma = '';
    
            foreach( $data as $column )
            {
                $return .= $coma. "'". mysql_real_escape_string( $column )."'";
                $coma = ', ';
            }
    
            $return .=");\n";
    
            file_put_contents( $file, $return, FILE_APPEND );
    
            debug( "Row #".$data[$idField]." saved");
    
            $i++;
            
            // Be sure to send data to the browser:
            // ob_flush() empties PHP's output buffer (if any), flush() pushes it to the client
            @ob_flush();
            flush();
        }
    
        debug( "Completed export of $table to file $file" );
    }
    
    
    
    function debug( $message )
    {
        echo '['.date( "H:i:s" )."] $message <br/>";
    }
    
    
    // Update these settings:
    
    $host = 'localhost';
    $user = 'user';
    $pass = 'pwd';
    $base = 'database';
    
    // Primary key, so that we know in which order records are sorted
    $idField = "id"; 
    
    $table   = "etiquetas";
    
    // prepend a writable directory if needed
    $file = "$table.sql";
    
    
    $link = mysql_connect($host,$user,$pass);
    mysql_select_db($base,$link); 
    
    
    
    
    // Launching the script
    progress_export( $file, $table, $idField );
    

    Edit the settings at the end of the script and run it against one of your two tables.

    You should see output while the script is still processing, with a reference to each row being processed, like this:

    [23:30:13] Starting export of table ezcontentobject to file ezcontentobject.sql

    [23:30:13] Dumping schema

    [23:30:13] Schema saved to ezcontentobject.sql

    [23:30:13] Querying database for records

    [23:30:13] Exporting row #4, rows offset is 0...

    [23:30:13] Row #4 saved

    [23:30:13] Exporting row #10, rows offset is 1...

    [23:30:13] Row #10 saved

    [23:30:13] Exporting row #11, rows offset is 2...

    [23:30:13] Row #11 saved

    [23:30:13] Exporting row #12, rows offset is 3...

    [23:30:13] Row #12 saved

    [23:30:13] Exporting row #13, rows offset is 4...

    [23:30:13] Row #13 saved

    etc.

    If the script completes...

    Well, you will have a backup of your table (beware, I did not test the generated SQL)!

    I guess it won't complete, though:

    If the script does not reach the first "Exporting row..." debug statement

    then the problem occurs at query time.

    You should then try to restrict the query with the offset and limit parameters, and proceed by dichotomy (repeatedly halving the range) to find out where it hangs.

    Example generating a query limited to the first 1000 results:

    // Launching the script
    progress_export( $file, $table, $idField, 0, 1000 );
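
The dichotomy can be written down as a small helper. This is only a sketch under stated assumptions: `find_first_bad_offset` and its `$prefixExports` callback are hypothetical names, not part of the script above, and the sketch assumes that exporting a prefix of the table fails as soon as it includes the problematic row. In practice each probe would be a manual run of `progress_export( $file, $table, $idField, 0, $n )`, noting whether it completes. It uses PHP 7 syntax (type hints, `intdiv`).

```php
<?php
/**
 * Find the first row offset whose export fails, by bisection.
 *
 * $total         number of rows in the table (e.g. from SELECT COUNT(*))
 * $prefixExports hypothetical callback: given a row count $n, returns true
 *                if exporting the first $n rows (LIMIT 0, $n) completes
 *
 * Returns the 0-based offset of the first bad row, or -1 if everything exports.
 */
function find_first_bad_offset(int $total, callable $prefixExports): int
{
    if ($total === 0 || $prefixExports($total)) {
        return -1; // the whole table exports, nothing to find
    }

    $good = 0;      // exporting the first $good rows is known to work
    $bad  = $total; // exporting the first $bad rows is known to fail

    while ($bad - $good > 1) {
        $mid = intdiv($good + $bad, 2);
        if ($prefixExports($mid)) {
            $good = $mid;
        } else {
            $bad = $mid;
        }
    }

    // The first $good rows export, the first $good + 1 rows do not:
    // the row at 0-based offset $good is the first one that breaks.
    return $good;
}

// Simulated table of 1000 rows in which the row at offset 637 is "bad":
// a prefix of $n rows exports only while it does not include that row.
$simulated = function (int $n): bool {
    return $n <= 637;
};

echo find_first_bad_offset(1000, $simulated), "\n"; // prints 637
```

With 1000 rows this needs about ten probes instead of up to a thousand. If the hang depends on the amount of data rather than on one row, the offset found this way will move when the starting offset changes, which is what the checks further down are meant to reveal.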
    

    If the script shows some rows being exported before hanging

    Before blaming the last row ID displayed, you should try to:

    1. Run the script again, to see if it hangs on the same row. This tells us whether we are facing a "random" issue (it is never really random).

    2. Add an offset to the function call (see the optional parameters) and run the script a third time, to see if it still hangs on the same row.

    For example, with 50 as the offset and some big number as the limit:

    // Launching the script
    progress_export( $file, $table, $idField, 50, 600000 );
    

    This checks whether the row itself is causing the issue, or whether it is a matter of reaching a critical number of rows or amount of data.

    • If the same last row comes back every time, please inspect it and give us feedback.

    • If adding an offset changes the last processed row in a predictable way, we are likely facing a resource issue somewhere.

    The solution will then be to split the export into chunks, if you can't change the allocated resources. You can accomplish this with a script close to this one that outputs some HTML/JavaScript to redirect to itself, with offset and limit as parameters, until the export is finished (I'll edit the answer if that is what we eventually need).

    • If the row changes almost every time, it's going to be more complicated...
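
The chunked export mentioned above could be driven by something like the following. It is a rough sketch, not the final script: `next_chunk_url`, `redirect_snippet`, the `export.php` file name and the `offset`/`limit` query parameters are all hypothetical, and it uses an HTML meta refresh instead of JavaScript for the redirect. It uses PHP 7.1 syntax (nullable return type).

```php
<?php
/**
 * Build the URL of the next chunk, or return null when the export is done.
 *
 * $offset and $limit describe the chunk that has just been exported;
 * $totalRows is the row count of the table (e.g. from SELECT COUNT(*)).
 */
function next_chunk_url(string $script, int $offset, int $limit, int $totalRows): ?string
{
    $next = $offset + $limit;
    if ($next >= $totalRows) {
        return null; // the chunk just exported was the last one
    }
    return $script . '?' . http_build_query(['offset' => $next, 'limit' => $limit]);
}

/**
 * HTML fragment that makes the browser load the next chunk immediately.
 */
function redirect_snippet(string $url): string
{
    $safe = htmlspecialchars($url, ENT_QUOTES);
    return '<meta http-equiv="refresh" content="0;url=' . $safe . '">';
}

// After exporting rows 0..999 of a 2500-row table, point the browser
// at the chunk starting at offset 1000.
$url = next_chunk_url('export.php', 0, 1000, 2500);
echo $url === null ? "Export finished\n" : redirect_snippet($url) . "\n";
```

Each request would read `offset` and `limit` from the query string, export that slice, then print the snippet. The output file and the schema dump should only be (re)created on the first chunk, otherwise every request would wipe what the previous ones wrote.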

    Some clues:

    I don't have any experience with VPSs, but don't you have some limits on CPU usage?

    Could it be that your process gets queued in some way if you use too many resources at once?

    What about the tables that are dumped without issue? Are any of them as large as the two causing the issue?
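
Before looking at the VPS itself, it may help to rule out PHP's own limits. The sketch below only reports PHP-side settings and usage; `php_resource_report` is a hypothetical helper name, and it says nothing about CPU quotas imposed by the host.

```php
<?php
/**
 * Collect the PHP-side limits and current usage that can cut a
 * long-running script short.
 */
function php_resource_report(): array
{
    return [
        'max_execution_time' => ini_get('max_execution_time'), // seconds, 0 = unlimited
        'memory_limit'       => ini_get('memory_limit'),       // e.g. "128M", -1 = unlimited
        'memory_used_bytes'  => memory_get_usage(true),
        'memory_peak_bytes'  => memory_get_peak_usage(true),
    ];
}

foreach (php_resource_report() as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```

Printing this report every few hundred rows during the export would show whether memory keeps growing or whether the script stops close to the execution time limit.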