
mysql slave replication failed


I have a setup where a DNS server stores its records in MySQL: all records are replicated from the master to a slave, and the slave is used for resolution. Replication broke after a MySQL server upgrade. The MySQL server stopped, and by the time it was recovered the binary log file name and log position on the master had changed. I know that if I point the slave at the new log file and position, replication will start again, but I would miss a lot of updates, and I can't afford that — every single update is important. What should I do to restart master-slave replication without losing any updates from the master? Here is some information from the slave status:

 Slave_IO_Running: No
 Slave_SQL_Running: Yes

Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file'

Thanks


Solution

  • You may be in for a nice surprise, but here it goes:

    Run SHOW SLAVE STATUS\G. For the sake of example, let's say you get this:

                 Slave_IO_State: Waiting for master to send event
                    Master_Host: 10.64.68.253
                    Master_User: replusername
                    Master_Port: 3306
                  Connect_Retry: 60
                Master_Log_File: mysql-bin.003202
            Read_Master_Log_Pos: 577991837
                 Relay_Log_File: relay-bin.010449
                  Relay_Log_Pos: 306229695
          Relay_Master_Log_File: mysql-bin.003202
               Slave_IO_Running: Yes
              Slave_SQL_Running: Yes
                Replicate_Do_DB:
            Replicate_Ignore_DB:
             Replicate_Do_Table:
         Replicate_Ignore_Table: 
        Replicate_Wild_Do_Table:
    Replicate_Wild_Ignore_Table: 
                     Last_Errno: 0
                     Last_Error:
                   Skip_Counter: 0
            Exec_Master_Log_Pos: 577991837
                Relay_Log_Space: 306229695
                Until_Condition: None
                 Until_Log_File:
                  Until_Log_Pos: 0
             Master_SSL_Allowed: No
             Master_SSL_CA_File:
             Master_SSL_CA_Path:
                Master_SSL_Cert:
              Master_SSL_Cipher:
                 Master_SSL_Key:
          Seconds_Behind_Master: 0
    

    Pick the following two values from the output:

    • Relay_Master_Log_File (mysql-bin.003202)
    • Exec_Master_Log_Pos (577991837)

    Here is why: Relay_Master_Log_File and Exec_Master_Log_Pos identify the last binlog entry from the Master that made it to the Slave and was executed successfully. Simply pick up from there.

    You would simply run this code:

    STOP SLAVE;
    CHANGE MASTER TO
    MASTER_LOG_FILE='mysql-bin.003202',
    MASTER_LOG_POS=577991837;
    START SLAVE;
    

    Give it a Try !!!

    CAVEAT

    If Relay_Master_Log_File no longer exists on the Master, you may have to do some damage control. Given the SHOW SLAVE STATUS\G mentioned before, you may have to skip to the next binary log on the Master as follows:

    STOP SLAVE;
    CHANGE MASTER TO
    MASTER_LOG_FILE='mysql-bin.003203',
    MASTER_LOG_POS=4;
    START SLAVE;
    
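    Before skipping ahead, it is worth confirming which binary logs the Master actually still has. A quick way to check (the host and user below are taken from the SHOW SLAVE STATUS example above; adjust them to your environment):

    ```shell
    # List the binary logs the Master still retains.
    # Host and replication user are the example values from SHOW SLAVE STATUS.
    mysql -h 10.64.68.253 -u replusername -p -e "SHOW BINARY LOGS;"

    # If mysql-bin.003202 is absent from the list but mysql-bin.003203 exists,
    # skipping to the next binlog as shown above is your only option short of
    # a full reload of the Slave.
    ```
    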

    If replication catches up, you are not out of the woods just yet. You may have to download Percona Toolkit and run pt-table-checksum and pt-table-sync to find and repair the data the Slave lost.
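    A rough sketch of that repair pass, assuming Percona Toolkit is installed and you run it against the Master (host, user, and password below are placeholders, not values from this setup):

    ```shell
    # 1. Checksum all tables on the Master. Results go into percona.checksums,
    #    which replicates down to the Slave so the two can be compared.
    pt-table-checksum --replicate=percona.checksums \
        h=10.64.68.253,u=replusername,p=secret

    # 2. First print the differences found, then actually sync the Slave
    #    from the Master.
    pt-table-sync --replicate=percona.checksums --print \
        h=10.64.68.253,u=replusername,p=secret
    pt-table-sync --replicate=percona.checksums --execute \
        h=10.64.68.253,u=replusername,p=secret
    ```

    Running with --print first lets you review the changes before --execute applies them.
    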

    If replication does not get off the ground, you will have to perform due diligence and reload the Slave.
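    Reloading the Slave usually means taking a consistent dump of the Master and re-seeding the Slave from it. A minimal sketch, assuming root credentials on both servers (the --master-data option embeds the matching binlog coordinates in the dump):

    ```shell
    # On the Master: dump all databases with binlog coordinates recorded
    # as a comment near the top of the dump file.
    mysqldump -u root -p --all-databases --single-transaction \
        --master-data=2 > master_dump.sql

    # On the Slave: stop replication, load the dump, then issue
    # CHANGE MASTER TO with the coordinates recorded in the dump.
    mysql -u root -p -e "STOP SLAVE;"
    mysql -u root -p < master_dump.sql
    grep -m1 "CHANGE MASTER" master_dump.sql   # shows the file/position to use
    ```
    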

    Hopefully, you may not have to do anything in this caveat if replication works with the original suggestion.