We have been using AWS RDS for many years without any issues. A few days ago we made various changes to the main DB which caused the read replica to struggle to catch up. This is not unusual, and our standard response is simply to delete the read replica and spin up a new one.
However, since this latest rebuild we can no longer connect to the new replica via mysqli from PHP. Connecting to it via SQLyog works fine, but that is of little use since the replica is accessed through PHP.
The rebuild took the usual time: backup of the main database, creation of the replica, and then a few hours for the replica to catch up. Replica lag has been close to zero for days now, yet we still cannot connect from PHP (the connection times out). We reused the same DB name.
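For reference, the connection attempt is nothing exotic; a minimal test script along these lines reproduces the timeout (the endpoint, credentials and database name below are placeholders, not our real values):

```php
<?php
// Minimal reproduction of the failing replica connection.
// Endpoint, credentials and database name are placeholders.
mysqli_report(MYSQLI_REPORT_OFF); // handle errors manually so we can log them

$replica = mysqli_init();
// Fail fast instead of waiting for the default TCP timeout
$replica->options(MYSQLI_OPT_CONNECT_TIMEOUT, 5);

$ok = $replica->real_connect(
    'replica-instance.xxxxxxxx.us-east-1.rds.amazonaws.com', // read replica endpoint
    'app_user',
    'app_password',
    'app_db',
    3306
);

if (!$ok) {
    // Against the replica this logs "Connection timed out"; the master connects fine.
    error_log('Replica connect failed: ' . mysqli_connect_error());
} else {
    echo 'Connected, server version: ' . $replica->server_info . PHP_EOL;
    $replica->close();
}
```

The same script pointed at the master endpoint connects immediately, which is why we suspect something on the replica side rather than the PHP configuration.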
The main instance is an m3.2xlarge, the replica an m3.xlarge. We have always run this setup (we have reserved instances for both) and it has always worked fine. However, we have also tried the same process with an m3.2xlarge replica just for testing, and that didn't work either. All tables are InnoDB.
We have done this many times in the past without any issues. Any ideas on what could suddenly be different?
After about ten days the problem resolved itself and everything went back to normal. We have not been able to recreate the issue since, so we assumed it was just a temporary glitch.
The same thing happened again today. The "trick" seems to be to add a throwaway rule to the security group so that it refreshes, as sketched below. We use the same security group for both the master and the replica.
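In practice we just do this by hand in the console (add a dummy inbound rule, then remove it), but for anyone wanting to script the workaround, this is roughly what it looks like, a sketch assuming the AWS SDK for PHP v3 and a VPC security group; the group ID, region and CIDR are placeholders:

```php
<?php
// Sketch of the security-group "refresh" workaround, assuming the AWS SDK for PHP v3.
// Group ID, region and CIDR are placeholders; we normally do this in the console.
require 'vendor/autoload.php';

use Aws\Ec2\Ec2Client;

$ec2 = new Ec2Client([
    'region'  => 'us-east-1',
    'version' => 'latest',
]);

// A throwaway inbound rule on the shared master/replica security group.
$dummyRule = [
    'GroupId'       => 'sg-0123456789abcdef0',
    'IpPermissions' => [[
        'IpProtocol' => 'tcp',
        'FromPort'   => 3306,
        'ToPort'     => 3306,
        'IpRanges'   => [['CidrIp' => '203.0.113.10/32', 'Description' => 'temporary refresh rule']],
    ]],
];

// Adding the rule is what seems to "refresh" the group and unblock the replica...
$ec2->authorizeSecurityGroupIngress($dummyRule);

// ...after which the rule can be removed again.
$ec2->revokeSecurityGroupIngress($dummyRule);
```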