I'm quite new to MongoDB and a bit confused. I'm trying to create an automated backup routine on AWS to be used in production, and I want to make sure I'm doing it correctly.
So far I have set up a replica set with 1 arbiter and 3 members (1 primary, 1 secondary, 1 hidden and delayed by 4 hours). Each member has 3 separate EBS volumes (data 100GB, journal 20GB, log 10GB).
I created a Lambda function in Node.js that runs every hour (triggered by a CloudWatch Event) to take a snapshot. It performs the following operations:
First, using MongoClient, it connects to the hidden delayed member: mongodb://admin:[email protected]:27017/admin
Then it locks the instance with db.command({ fsync: 1, lock: true })
Next, it takes a snapshot of the EBS volumes tagged with LambdaSnapshot:
const AWS = require("aws-sdk");
const moment = require("moment");

const ec2Conn = new AWS.EC2({ region: "us-west-1" });
// Select every volume that carries the LambdaSnapshot tag key.
const params = {
  Filters: [
    {
      Name: "tag-key",
      Values: ["LambdaSnapshot"],
    },
  ],
};
const volumes = (await ec2Conn.describeVolumes(params).promise()).Volumes;
const volumeIds = volumes.map((volume) => volume.VolumeId);
// Start one snapshot per volume and collect the snapshot IDs.
return Promise.all(
  volumeIds.map(async (volumeId) => {
    const formattedDate = moment().format("DD/MM/YYYY HH:mm:ss");
    const snapshot = await ec2Conn
      .createSnapshot({
        Description: "Snapshot " + volumeId + " taken on " + formattedDate,
        VolumeId: volumeId,
        TagSpecifications: [
          {
            ResourceType: "snapshot",
            Tags: [
              {
                Key: "Name",
                Value: volumeId + " " + formattedDate,
              },
              {
                // snapshotTag is not defined in this excerpt; it is assumed
                // to be declared elsewhere in the function.
                Key: snapshotTag,
                Value: volumeId + " " + formattedDate,
              },
            ],
          },
        ],
      })
      .promise();
    return snapshot.SnapshotId;
  })
);
Finally, it unlocks the instance with db.command({ fsyncUnlock: 1 })
I have 2 main doubts.
1. I tagged with LambdaSnapshot only the EBS volume containing the data (the 100GB one) of the hidden delayed member. I'm not sure whether I have to snapshot the journal and log volumes as well.
2. Even though I use await to run the command that creates the snapshot, the function continues anyway and unlocks the instance while the snapshot is still in the pending state. I'm not sure, but I think createSnapshot() only tells AWS to start the snapshot and resolves the promise without waiting for it to complete. So I'm unsure whether I have to unlock the db outside the Lambda function once the snapshot completes; in that case I don't know how to listen for the completion event in order to run a second Lambda function that unlocks the db.
Thanks in advance
As stated in the docs, EBS snapshot creation is asynchronous:
Snapshots occur asynchronously; the point-in-time snapshot is created immediately, but the status of the snapshot is pending until the snapshot is complete (when all of the modified blocks have been transferred to Amazon S3), which can take several hours for large initial snapshots or subsequent snapshots where many blocks have changed. While it is completing, an in-progress snapshot is not affected by ongoing reads and writes to the volume.
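Going by that quote, the lock only needs to be held until createSnapshot returns, not until the snapshot leaves the pending state, because later writes do not affect an in-progress snapshot. A minimal sketch of that ordering, assuming db is a connected Node.js driver Db handle on the hidden member and ec2 is an AWS SDK v2 EC2 client; the function name snapshotWhileLocked is hypothetical:

```javascript
// Sketch only: `db` and `ec2` are supplied by the caller (assumptions, see above).
async function snapshotWhileLocked(db, ec2, volumeIds) {
  // Flush pending writes to disk and block further writes.
  await db.command({ fsync: 1, lock: true });
  try {
    // Each call resolves once the point-in-time snapshot has been started;
    // the returned snapshot is normally still in the "pending" state.
    const snapshots = await Promise.all(
      volumeIds.map((volumeId) =>
        ec2
          .createSnapshot({
            VolumeId: volumeId,
            Description:
              "Snapshot " + volumeId + " taken on " + new Date().toISOString(),
          })
          .promise()
      )
    );
    return snapshots.map((snapshot) => snapshot.SnapshotId);
  } finally {
    // Always release the lock, even if a createSnapshot call failed.
    await db.command({ fsyncUnlock: 1 });
  }
}
```

If you still want to know when the snapshots finish (for alerting, say), SDK v2 has a waiter, ec2.waitFor("snapshotCompleted", { SnapshotIds: ids }).promise(), which can run after the unlock so the database is not held locked while blocks are copied to S3.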
The MongoDB backup documentation says:
To get a correct snapshot of a running mongod process, you must have journaling enabled and the journal must reside on the same logical volume as the other MongoDB data files. Without journaling enabled, there is no guarantee that the snapshot will be consistent or valid.
Unless you have another documentation reference saying you do NOT need to back up log or journal data with all the other data, I suggest backing up everything together.
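One possible way to back up everything together is the EC2 CreateSnapshots call, which starts crash-consistent snapshots of all EBS volumes attached to one instance in a single request. The sketch below is an illustration, not something taken from the question: it assumes an AWS SDK v2 EC2 client passed in as ec2 and that you know the instance ID of the hidden delayed member; the function name snapshotAllVolumes and the description text are hypothetical. You would still wrap the call in the fsync lock and unlock from the question.

```javascript
// Sketch only: `ec2` is an AWS SDK v2 EC2 client supplied by the caller.
async function snapshotAllVolumes(ec2, instanceId) {
  const result = await ec2
    .createSnapshots({
      InstanceSpecification: {
        InstanceId: instanceId,
        // Assumption: the root volume is not needed in the MongoDB backup.
        ExcludeBootVolume: true,
      },
      Description: "MongoDB data, journal and log volumes of " + instanceId,
      // Copy the tags of each source volume onto its snapshot.
      CopyTagsFromSource: "volume",
    })
    .promise();
  // Map each started snapshot back to the volume it was taken from.
  return result.Snapshots.map((snapshot) => ({
    volumeId: snapshot.VolumeId,
    snapshotId: snapshot.SnapshotId,
  }));
}
```

Alternatively, adding the LambdaSnapshot tag to the journal and log volumes lets the existing describeVolumes filter pick up all three volumes without changing the code.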