
Moving files from S3 Glacier to S3 Standard with Node.js: restoreObject reports invalid XML


I have successfully used copyObject to move a file from S3 Standard to S3 Glacier. Now I am trying to use restoreObject to be able to move the file back from S3 Glacier to S3 Standard.

I created the params variable to pass to restoreObject based on what I found in the restoreObject documentation, but when I call restoreObject I get:

The XML you provided was not well-formed or did not validate against our published schema.

Here is the definition I am using:

const glacierRestore = {
    Bucket: process.env.AWS_BUCKET,
    Key: 'file name goes here',
    RestoreRequest: {
        OutputLocation: {
            S3: {
                BucketName: process.env.AWS_BUCKET,
                Prefix: 'X',
                StorageClass: 'STANDARD'
            }
        },
        Tier: 'Standard'
    }
};

What am I missing? I think I have all the required fields for what I want to do. Is there somewhere I can validate my XML that will give me more feedback than just that it is not valid?

I realize this is JSON, but the SDK must convert it to XML.


Solution

  • I finally found a way to do what I wanted, and I am documenting it here in case it helps anyone else trying to do something similar.

    First, I cannot use OutputLocation in the RestoreRequest unless I am doing a select request. I wanted a plain file restore, so I had to do it differently.
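
    As a rough illustration of that point, a plain (non-select) restore request might be built as in the sketch below. This is a minimal sketch, not necessarily the exact request used here: the helper name buildRestoreParams is hypothetical, and the default values for days and tier are assumptions chosen for the example.

    ```javascript
    // Build restoreObject params for a plain archive restore.
    // No OutputLocation is sent, since that field belongs to select requests.
    // NOTE: the default `days` and `tier` values are illustrative assumptions.
    function buildRestoreParams(bucket, key, days = 1, tier = 'Standard') {
        return {
            Bucket: bucket,
            Key: key,
            RestoreRequest: {
                Days: days,                           // lifetime of the temporary copy
                GlacierJobParameters: { Tier: tier }  // 'Expedited', 'Standard' or 'Bulk'
            }
        };
    }

    // Usage with the AWS SDK for JavaScript v2 (assumed to be installed):
    // const AWS = require('aws-sdk');
    // const s3 = new AWS.S3();
    // s3.restoreObject(buildRestoreParams(process.env.AWS_BUCKET, 'file name goes here'))
    //     .promise()
    //     .then(() => console.log('restore requested'));
    ```

    Keeping the params in a small function makes it easy to log or inspect the request before it is sent.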

    I used s3.headObject() to get information about the file so I could determine its state. When the file is in the Standard storage class, I would get back something like this:

    {
      AcceptRanges: 'bytes',
      LastModified: 2020-06-03T20:53:28.000Z,
      ContentLength: 147451,
      ETag: '"20573fb94e8c715dee562ce04b795708"',
      ContentType: 'application/octet-stream',
      Metadata: {}
    }
    

    Once I had moved it to the Glacier storage class, I would get something like:

    {
      AcceptRanges: 'bytes',
      LastModified: 2020-06-03T20:56:11.000Z,
      ContentLength: 147451,
      ETag: '"20573fb94e8c715dee562ce04b795708"',
      ContentType: 'application/octet-stream',
      Metadata: {},
      StorageClass: 'GLACIER'
    }
    

    Now a StorageClass field is added to the returned info. Now comes the fun part. I used s3.restoreObject() to get the file from Glacier, but this only creates a temporary copy that is kept for a limited number of days. After that period the temporary copy is deleted and only the Glacier copy remains. Since I wanted a copy in Standard and no copy in Glacier, I had to figure out when each file had been restored so that I could copy it to Standard and delete it from Glacier.

    The problem is that Glacier by default takes 3-5 hours before a file is available to be copied to Standard, so I had to create a process to handle this. The process checks every 5 minutes to see whether any files have been temporarily restored, so that I can copy them and delete the Glacier copy. While a file is being retrieved from Glacier into its temporary copy, s3.headObject() for the file returns something like:

    {
      AcceptRanges: 'bytes',
      Restore: 'ongoing-request="true"',
      LastModified: 2020-06-03T20:56:11.000Z,
      ContentLength: 147451,
      ETag: '"20573fb94e8c715dee562ce04b795708"',
      ContentType: 'application/octet-stream',
      Metadata: {},
      StorageClass: 'GLACIER'
    }
    

    Now the file has a Restore field with ongoing-request="true", which let me know that the file was in the process of being restored from Glacier. Once the file had been restored from Glacier (temporarily), s3.headObject() returned something like:

    {
      AcceptRanges: 'bytes',
      Restore: 'ongoing-request="false", expiry-date="Sun, 07 Jun 2020 00:00:00 GMT"',
      LastModified: 2020-06-03T20:56:11.000Z,
      ContentLength: 147451,
      ETag: '"20573fb94e8c715dee562ce04b795708"',
      ContentType: 'application/octet-stream',
      Metadata: {},
      StorageClass: 'GLACIER'
    }
    

    Now Restore has ongoing-request="false", so I know that the restore is complete and I can copy the file to Standard with s3.copyObject() and delete the file from Glacier.
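
    The check described above could be sketched roughly as follows. This is a hedged example rather than the exact process I run: parseRestore and promoteIfRestored are hypothetical helper names, and s3 is assumed to be an AWS SDK for JavaScript v2 S3 client. It also assumes the object is copied onto its own key with a new StorageClass, which in an unversioned bucket replaces the Glacier object instead of requiring a separate delete. If you copy to a different key, delete the original yourself.

    ```javascript
    // Parse the Restore string returned by s3.headObject().
    // Returns null when the field is absent (no restore has been requested).
    function parseRestore(restore) {
        if (!restore) return null;
        const ongoing = /ongoing-request="(true|false)"/.exec(restore);
        const expiry = /expiry-date="([^"]+)"/.exec(restore);
        return {
            ongoing: ongoing !== null && ongoing[1] === 'true',
            expiry: expiry ? new Date(expiry[1]) : null
        };
    }

    // Copy a restored object to Standard. Resolves to true if the copy was made.
    // `s3` is assumed to be an AWS SDK v2 S3 client (hypothetical wiring).
    async function promoteIfRestored(s3, bucket, key) {
        const head = await s3.headObject({ Bucket: bucket, Key: key }).promise();
        const status = parseRestore(head.Restore);
        if (head.StorageClass !== 'GLACIER' || !status || status.ongoing) {
            return false; // not archived, restore not requested, or still restoring
        }
        // Assumption: same-key copy with a new StorageClass rewrites the object
        // in Standard.
        await s3.copyObject({
            Bucket: bucket,
            Key: key,
            CopySource: encodeURI(`${bucket}/${key}`),
            StorageClass: 'STANDARD'
        }).promise();
        return true;
    }

    // Usage (checks one key every 5 minutes, as in the process described above):
    // const AWS = require('aws-sdk');
    // const s3 = new AWS.S3();
    // setInterval(() => {
    //     promoteIfRestored(s3, process.env.AWS_BUCKET, 'file name goes here')
    //         .catch(console.error);
    // }, 5 * 60 * 1000);
    ```

    Passing the client in as an argument keeps the helper easy to exercise with a stub before pointing it at a real bucket.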