
What does s3fs cache in /tmp?


I'm using s3fs to mount an S3 bucket and upload a lot of files to it. It works fine, except that my local disk usage is also growing a lot (the contents of the /tmp directory).

My command is:

$ su ec2-user -c '/usr/bin/s3fs my-bucket-name -o use_cache=/tmp /home/ec2-user/dir'

I'm using the use_cache parameter, but what is actually cached? Are these files that still need to be uploaded to S3 and are cached on my local machine? Can I just delete them during an upload or while the bucket is mounted? And will my upload go quicker if I turn the cache off (if it's for other purposes)?


Solution

  • From the s3fs wiki (which is a bit hard to find):

    If enabled via "use_cache" option, s3fs automatically maintains a local cache of files in the folder specified by use_cache. Whenever s3fs needs to read or write a file on s3 it first downloads the entire file locally to the folder specified by use_cache and operates on it. When fuse release() is called, s3fs will re-upload the file to s3 if it has been changed. s3fs uses md5 checksums to minimize downloads from s3. Note: this is different from the stat cache (see below).

    Local file caching works by calculating and comparing md5 checksums (ETag HTTP header).

    The folder specified by use_cache is just a local cache. It can be deleted at any time. s3fs re-builds it on demand. Note: this directory grows unbounded and can fill up a file system dependent upon the bucket and reads to that bucket.
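
Since the wiki says the cache can be deleted at any time and is rebuilt on demand, one way to keep it from filling the disk is to prune old entries periodically. The following is only a minimal sketch, not something s3fs ships: `prune_s3fs_cache` is a hypothetical helper name, and it assumes the cached files live in a per-bucket subdirectory such as /tmp/my-bucket-name (check what s3fs actually created under your use_cache path before pointing it anywhere). It also relies on access times, so it will not behave as intended on a file system mounted with noatime.

```shell
#!/bin/sh
# Hypothetical helper (not part of s3fs): delete cached files that have not
# been accessed for more than DAYS days. Per the wiki, s3fs re-builds the
# cache on demand, so a pruned file is simply downloaded again when needed.
prune_s3fs_cache() {
    cache_dir=$1
    days=${2:-7}    # assumed default; pick what suits your disk size

    # Refuse to run when the argument is missing or is not a directory.
    if [ -z "$cache_dir" ] || [ ! -d "$cache_dir" ]; then
        echo "usage: prune_s3fs_cache CACHE_DIR [DAYS]" >&2
        return 1
    fi

    # -atime +N matches files last accessed more than N whole days ago.
    find "$cache_dir" -type f -atime +"$days" -exec rm -f {} +
}
```

For example, `prune_s3fs_cache /tmp/my-bucket-name 7` could be run from a daily cron job. Point it at the bucket's cache directory rather than at /tmp itself, otherwise it would also remove unrelated temporary files.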