Copy/move semantics on FUSE

I have a hash-value database with tags, and I want to implement a FUSE interface for it. Because values are indexed by their hashes, they must be read-only.

The native interface for this database is very simple:

You can download, upload or tag a file.
You can get the set of all defined tags.
You can search for files whose tags match a boolean combination of tags.

FUSE interface semantics are simple:

The database is viewed as a big synthetic directory hierarchy where values are files named by their hashes and tags are directories.
cd-ing into a directory is semantically equivalent to searching for a given tag (naming conventions on paths can be used to implement boolean operations).
Reading a file is semantically equivalent to downloading (part of) a value (FUSE allows stateless reads, so open and close can be no-ops).
Copying/moving a nonexistent file into a given path is equivalent to uploading and tagging it. Copying/moving an existing file into a given path is equivalent to adding new tags.
Any other operation throws an error.
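
To make the "naming conventions on paths" idea concrete, here is a minimal sketch of one possible convention. It is an assumption for illustration, not part of the database: every path component is a tag, components are AND-ed together, and a leading `!` negates a tag. The names `parse_query` and `matches` are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

# (required tags, excluded tags); this representation is an assumption.
Query = Tuple[FrozenSet[str], FrozenSet[str]]


def parse_query(path: str) -> Query:
    """Turn a directory path such as '/music/!live' into a tag query."""
    required, excluded = set(), set()
    for part in path.split("/"):
        if not part:
            continue  # skip empty components from leading or doubled slashes
        if part.startswith("!"):
            excluded.add(part[1:])  # hypothetical convention: '!' means NOT
        else:
            required.add(part)
    return frozenset(required), frozenset(excluded)


def matches(file_tags: Iterable[str], query: Query) -> bool:
    """True if a file with these tags belongs in the queried directory."""
    required, excluded = query
    tags = set(file_tags)
    return required <= tags and not (excluded & tags)
```

With this convention, listing `/music/!live` would show every value tagged `music` but not `live`.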

This FUSE interface is quite usable and allows you to easily embed a tag file system inside a hierarchical one without the need of external tools like TagSpaces or Evernote.

My problem is distinguishing a file copy or move from any other (forbidden) operation through the FUSE interface: there are endless combinations of operations with equivalent semantics.

What is the most reliable way to identify a file copy or move through the FUSE interface?

Solution

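Hooking a rename should be straightforward: implement the FUSE rename() callback. It receives both the old and the new path, so you can see which tag directory the file comes from and which one it is moving into. That said, this only works if the user-space tool renames the file by invoking the rename(2) system call, and rename(2) only succeeds within a single mount: a move from another file system fails with EXDEV, and tools such as mv then fall back to copying the file and deleting the original.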
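
As a rough sketch of what such a rename handler might do, assuming both paths are inside the mounted file system: the code below models the database as a plain dict from hash to tag set and treats every directory component of the new path as a tag to add. The function name, the dict-based store, and the choice of error codes are all assumptions; a real handler would be wired into whichever FUSE binding you use.

```python
import errno
import posixpath
from typing import Dict, Set


def handle_rename(store: Dict[str, Set[str]], old: str, new: str) -> int:
    """Map a rename inside the mount onto 'add tags'.

    Returns 0 on success or a negative errno value, following the
    usual FUSE callback convention.
    """
    _, old_name = posixpath.split(old)
    new_dir, new_name = posixpath.split(new)

    if old_name not in store:
        return -errno.ENOENT  # no value with that hash

    if new_name != old_name:
        # File names are content hashes, so they cannot be changed.
        return -errno.EPERM

    # Moving an existing file only adds tags; the old ones are kept.
    new_tags = {part for part in new_dir.split("/") if part}
    store[old_name] |= new_tags
    return 0
```

For example, with `store = {"abc123": {"music"}}`, renaming `/music/abc123` to `/jazz/abc123` leaves the value tagged with both `music` and `jazz`.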

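Hooking a file copy, on the other hand, is harder: it can't be done directly because there is no FUSE callback for it. The copying tool performs the copy itself with ordinary system calls, so the file system only sees a new file being created, a series of writes, and a final release, which is not directly distinguishable from any other program writing a new file.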

You could try some heuristics that process the incoming FUSE operations to detect a copy or move of an already stored file (e.g. by hashing the content of the new file and comparing it with the existing files), but I'm not sure how much sense that makes in your case or whether it would actually be practical.
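
One way the hashing heuristic could look, as a sketch only: track each newly created file, hash its data as sequential writes arrive, and decide on release whether the content is already stored (add tags) or new (upload and tag). The class below simulates the create/write/release callbacks without any FUSE binding. SHA-256, the dict-based store, the return values of `release`, and the rule that only sequential writes are accepted are all assumptions.

```python
import errno
import hashlib


class CopyDetector:
    def __init__(self, store):
        # Hypothetical backend: digest -> {"data": bytes, "tags": set}
        self.store = store
        # path -> [hasher, buffered data, next expected offset]
        self.pending = {}

    def create(self, path):
        self.pending[path] = [hashlib.sha256(), bytearray(), 0]
        return 0

    def write(self, path, data, offset):
        state = self.pending.get(path)
        if state is None:
            return -errno.EACCES  # stored values are read-only
        hasher, buf, expected = state
        if offset != expected:
            # Only sequential, copy-like writes are accepted.
            return -errno.EINVAL
        hasher.update(data)
        buf.extend(data)
        state[2] = expected + len(data)
        return len(data)

    def release(self, path):
        state = self.pending.pop(path, None)
        if state is None:
            return None  # nothing was being written to this path
        hasher, buf, _ = state
        digest = hasher.hexdigest()
        # Every directory component of the target path is taken as a tag.
        tags = {part for part in path.split("/")[:-1] if part}
        if digest in self.store:
            self.store[digest]["tags"] |= tags  # known content: just tag it
            return ("tagged", digest)
        self.store[digest] = {"data": bytes(buf), "tags": tags}
        return ("uploaded", digest)
```

Note that the decision can only be made at release time, once all data has been seen, and that the name the user chose for the copy is discarded in favour of the hash.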