java macos metadata hfs+ xattr

How to store a hash in extended file attributes on OS X with Java?


Preface: I am working on a platform-independent media database written in Java, in which media files are identified by a file hash. The user should be able to move files around, so I do NOT want to rely on file paths. Once a file is imported, I store its path and hash in my database. I developed a fast file-hash-id algorithm that trades some accuracy for performance, but fast is not always fast enough. :)

To update and import media files, I need to (re)create the hashes of all files in my library. My idea is to calculate the hash just once and store it in the file's metadata (extended attributes), to boost performance on filesystems that support extended file attributes (NTFS, HFS+, ext3...). I have already implemented this; the current source is here: archimedesJ.io.metadata
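A minimal sketch of this compute-once idea, under stated assumptions: `AttributeStore`, `InMemoryAttributeStore` and `CachedHasher` are hypothetical names made up for illustration, the in-memory store merely stands in for real extended attributes, and the hash function is passed in from outside rather than being the actual file-hash-id algorithm. Only the attribute name `vidada.hash` is taken from the script further down.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical abstraction: a place to keep one string value per (file, key). */
interface AttributeStore {
    /** Returns the stored value, or null if the attribute is absent. */
    String read(String path, String key);

    void write(String path, String key, String value);
}

/** In-memory stand-in so the sketch runs without any filesystem support. */
final class InMemoryAttributeStore implements AttributeStore {
    private final Map<String, String> data = new HashMap<>();

    @Override
    public String read(String path, String key) {
        return data.get(path + "\0" + key);
    }

    @Override
    public void write(String path, String key, String value) {
        data.put(path + "\0" + key, value);
    }
}

/** Computes a hash only when no cached value is attached to the file yet. */
final class CachedHasher {
    static final String HASH_KEY = "vidada.hash";

    private final AttributeStore store;
    private final Function<String, String> hashFunction; // assumed: path -> hash
    int computations = 0; // counts how often the expensive hash really ran

    CachedHasher(AttributeStore store, Function<String, String> hashFunction) {
        this.store = store;
        this.hashFunction = hashFunction;
    }

    String hashOf(String path) {
        String cached = store.read(path, HASH_KEY);
        if (cached != null) {
            return cached; // fast path: reuse the stored hash
        }
        String fresh = hashFunction.apply(path);
        computations++;
        store.write(path, HASH_KEY, fresh); // remember it for the next import
        return fresh;
    }
}
```

With a real store behind the interface, a second call to `hashOf` for the same file would only cost one attribute read, which is why the speed of that read matters so much here.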

Attempts: At first glance, Java 1.7's UserDefinedFileAttributeView offers a nice way to handle metadata, and on most platforms it works. Sadly, UserDefinedFileAttributeView does not work on HFS+. I do not understand why HFS+ in particular is not supported, given that it is one of the leading filesystems for metadata. (See the related question, which does not provide any solution.)
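For reference, a sketch of how UserDefinedFileAttributeView is typically used where it is available. The class name `UserAttributes` and its methods are hypothetical helpers, not part of the JDK or of archimedesJ. `Files.getFileAttributeView` returns null when the view is not available, so the helpers report that case instead of failing.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.UserDefinedFileAttributeView;

final class UserAttributes {

    /** Returns the view, or null where the platform does not provide one. */
    static UserDefinedFileAttributeView viewOf(Path file) {
        return Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
    }

    /** Writes the value as UTF-8; returns false if the view is unavailable. */
    static boolean write(Path file, String name, String value) throws IOException {
        UserDefinedFileAttributeView view = viewOf(file);
        if (view == null) {
            return false;
        }
        view.write(name, StandardCharsets.UTF_8.encode(value));
        return true;
    }

    /** Reads the value as UTF-8; returns null if the view or attribute is missing. */
    static String read(Path file, String name) throws IOException {
        UserDefinedFileAttributeView view = viewOf(file);
        if (view == null || !view.list().contains(name)) {
            return null;
        }
        ByteBuffer buffer = ByteBuffer.allocate(view.size(name));
        view.read(name, buffer);
        buffer.flip(); // switch the buffer from writing to reading
        return StandardCharsets.UTF_8.decode(buffer).toString();
    }
}
```

Even when a view object is returned, the underlying filesystem may still reject the operation with an IOException, so callers should be prepared to fall back to recomputing the hash.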

How to store extended file attributes on OS X with Java? To work around this Java limitation, I decided to use the xattr command-line tool present on OS X, invoking it through Java's Process handling and reading its output. My implementation works, but it is very slow. (Recalculating the file hash is faster, how ironic! I am testing on a MacBook Pro Retina with an SSD.)
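The process-based approach looks roughly like the sketch below. This is not the actual archimedesJ code: `XattrCli` and its methods are hypothetical names, and the sketch assumes the `xattr` tool is on the PATH, using its `-p` (print) and `-w` (write) options.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

final class XattrCli {

    /** Command line for printing one attribute: xattr -p <key> <file>. */
    static List<String> readCommand(String file, String key) {
        return Arrays.asList("xattr", "-p", key, file);
    }

    /** Command line for writing one attribute: xattr -w <key> <value> <file>. */
    static List<String> writeCommand(String file, String key, String value) {
        return Arrays.asList("xattr", "-w", key, value, file);
    }

    /** Runs a command; returns its trimmed output, or null on a non-zero exit code. */
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true) // merge stderr so the pipe cannot block
                .start();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            return null;
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8).trim();
    }

    /** Reads one attribute; spawns a new process for every single call. */
    static String readAttribute(String file, String key)
            throws IOException, InterruptedException {
        return run(readCommand(file, key));
    }
}
```

Every call to `readAttribute` starts a separate process, so the cost per file is dominated by process startup rather than by the attribute lookup itself.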

It turned out that the xattr tool itself is quite slow. (Writing is damn slow, but more importantly, reading an attribute is slow too.) To show that this is not a Java issue but the tool itself, I created a simple bash script that runs xattr on several files that have my custom attribute:

FILES=/Users/IsNull/Pictures/*
for f in $FILES
do
  xattr -p vidada.hash "$f"
done

When I run it, the lines appear "fast" one after another, but I would expect the whole output to appear immediately, within milliseconds. A small delay is clearly visible, so I guess the tool is not that fast. Using it from Java adds the overhead of creating a process and parsing the output, which makes it even slower.

Is there a better way to access the extended attributes on HFS+ with Java? What is a fast way to work with the extended attributes on OS X with Java?


Solution

  • I have now created a JNI wrapper that accesses the extended attributes directly through the C API. It is an open-source Java Maven project, available on GitHub/xattrj

    For reference, I post the interesting source pieces here. For the latest sources, please refer to the above project page.

    Xattrj.java

    import java.io.File;
    
    public class Xattrj {
    
        /**
         * Write the extended attribute to the given file
         * @param file
         * @param attrKey
         * @param attrValue
         */
        public void writeAttribute(File file, String attrKey, String attrValue){
            writeAttribute(file.getAbsolutePath(), attrKey, attrValue);
        }
    
        /**
         * Read the extended attribute from the given file
         * @param file
         * @param attrKey
         * @return
         */
        public String readAttribute(File file, String attrKey){
            return readAttribute(file.getAbsolutePath(), attrKey);
        }
    
        /**
         * Write the extended attribute to the given file
         * @param file
         * @param attrKey
         * @param attrValue
         */
        private native void writeAttribute(String file, String attrKey, String attrValue);
    
        /**
         * Read the extended attribute from the given file
         * @param file
         * @param attrKey
         * @return
         */
        private native String readAttribute(String file, String attrKey);
    
    
        static {
            try {
                System.out.println("loading xattrj...");
                LibraryLoader.loadLibrary("xattrj");
                System.out.println("loaded!");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    

    org_securityvision_xattrj_Xattrj.cpp

    #include <jni.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include "org_securityvision_xattrj_Xattrj.h"
    #include <sys/xattr.h>
    
    
    /**
     * writeAttribute
     * writes the extended attribute
     *
     */
    JNIEXPORT void JNICALL Java_org_securityvision_xattrj_Xattrj_writeAttribute
        (JNIEnv *env, jobject jobj, jstring jfilePath, jstring jattrName, jstring jattrValue){
    
        const char *filePath= env->GetStringUTFChars(jfilePath, 0);
        const char *attrName= env->GetStringUTFChars(jattrName, 0);
        const char *attrValue=env->GetStringUTFChars(jattrValue,0);
    
        int res = setxattr(filePath,
                    attrName,
                    (void *)attrValue,
                    strlen(attrValue), 0,  0); //XATTR_NOFOLLOW != 0
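        if(res){
            // an error occurred, perror appends the message for errno
            perror("native:writeAttribute: error on write");
        }
    
        // release the UTF-8 copies handed out by the JVM
        env->ReleaseStringUTFChars(jfilePath, filePath);
        env->ReleaseStringUTFChars(jattrName, attrName);
        env->ReleaseStringUTFChars(jattrValue, attrValue);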
    }
    
    
    /**
     * readAttribute
     * Reads the extended attribute as string
     *
     * If the attribute does not exist (or any other error occurs)
     * a null string is returned.
     *
     *
     */
    JNIEXPORT jstring JNICALL Java_org_securityvision_xattrj_Xattrj_readAttribute
        (JNIEnv *env, jobject jobj, jstring jfilePath, jstring jattrName){
    
        jstring jvalue = NULL;
    
        const char *filePath= env->GetStringUTFChars(jfilePath, 0);
        const char *attrName= env->GetStringUTFChars(jattrName, 0);
    
        // get size of needed buffer
        ssize_t bufferLength = getxattr(filePath, attrName, NULL, 0, 0, 0);
    
        if(bufferLength > 0){
            // make a buffer of sufficient length, plus one byte for the terminator
            char *buffer = (char*)malloc(bufferLength + 1);
    
            if(buffer != NULL){
                // now actually get the attribute string
                ssize_t s = getxattr(filePath, attrName, buffer, bufferLength, 0, 0);
    
                if(s > 0){
                    // null terminate the value and convert it to a java string
                    buffer[s] = 0;
                    jvalue = env->NewStringUTF(buffer);
                }
                free(buffer);
            }
        }
    
        // release the UTF-8 copies handed out by the JVM
        env->ReleaseStringUTFChars(jfilePath, filePath);
        env->ReleaseStringUTFChars(jattrName, attrName);
        return jvalue;
    }
    

    Now the makefile, which took me quite some effort to get working:

    CC=gcc
    LDFLAGS= -fPIC -bundle
    CFLAGS= -c -shared -I/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers -m64
    
    
    SOURCES_DIR=src/main/c++
    OBJECTS_DIR=target/c++
    EXECUTABLE=target/classes/libxattrj.dylib
    
    SOURCES=$(shell find '$(SOURCES_DIR)' -type f -name '*.cpp')
    OBJECTS=$(SOURCES:$(SOURCES_DIR)/%.cpp=$(OBJECTS_DIR)/%.o)
    
    all: $(EXECUTABLE)
    
    $(EXECUTABLE): $(OBJECTS)
        $(CC) $(LDFLAGS) $(OBJECTS) -o $@
    
    # pattern rule: build each object from its own source file
    $(OBJECTS_DIR)/%.o: $(SOURCES_DIR)/%.cpp
        mkdir -p $(dir $@)
        $(CC) $(CFLAGS) $< -o $@
    
    
    
    clean:
        rm -rf $(OBJECTS_DIR) $(EXECUTABLE)
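
Since the makefile writes the library into target/classes, it ends up on the classpath. A loader can then copy it to a temporary file and load it from there. The sketch below is an assumption about how such a loader could work, not the project's actual LibraryLoader: `ClasspathLibraryLoader` and its methods are hypothetical, and the library is assumed to sit at the classpath root under the platform's default name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class ClasspathLibraryLoader {

    /** Maps "xattrj" to the platform file name at the classpath root. */
    static String resourceNameFor(String libraryName) {
        // System.mapLibraryName adds the platform prefix and suffix,
        // typically "lib" + name + ".dylib" on OS X
        return "/" + System.mapLibraryName(libraryName);
    }

    /** Copies the library from the classpath into a temporary file. */
    static Path extract(String libraryName) throws IOException {
        String resource = resourceNameFor(libraryName);
        try (InputStream in = ClasspathLibraryLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Native library not on classpath: " + resource);
            }
            Path target = Files.createTempFile("native-", "-" + System.mapLibraryName(libraryName));
            target.toFile().deleteOnExit();
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        }
    }

    /** Extracts the library and loads it by absolute path. */
    static void load(String libraryName) throws IOException {
        System.load(extract(libraryName).toAbsolutePath().toString());
    }
}
```

Loading by absolute path with `System.load` avoids having to set `java.library.path`, at the cost of one file copy per JVM start.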