java ocr alfresco command-line-tool

Alfresco File Storage (alf_data)


I am in a situation where I need to run a command-line tool on a file that has been uploaded to the Alfresco repository. The reason is that I need to perform OCR on that particular file.

I know I can use the transformations that Alfresco provides by default. But a transformation does not support conversion between the same mimetype, and my requirement is to perform OCR on a PDF file (which contains images) and generate a PDF file again (which contains the extracted data).

My approach is to create a policy that fires when a node is uploaded to the Alfresco repository. From that policy I will access the uploaded node using Java. Here is the problem: I don't know where under the alf_data directory the file is stored, and I need the physical location of the file.

By the way, I am using a Linux system.

Can anyone help with this?


Solution

  • Rather than locating the file under alf_data yourself, use the ContentService: call getReader(NodeRef, QName) to obtain a ContentReader, then call getContent(File) on it to copy the content to a temporary file.

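    Your code would then be something like this:

    // Copy the node's content out of the content store into a temporary file
    File tmp = File.createTempFile("for-ocr", ".tmp");
    try {
        ContentReader reader = contentService.getReader(nodeRef, ContentModel.PROP_CONTENT);
        reader.getContent(tmp);
        // Run the OCR program on tmp here
    } finally {
        tmp.delete(); // always clean up, even if the OCR step fails
    }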
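
For the "run the OCR program" step, a minimal sketch using only the JDK's ProcessBuilder might look like the following. The class name OcrRunner, the tool name "ocr-tool", and its "<tool> <input> <output>" argument order are all assumptions for illustration; substitute whatever your OCR program actually expects.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Alfresco: runs an external OCR tool on a file.
class OcrRunner {

    // Builds the argument list "<tool> <input> <output>".
    // This argument order is an assumption; adjust it to your OCR program.
    static List<String> buildCommand(String tool, File input, File output) {
        return List.of(tool, input.getAbsolutePath(), output.getAbsolutePath());
    }

    // Starts the command, waits up to timeoutSeconds, and returns the exit code.
    static int run(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)                        // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // discard output so the pipe cannot fill up
                .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();                            // kill a tool that hangs
            throw new IOException("OCR command timed out: " + command);
        }
        return process.exitValue();
    }
}
```

You would call something like OcrRunner.run(OcrRunner.buildCommand("ocr-tool", tmp, out), 300) with a second temporary file as out, and treat a non-zero exit code as a failure. If the tool succeeds, the result can probably be stored back on the node with contentService.getWriter(nodeRef, ContentModel.PROP_CONTENT, true) followed by putContent(out) on the returned ContentWriter.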