How to empty the DSpace assetstore?


For testing purposes I have made a clone of a DSpace 5.5 server.
To save disk space on this clone I removed a collection that contained several thousand items.
After this the assetstore directory is still very full, although only one collection is left, and it contains a single item.
How can I remove the files that belonged to the deleted collection's items from the assetstore?


Solution

  • The dspace cleanup command line script removes deleted bitstreams from the assetstore.

    https://wiki.duraspace.org/display/DSDOC5x/Storage+Layer#StorageLayer-Cleanup

    dspace/bin/dspace cleanup -h
    usage: Cleanup
     -h,--help      Help
     -l,--leave     Leave database records but delete file from assetstore
     -v,--verbose   Provide verbose output
    

    Edit (May 19): If you have a massive number of deleted bitstreams, the cleanup command can take a long time to complete. There is another way, which deletes the files directly and leaves the database records in place:

    $ psql -At -c "select internal_id from bitstream where deleted=true" > deleted_bitstreams
    $ while read -r internal_id; do rm "$HOME/dspace/assetstore/${internal_id:0:2}/${internal_id:2:2}/${internal_id:4:2}/$internal_id"; done < deleted_bitstreams
    

    You can check that the paths are correct by first running the loop with ls instead of rm.
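
The same idea can be wrapped in a small bash helper that does a dry run by default. This is only a sketch, not part of DSpace: the function names `assetstore_path` and `purge_deleted` and the `--delete` switch are made up for illustration. It assumes the list file holds one internal ID per line and that files are laid out as in the loop above, in three two-character directory levels taken from the start of the ID.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of DSpace); requires bash for ${var:offset:length}.

# Print the assetstore path for one internal ID, using the same
# 2/2/2-character directory layout as the loop above.
assetstore_path() {
    local root=$1 id=$2
    printf '%s/%s/%s/%s/%s\n' "$root" "${id:0:2}" "${id:2:2}" "${id:4:2}" "$id"
}

# Read internal IDs from a list file and report (default) or remove
# (with --delete) the matching files under the assetstore root.
purge_deleted() {
    local root=$1 list=$2 mode=${3:-}
    local id path
    while IFS= read -r id; do
        id=${id//[[:space:]]/}          # drop stray padding around the ID
        [ -n "$id" ] || continue        # skip empty lines
        path=$(assetstore_path "$root" "$id")
        if [ ! -f "$path" ]; then
            echo "missing: $path" >&2   # nothing to remove for this ID
        elif [ "$mode" = "--delete" ]; then
            rm -- "$path"
        else
            echo "would remove: $path"
        fi
    done < "$list"
}
```

After sourcing the functions, `purge_deleted "$HOME/dspace/assetstore" deleted_bitstreams` only prints what would be removed; adding `--delete` as a third argument removes the files. IDs with no matching file are reported on stderr and otherwise ignored.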