Find not working for GridFS after updating metadata

I added a step in my application to persist files via GridFS and added a metadata field called "processed" to work as a flag for a scheduled task that retrieves new files and sends them on for processing. Since the Java driver for GridFS doesn't have a method for updating metadata, I used a MongoCollection for the "fs.files" collection to update "metadata.processed" to true.

I use GridFSBucket.find(eq("metadata.processed", false)) to get the new files for processing and then update metadata.processed to true once processing is completed. This works if I add a new file while the application is running. However, if I have an existing file with "metadata.processed" set to false and start the application, the above find call returns no results. Similarly, if I have a file that was already processed and I set the "metadata.processed" field back to false, the above find call also returns no results.

private static final String FILTER_STR = "'{'\"filename\" : \"{0}\"'}'";

private static final String UPDATE_STR =
        "'{'\"$set\": '{'\"metadata.processed\": \"{0}\"'}}'";

private GridFSBucketFactory gridFSBucketFactory;

private MongoCollectionFactory mongoCollectionFactory;

public void storeFile(String filename, DateTime publishTime,
        InputStream inputStream) {

    if (fileExists(filename)) {
        // Logger calls restored; the name LOGGER and the level are assumed.
        LOGGER.info("File named {} already exists.", filename);
    } else {
        uploadToGridFS(filename, publishTime, inputStream);
        LOGGER.info("Stored file named {}.", filename);
    }
}

public GridFSDownloadStream getFile(BsonValue id) {
    return gridFSBucketFactory.getGridFSBucket().openDownloadStream(id);
}

public GridFSDownloadStream getFile(String filename) {
    final GridFSFile file = getGridFSFile(filename);
    return file == null ? null : getFile(file.getId());
}

public GridFSFindIterable getUnprocessedFiles() {
    return gridFSBucketFactory.getGridFSBucket()
            .find(eq("metadata.processed", false));
}

public void setProcessed(String filename, boolean isProcessed) {
    final BasicDBObject filter =
            BasicDBObject.parse(format(FILTER_STR, filename));
    final BasicDBObject update =
            BasicDBObject.parse(format(UPDATE_STR, isProcessed));
    if (updateOne(filter, update)) {
        // Logger call restored; the name LOGGER and the level are assumed.
        LOGGER.info("Set metadata for {} to {}", filename, isProcessed);
    }
}

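private void uploadToGridFS(String filename, DateTime publishTime,
        InputStream inputStream) {
    // Start of the call restored; uploadFromStream is assumed from the arguments.
    gridFSBucketFactory.getGridFSBucket().uploadFromStream(filename,
            inputStream, createMetadata(publishTime));
}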

private GridFSUploadOptions createMetadata(DateTime publishTime) {
    final Document metadata = new Document();
    metadata.put("processed", false);
    // metadata.put("publishTime", publishTime.toString());
    return new GridFSUploadOptions().metadata(metadata);
}

private boolean fileExists(String filename) {
    return getGridFSFile(filename) != null;
}

private GridFSFile getGridFSFile(String filename) {
    return gridFSBucketFactory.getGridFSBucket()
            .find(eq("filename", filename)).first();
}

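private boolean updateOne(BasicDBObject filter, BasicDBObject update) {

    try {
        // Start of this call was lost from the post; getMongoCollection() is an
        // assumed accessor for the "fs.files" collection.
        mongoCollectionFactory.getMongoCollection().updateOne(filter,
                update, new UpdateOptions().upsert(true));
    } catch (final MongoException e) {
        // Logger call restored; the name LOGGER and the level are assumed.
        LOGGER.error(
                "The following failed to update, filter:{0} update:{1}",
                filter, update, e);
        return false;
    }
    return true;
}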

Any idea what I can do to ensure:

GridFSBucket.find(eq("metadata.processed", false))

returns the proper results for existing files and/or files that have had the metadata changed?


  • The issue was that I was setting the metadata.processed value as a String rather than a boolean.

    When initially creating the metadata I set its value with a boolean:

    private GridFSUploadOptions createMetadata(DateTime publishTime) {
        final Document metadata = new Document();
        metadata.put("processed", false);
        // metadata.put("publishTime", publishTime.toString());
        return new GridFSUploadOptions().metadata(metadata);
    }

    And later I check for a boolean:

    public GridFSFindIterable getUnprocessedFiles() {
        return gridFSBucketFactory.getGridFSBucket()
            .find(eq("metadata.processed", false));
    }

    But when updating the metadata using the "fs.files" MongoCollection I incorrectly added quotes around the boolean value here:

    private static final String UPDATE_STR =
        "'{'\"$set\": '{'\"metadata.processed\": \"{0}\"'}}'";

    This caused the metadata value to be saved as a String (for example "false") rather than a boolean, so the boolean filter in the find call no longer matched it.
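To make the difference concrete, here is a minimal sketch that uses only java.text.MessageFormat (no MongoDB driver) to print the JSON each pattern produces. The class name UpdateJsonDemo, the constant names, and the buildUpdate helper are made up for illustration; only the quoted pattern comes from the post, and the unquoted variant is one possible fix, assuming the rest of the update code stays as it is.

```java
import java.text.MessageFormat;

// Hypothetical demo class: shows what each update pattern expands to.
final class UpdateJsonDemo {

    // Pattern from the question: the quotes around {0} make the value a JSON string.
    static final String QUOTED_UPDATE_STR =
            "'{'\"$set\": '{'\"metadata.processed\": \"{0}\"'}}'";

    // Same pattern without the quotes: the value becomes a JSON boolean literal.
    static final String UNQUOTED_UPDATE_STR =
            "'{'\"$set\": '{'\"metadata.processed\": {0}'}}'";

    // Expands the given pattern with the flag, as format(UPDATE_STR, isProcessed) does.
    static String buildUpdate(String pattern, boolean isProcessed) {
        return MessageFormat.format(pattern, isProcessed);
    }

    public static void main(String[] args) {
        // Value is a string: a filter on boolean false will not match it.
        System.out.println(buildUpdate(QUOTED_UPDATE_STR, false));
        // Value is a boolean: matches eq("metadata.processed", false).
        System.out.println(buildUpdate(UNQUOTED_UPDATE_STR, false));
    }
}
```

Another option that avoids string formatting altogether is to build the filter and update with the driver's typed helpers, for example Filters.eq("filename", filename) and Updates.set("metadata.processed", isProcessed), so the boolean is passed through as a boolean.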