java, mongodb, apache-spark, hadoop-streaming

mongo-hadoop connector: how to query data


I'm using the mongo-hadoop connector in a Java Spark application. I can already read from MongoDB with this configuration:

Configuration mongodbConfig = new Configuration();
mongodbConfig.set("mongo.job.input.format", "com.mongodb.hadoop.MongoInputFormat");
mongodbConfig.set("mongo.input.uri", "mongodb://localhost:27017/MyCollectionName.collection");

What can I add to query the data, for example to apply something like .limit(100000)?


Solution

  • You can set additional parameters on the configuration, for example:

    mongodbConfig.set("mongo.input.query", "{'field':'value'}");
    

    See https://github.com/mongodb/mongo-hadoop/wiki/Configuration-Reference for more details.
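
The question also asks about a limit, which the query filter alone does not cover. Here is a minimal sketch of collecting the input settings in one place before copying them onto the Hadoop Configuration. The class name MongoInputSettings and the build helper are made up for illustration. The "mongo.input.limit" key is my assumption for the connector's limit setting, so please confirm it (and how it behaves across input splits) against the Configuration Reference linked above. The sketch uses a plain Map so it runs without the Hadoop jars.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of mongo-hadoop: gathers the connector's
// input settings as plain key/value pairs.
class MongoInputSettings {

    static Map<String, String> build(String uri, String queryJson, int limit) {
        Map<String, String> settings = new LinkedHashMap<>();
        // Keys taken from the question and the answer above.
        settings.put("mongo.job.input.format", "com.mongodb.hadoop.MongoInputFormat");
        settings.put("mongo.input.uri", uri);
        settings.put("mongo.input.query", queryJson);
        // Assumption: "mongo.input.limit" is the limit key; verify it in the
        // Configuration Reference before relying on it.
        if (limit > 0) {
            settings.put("mongo.input.limit", Integer.toString(limit));
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> settings = build(
                "mongodb://localhost:27017/MyCollectionName.collection",
                "{'field':'value'}",
                100000);
        // In the Spark job you would copy these onto the Configuration:
        //   settings.forEach(mongodbConfig::set);
        settings.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Passing 0 (or a negative number) as the limit leaves the limit key unset, so the connector's default applies.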