Hadoop with MongoDB storage

I have a project to use a NoSQL database with Hadoop and benchmark it. I chose MongoDB as the database, but I am confused about a few things and have some questions:

Will MongoDB be replacing HDFS or will they be working together and how?
Is benchmarking MongoDB alone different from benchmarking it with Hadoop? I feel like they are the same thing.
I found the YCSB tool for benchmarking. Can it benchmark them together?
I know that MongoDB can run on a cluster. When Mongo is on top of Hadoop, will the data be shared among nodes by MongoDB or by Hadoop?

I hope you can clarify these concepts. Thank you in advance.

Solution

Will MongoDB be replacing HDFS

Absolutely not. HDFS is not meant to be used as a database, and Mongo is not a distributed filesystem capable of storing petabytes of arbitrary data.

will they be working together and how?

Hive and Spark can read data from Mongo directly. I'm sure there are other tools that can back up Mongo into HDFS.
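As a rough illustration of the Spark route, here is a minimal PySpark sketch. It assumes the MongoDB Spark Connector 10.x is on the Spark classpath, which registers the "mongodb" data source and reads the option keys shown; older connector versions use different names. The URI, database, and collection are hypothetical placeholders, and both helper functions are made up for this example.

```python
def mongo_read_options(uri, database, collection):
    """Build reader options (key names assume MongoDB Spark Connector 10.x)."""
    return {
        "connection.uri": uri,
        "database": database,
        "collection": collection,
    }


def load_collection(spark, uri, database, collection):
    """Return a Spark DataFrame backed by a MongoDB collection.

    `spark` is an existing SparkSession started with the connector package.
    """
    reader = spark.read.format("mongodb")
    for key, value in mongo_read_options(uri, database, collection).items():
        reader = reader.option(key, value)
    return reader.load()


# Hypothetical usage, once a SparkSession named `spark` exists:
# df = load_collection(spark, "mongodb://localhost:27017", "benchdb", "usertable")
# df.write.parquet("hdfs:///backups/usertable")  # one way to copy Mongo data into HDFS
```

In this arrangement MongoDB still stores and distributes the data; Spark only pulls it in for processing and, optionally, writes a copy to HDFS.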

Is benchmarking MongoDB alone different from doing it with Hadoop

Yes. Tuning reads and writes for MongoDB involves vastly different parameters than tuning HDFS, because HDFS is not a database.

YCSB tool for benchmarking

It's not clear what you're benchmarking in Hadoop. Writing and reading a bunch of files (with and without MapReduce)? Seeing how many jobs run in YARN at a given time? Again, Hadoop isn't a database meant to store simple JSON blobs.
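For context, YCSB drives a mix of keyed reads and updates against a database and reports throughput and latency, which is why it fits MongoDB and not HDFS or YARN. The sketch below only imitates that idea in plain Python; it is not YCSB itself. The function name, the 50/50 read/update mix, and the dict used as a stand-in store are all assumptions for illustration.

```python
import random
import time


def run_workload(read, update, keys, op_count=1000, read_ratio=0.5, seed=0):
    """Run a read/update mix against a key-value interface and time it."""
    rng = random.Random(seed)  # fixed seed keeps the operation mix repeatable
    reads = updates = 0
    start = time.perf_counter()
    for _ in range(op_count):
        key = rng.choice(keys)
        if rng.random() < read_ratio:
            read(key)
            reads += 1
        else:
            update(key, {"field0": rng.random()})
            updates += 1
    elapsed = time.perf_counter() - start
    return {
        "reads": reads,
        "updates": updates,
        "ops_per_sec": op_count / elapsed if elapsed > 0 else float("inf"),
    }


# Stand-in store: a plain dict. For MongoDB, `read` and `update` would wrap
# pymongo's collection.find_one and collection.update_one instead.
store = {f"user{i}": {"field0": 0.0} for i in range(100)}
result = run_workload(store.get, store.__setitem__, list(store))
print(result)
```

Nothing in this loop maps onto Hadoop: there is no per-key read or update to time in HDFS, so a Hadoop benchmark would have to measure something else, such as file throughput or job completion time.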

When Mongo is on top of Hadoop, will the data be shared among nodes by MongoDB or by Hadoop?

I've never heard of this, but maybe indices are stored by Mongo and the raw data is served by HDFS?