
How to extract specific fields from a MongoDB collection


I have 2250 records in my mongo collection.

Below is one record from that collection:

"_id" : ObjectId("57e57e3fb04c6373f7000002"),
"message" : "<logentry   revision='15234'><author>447085</author><date>2016-07-19T12:39:19.707782Z</date><paths><path   prop-mods='false'   text-mods='true'   kind='file'   action='M'>/itdp/branches/itdpux/branches/base/itdp2.0/src/com/cts/race/beans/ProgramChronicleBean.java</path></paths><msg>day week month function addition </msg></logentry>",
    "@version" : "1",
    "@timestamp" : ISODate("2015-09-23T19:10:54.824Z"),
    "path" : "C:/DevInsight/svnpredictor/svn/svn.log",
    "host" : "WIN-5BRSCLOQIVN",
    "type" : "XML",
    "author" : "447085",
    "revision" : "15234",
    "date" : "2016-07-19T12:39:19.707782Z",
    "paths" : { "path" : [ 
            {   "action" : "M",
                "kind" : "file",
                "prop-mods" : "false",
                "text-mods" : "true",
                "content" : "/itdp/branches/itdpux/branches/base/itdp2.0/src/com/cts/race/beans/ProgramChronicleBean.java"
            } ] }

I want to extract the revision and content fields of the records within a certain date range. The mongoexport output should be written to a CSV file with the column headers revision_id,file_name. I have tried the command below:

C:\mongodb\bin\mongoexport --db dbname --collection cname -f 'revision,paths.path.content' --query "{'date': { '$lt': {'$date' : ISODate('%1')} , '$gte': {'$date': ISODate('%2') }}}"  --out "C:\test\mongodata.csv"

The command above gives me this output:

{"_id":{"$oid":"57e57e3fb04c6373f7000003"},"paths":{"path":[{ ///whole paths tag content/// }]}

The output I expect is:

revision_id,file_name
15234,/itdp/branches/itdpux/branches/base/itdp2.0/web/xhtml/progchronicle_iux.xhtml

One good thing about the command is that I am able to extract all the records within the date range I specified.

Kindly check my command and help me.


Solution

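  • Your paths.path element is an array, so you have to address the element by its index (paths.path.0.content). Note also that mongoexport writes JSON unless you pass --type=csv, and that the Windows command prompt does not treat single quotes as quoting characters, so the field list should be in double quotes:

    C:\mongodb\bin\mongoexport --db dbname --collection cname --type=csv -f "revision,paths.path.0.content" --query "{'date': { '$lt': {'$date' : ISODate('%1')} , '$gte': {'$date': ISODate('%2') }}}"  --out "C:\test\mongodata.csv"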
    

    Obviously, this only exports the first element, so it falls short if paths.path is an array of variable length. In that case you have to write your own script that loops over the array elements.
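
    The loop mentioned above could look roughly like the following Python sketch. It is not part of the original answer: the function names rows_for_export and write_csv are hypothetical, and it assumes the documents have already been fetched as plain dictionaries shaped like the record in the question. It also assumes, defensively, that a record with a single path might store it as a sub-document instead of a one-element array.

```python
import csv
import sys


def rows_for_export(docs):
    """Yield one (revision, content) pair for every path entry of every document."""
    for doc in docs:
        entries = doc.get("paths", {}).get("path", [])
        # Assumption: a lone path may be stored as a sub-document, not a list.
        if isinstance(entries, dict):
            entries = [entries]
        for entry in entries:
            yield doc.get("revision"), entry.get("content")


def write_csv(docs, out):
    """Write the header row and one CSV line per path entry to the stream `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["revision_id", "file_name"])
    writer.writerows(rows_for_export(docs))


if __name__ == "__main__":
    # Made-up sample documents in the shape shown in the question.
    sample = [
        {
            "revision": "15234",
            "paths": {"path": [{"content": "/src/A.java"}, {"content": "/src/B.java"}]},
        },
    ]
    write_csv(sample, sys.stdout)
```

    The docs argument can be any iterable of dictionaries, for example a pymongo cursor returned by find() with a filter on the date field. The connection and the query are left out here because they depend on your setup.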