json twitter4j mongodb-java

Nested JSON in MongoDB Java


I have more than 1000 documents in JSON format (tweets), and I have to extract the hashtags from them. I am reading the documents through the mongodb-java driver.

entities=Document{
  {
    urls=[

    ],
    hashtags=[
      Document{
        {
          indices=[
            89,
            104
          ],
          text=Hungry4Science
        }
      },
      Document{
        {
          indices=[
            105,
            112
          ],
          text=ASCO16
        }
      }
    ]}}

I have to get the text of each hashtag from this structure, and then I will insert it into my Mongo collection. Each tweet has a hashtags entity, but I can't read the lower-level objects.

        Document hash = (Document)old_status.get("entities");
        new_status.append("hastags", hash.get("hashtags"));

Instead of getting just the text, I got the whole list of documents as my output:

hashtags=[
  Document{
    {
      indices=[
        73,
        80
      ],
      text=cancer
    }
  },
  Document{
    {
      indices=[
        81,
        90
      ],
      text=moonshot
    }
  },
  Document{
    {
      indices=[
        125,
        133
      ],
      text=pallonc
    }
  }
]

I tried it as shown above, but no luck. Any help, please?


Solution

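    import java.util.ArrayList;
    import java.util.List;
    import org.bson.Document;

    // "entities" is a nested Document; its "hashtags" field is a list of Documents.
    Document entity = (Document) old_status.get("entities");
    List<Document> hashlist = (List<Document>) entity.get("hashtags");
    List<String> hashtaglist = new ArrayList<String>();
    for (Document hashtag : hashlist) {
        // Keep only the "text" field of each hashtag.
        hashtaglist.add(hashtag.getString("text"));
    }
    new_status.append("hashtags", hashtaglist);
    collection.insertOne(new_status);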
    

This code reads the text field of every hashtag, collects the values into an ArrayList, and stores that list under the hashtags key of the new document before inserting it.
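
If some tweets might lack the entities or hashtags field, the casts above would fail with a NullPointerException or ClassCastException. The following is a minimal, defensive sketch of the same traversal. It is written against Map<String, Object> rather than org.bson.Document so that it runs without the driver on the classpath; since Document implements Map<String, Object>, a Document such as old_status can be passed in directly. The class name HashtagExtractor and the method extractHashtagTexts are hypothetical names chosen for illustration, not part of the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the MongoDB driver.
final class HashtagExtractor {

    // Returns the "text" value of every entry under entities.hashtags,
    // or an empty list when either level is missing or has another type.
    static List<String> extractHashtagTexts(Map<String, Object> status) {
        List<String> texts = new ArrayList<>();
        Object entities = status.get("entities");
        if (!(entities instanceof Map)) {
            return texts;
        }
        Object hashtags = ((Map<?, ?>) entities).get("hashtags");
        if (!(hashtags instanceof List)) {
            return texts;
        }
        for (Object item : (List<?>) hashtags) {
            if (item instanceof Map) {
                Object text = ((Map<?, ?>) item).get("text");
                if (text instanceof String) {
                    texts.add((String) text);
                }
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        // Rebuild the structure from the question with plain maps and lists.
        Map<String, Object> tag1 = new HashMap<>();
        tag1.put("indices", Arrays.asList(89, 104));
        tag1.put("text", "Hungry4Science");
        Map<String, Object> tag2 = new HashMap<>();
        tag2.put("indices", Arrays.asList(105, 112));
        tag2.put("text", "ASCO16");

        Map<String, Object> entities = new HashMap<>();
        entities.put("urls", new ArrayList<Object>());
        entities.put("hashtags", Arrays.asList(tag1, tag2));

        Map<String, Object> status = new HashMap<>();
        status.put("entities", entities);

        System.out.println(extractHashtagTexts(status)); // prints [Hungry4Science, ASCO16]
    }
}
```

With the driver available, the result could be stored the same way as in the solution, for example new_status.append("hashtags", HashtagExtractor.extractHashtagTexts(old_status)).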