Search code examples
mongodbazureazure-cosmosdbazure-synapse

Azure MongoDB + Synapse Link: Column values unexpectedly contain JSON


I'm following this tutorial and have created an Azure MongoDB (Mongo API) and a Synapse Workspace, imported the ECDC data into MongoDB and connected it in Synapse Workspace. So far so good. However, when I query the data, for e.g. string column date_rep I get {"string":"2020-12-14"} instead of just 2020-12-14

The query I'm using is:

SELECT TOP 10 *
FROM OPENROWSET(​PROVIDER = 'CosmosDB',
                CONNECTION = 'Account=analytcstest;Database=ecdc',
                OBJECT = 'ecds',
                SERVER_CREDENTIAL = 'analytcstest'
)  with ( date_rep varchar(200) ) as rows

When don't specify the "with" clause to automatically infer the schema, I have the same problem:

SELECT TOP 10 *
FROM OPENROWSET(​PROVIDER = 'CosmosDB',
                CONNECTION = 'Account=analytcstest;Database=ecdc',
                OBJECT = 'ecds',
                SERVER_CREDENTIAL = 'analytcstest'
)  as rows

screenshot query

I could parse it of course, like this, but I don't understand why I have to do that and it's not in the docs?

SELECT TOP 10 JSON_VALUE([date_rep], '$.string') AS [date_rep]
FROM OPENROWSET(​PROVIDER = 'CosmosDB',
                CONNECTION = 'Account=analytcstest;Database=ecdc',
                OBJECT = 'ecds',
                SERVER_CREDENTIAL = 'analytcstest'
)  with ( date_rep varchar(200) ) as rows

Solution

  • I tried to reproduce similar thing in my environment and getting similar output A value with data type when I used the CosmosDB for Mongo API:

    enter image description here

    this can be because of the The Azure Cosmos DB API for MongoDB stores data in a document structure, via BSON format. It is a bin­ary-en­coded seri­al­iz­a­tion of JSON documents. BSON has been extended to add some optional non-JSON-native data types, like dates and binary data.

    The Cosmos DB API for MongoDB is meant for MongoDB experience and continue to use your favorite MongoDB drivers, SDKs, and tools by pointing your application to the API for MongoDB account's connection string.

    When I tried similar thing with the CosmosDB for SQL API it stores data in Json format, and it is giving appropriate result for that.

    enter image description here

    For more information on BSON refer this Document