I got the connectors from https://cloud.google.com/hadoop/datastore-connector, and I'm trying to add the datastore-connector (and the bigquery-connector too) as a dependency in my pom, but I don't know if this is possible. I could not find the right artifactId and groupId.
Is there a Maven repository that contains the datastore-connector?
Furthermore, I am looking for the source of the datastore-connector, but I couldn't find it. According to the notes in CHANGES.txt, it seems to come from:
https://github.com/GoogleCloudPlatform/bigdata-interop
The source should be in the package com.google.cloud.hadoop.io.datastore (src/main/***/com/google/cloud/hadoop/io/datastore/), but it's not there.
In fact, the source of the bigquery-connector appears to be on GitHub along with its pom, but is the source of the datastore-connector available?
What David says in the other answer is correct. To elaborate, under the hood the connector uses the Protocol Buffers SDK and relies on, for example, the QuerySplitter to define splits. In the near future, we will post more information to gcp-hadoop-announce with further guidance on the future of the Datastore connector for Hadoop.
You may want to familiarize yourself with other Datastore features that may suit your purposes better, including Datastore backup to GCS and this codelab, which walks through an AppEngine-friendly approach to extracting data from Datastore and loading it into BigQuery for analysis. At the top of that page, you may notice an announcement of trusted-tester availability for direct backend loading of Datastore backups into BigQuery.