Tags: csv, google-cloud-platform, google-cloud-datastore, google-cloud-storage, gcsfuse

How to read a CSV file from Google Cloud Storage and load it into Google Cloud Datastore


Can you please let me know how to load a CSV file from Google Cloud Storage into Cloud Datastore?

I have written Java code on App Engine that loads one row per invocation. I would like some sample code that reads the CSV file and bulk-loads all of its rows into Datastore in a single run.


Solution

  • Can you please let me know how to load a CSV file from Google Cloud Storage into Cloud Datastore?

    There are two approaches you could use to read a file from Google Cloud Storage and load it into your Cloud Datastore project.

    Using Apache Beam

    As mentioned in this similar post, you could use Apache Beam to read the CSV file with the TextIO class.

    Next, apply a transform that parses each row of the CSV file and returns an Entity object.

    In the post you will find an example of how to construct an Entity object based on a CSV file.

    Lastly, write the Entity objects to Cloud Datastore.
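
    As a rough illustration of the row-parsing step, here is a minimal sketch in plain Java, without the Beam or Datastore libraries. The class and method names (CsvRowParser, splitLine, toProperties) are hypothetical, and the sketch assumes comma-separated rows with optional double-quoted fields that never span more than one line. In a pipeline, logic like this would sit inside the transform, and the resulting name/value pairs would then be turned into an Entity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: plain-Java parsing of one CSV row at a time.
final class CsvRowParser {

    /** Splits one CSV line into fields, honoring double quotes and "" escapes. */
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    /** Pairs each header name with the matching value of the row, in column order. */
    static Map<String, String> toProperties(List<String> header, String line) {
        List<String> values = splitLine(line);
        if (values.size() != header.size()) {
            throw new IllegalArgumentException(
                "Expected " + header.size() + " fields but found " + values.size());
        }
        Map<String, String> properties = new LinkedHashMap<>();
        for (int i = 0; i < header.size(); i++) {
            properties.put(header.get(i), values.get(i));
        }
        return properties;
    }
}
```

    For example, with the header row id,name,age, the row 1,"Doe, Jane",42 yields the pairs id=1, name=Doe, Jane and age=42. Rejecting rows whose field count differs from the header is a design choice of this sketch; a real pipeline might instead route such rows to an error output.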

    Using Dataflow

    Alternatively, you could use Dataflow. Google provides a set of open-source Dataflow templates, and the Cloud Storage Text to Datastore template fits this use case.

    The Cloud Storage Text to Datastore template is a batch pipeline that reads text files stored in Cloud Storage and writes JSON-encoded Entities to Datastore. Each line in the input text files should be in the JSON format specified in https://cloud.google.com/datastore/docs/reference/rest/v1/Entity . A CSV file would therefore need to be converted to one JSON-encoded Entity per line before the template can load it.
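
    To give an idea of what one such input line could look like, here is a minimal sketch in plain Java that builds a JSON-encoded Entity from the values of one row. The class and method names (EntityJsonLine, toEntityJson), the project ID my-project and the kind Person are hypothetical. The sketch also assumes that every column is stored as a stringValue and that one column supplies the key name; check the Entity reference linked above before relying on the exact shape.

```java
import java.util.StringJoiner;

// Hypothetical helper: encodes one row as a single-line JSON Entity.
final class EntityJsonLine {

    /** Escapes quotes, backslashes and control characters for a JSON string. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /**
     * Builds one JSON line with a key (project, kind, name) and one
     * stringValue property per column. names and values must have equal length.
     */
    static String toEntityJson(String projectId, String kind, String keyName,
                               String[] names, String[] values) {
        if (names.length != values.length) {
            throw new IllegalArgumentException("names and values differ in length");
        }
        StringJoiner properties = new StringJoiner(",", "{", "}");
        for (int i = 0; i < names.length; i++) {
            properties.add("\"" + escape(names[i]) + "\":{\"stringValue\":\""
                + escape(values[i]) + "\"}");
        }
        return "{\"key\":{\"partitionId\":{\"projectId\":\"" + escape(projectId) + "\"},"
            + "\"path\":[{\"kind\":\"" + escape(kind) + "\",\"name\":\""
            + escape(keyName) + "\"}]},"
            + "\"properties\":" + properties + "}";
    }

    public static void main(String[] args) {
        // Prints one line suitable for a text file with one Entity per line.
        System.out.println(toEntityJson("my-project", "Person", "1",
            new String[] {"name", "age"}, new String[] {"Alice", "30"}));
    }
}
```

    Storing every column as a string keeps the sketch short; numeric or boolean columns would need the matching value type from the Entity format instead.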

    I recommend going with the first option, since the approach in the linked post looks sound and its answer was accepted.

    If that does not work, you can always try the Dataflow template to load the data from Google Cloud Storage into Cloud Datastore.

    I hope it helps.