I have been searching various sources, but the answer is not clear to this newbie. How do I load data (a CSV file) from Cloud Storage into Cloud Datastore from an App Engine PHP application? My existing method downloads the file and then loads each row as its own transaction. It takes a few hours for a few million rows, so this does not seem like the best approach, and I have been looking for a more efficient one. I appreciate any guidance.
Edit: I have switched to loading JSON data from a remote URL into Datastore from GAE. The code is not working, though I do not know why yet:
<?php
require 'vendor/autoload.php';
use Google\Auth\ApplicationDefaultCredentials;
use Google\Cloud\Datastore\DatastoreClient;
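/**
 * Build a product entity for a given SKU and queue it on a transaction.
 *
 * @param DatastoreClient $datastore
 * @param \Google\Cloud\Datastore\Transaction $transaction
 * @param string $sku
 * @param string $product Product name.
 * @return Google\Cloud\Datastore\Entity
 */
function add_product(DatastoreClient $datastore, \Google\Cloud\Datastore\Transaction $transaction, $sku, $product)
{
    $productKey = $datastore->key('SKU', $sku);
    $entity = $datastore->entity(
        $productKey,
        [
            'created' => new DateTime(),
            'name' => strtolower($product)
        ]);
    // Queue the write on the transaction; nothing is sent until commit().
    $transaction->upsert($entity);
    return $entity;
}
/**
 * Load a Cloud Datastore kind from a remote URL.
 *
 * @param string $projectId
 * @param string $url
 */
function load_datastore($projectId, $url) {
    // Create Datastore client
    $datastore = new DatastoreClient(['projectId' => $projectId]);
    // `allow_url_fopen` cannot be changed with ini_set() at runtime;
    // it has to be enabled in php.ini for file_get_contents() to read a URL.
    $contents = file_get_contents($url);
    if ($contents === false) {
        throw new RuntimeException("Could not read $url");
    }
    // Assumes the JSON decodes to a map of SKU => product name;
    // adjust the loop below if the file is a list of objects instead.
    $json = json_decode($contents, true);
    if (!is_array($json)) {
        throw new RuntimeException('Could not decode JSON: ' . json_last_error_msg());
    }
    // Write the products in transactions of 20 entities each.
    // array_chunk() also covers the final, partial batch.
    foreach (array_chunk($json, 20, true) as $batch) {
        $transaction = $datastore->transaction();
        try {
            foreach ($batch as $sku_key => $product_val) {
                add_product($datastore, $transaction, $sku_key, $product_val);
            }
            $transaction->commit();
        } catch (Exception $err) {
            echo 'Caught exception: ', $err->getMessage(), "\n";
            $transaction->rollback();
        }
    }
}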
try
{
    $projectId = 'development';
    $url = 'https://raw.githubusercontent.com/BestBuyAPIs/open-data-set/master/products.json';
    load_datastore($projectId, $url);
} catch (Exception $err) {
    // No transaction exists in this scope, so there is nothing to roll back here.
    echo 'Caught exception: ', $err->getMessage(), "\n";
}
?>
This question is similar to "Import CSV into google cloud datastore" and "Google Cloud Datastore: Bulk Importing w Node.js".
The quick answer is that you can use Apache Beam or Cloud Dataflow to import CSV data into Cloud Datastore.
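If the load has to stay inside the PHP application rather than move to a Beam pipeline, grouping rows into larger non-transactional batches should cut down the number of round trips considerably. The following is only a rough sketch under some assumptions: `csv_chunks` and `load_csv` are hypothetical helper names, the CSV is assumed to have two columns (SKU, name), and the batch size of 500 is based on Datastore's per-commit entity limit as I understand it. It uses `DatastoreClient::upsertBatch()` from the google/cloud-datastore library.

```php
<?php
use Google\Cloud\Datastore\DatastoreClient;

/**
 * Yield rows from an open CSV stream in chunks of at most $size rows.
 * Hypothetical helper, not part of any Google library.
 */
function csv_chunks($handle, int $size): Generator
{
    $chunk = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() returns [null] for a blank line
        }
        $chunk[] = $row;
        if (count($chunk) === $size) {
            yield $chunk;
            $chunk = [];
        }
    }
    if ($chunk !== []) {
        yield $chunk; // final, partial chunk
    }
}

/**
 * Upsert every row of a two-column CSV (sku,name) in batches.
 * The column layout is an assumption made for illustration.
 */
function load_csv(DatastoreClient $datastore, $handle, int $size = 500): int
{
    $total = 0;
    foreach (csv_chunks($handle, $size) as $chunk) {
        $entities = [];
        foreach ($chunk as [$sku, $name]) {
            $entities[] = $datastore->entity(
                $datastore->key('SKU', $sku),
                ['created' => new DateTime(), 'name' => strtolower($name)]
            );
        }
        $datastore->upsertBatch($entities); // one request per chunk
        $total += count($entities);
    }
    return $total;
}
```

`$handle` can be any readable stream, so the file does not need to be held in memory all at once. As far as I know, the google/cloud-storage library can register a `gs://` stream wrapper so that `fopen()` reads directly from a bucket, but check that against the library documentation before relying on it.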