python, google-app-engine, bulkloader

Uploading data with bulkloader


In short: how can I configure bulkloader to insert data into two models, one of which references the other?

I have a Person class and a Fruit class, with Person referencing Fruit:

from google.appengine.ext import db

class Fruit(db.Model):
    name = db.StringProperty()

class Person(db.Model):
    name = db.StringProperty()
    # Reference to the Fruit entity this person is linked to
    customer = db.ReferenceProperty(Fruit)

And I want to upload this CSV data:

Name,Fruit
Bob,Banana
Joe,Apple
Tim,Banana

I tried using create_foreign_key as in the docs:

transformers:

- kind: fruit
  connector: csv
  property_map:
    - property: fruit
      external_name: Fruit

- kind: person
  connector: csv
  connector_options:
    encoding: utf-8
    columns: from_header
  property_map:
    - property: title
      external_name: Name
    - property: fruit
      external_name: Fruit
      import_transform: transform.create_foreign_key('fruit')

When I run the command:

appcfg.py upload_data --config_file=bulkloader.yaml --filename=food.csv --kind=person .

The Person entities are uploaded and they have foreign keys for the fruit, but the Fruit entities they point to do not exist.

When I try --kind=fruit, the Fruit entities are uploaded, but there are many duplicates.

I am trying to link each person to a fruit, with no duplicate fruit. Is this possible through bulkloader?


Solution

  • I didn't figure out how to do this cleanly, so I ended up splitting my data into multiple files and pregenerating the IDs.
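
A minimal sketch of that splitting step, assuming a local Python 3 preprocessing script run before the upload (it is not part of bulkloader). The function names, the output column names `ID` and `FruitID`, and the sequential integer IDs are all hypothetical; the answer above does not say how the IDs were actually generated.

```python
import csv


def split_people_and_fruit(rows):
    """Split combined (Name, Fruit) rows into fruit rows and person rows.

    Each distinct fruit gets exactly one pregenerated integer ID (assumed
    scheme: 1, 2, 3, ... in order of first appearance). Person rows refer
    to their fruit by that ID, so no fruit is ever emitted twice.
    """
    fruit_ids = {}    # fruit name -> pregenerated ID
    fruit_rows = []
    person_rows = []
    for row in rows:
        fruit = row["Fruit"]
        if fruit not in fruit_ids:
            fruit_ids[fruit] = len(fruit_ids) + 1
            fruit_rows.append({"ID": fruit_ids[fruit], "Name": fruit})
        person_rows.append({"Name": row["Name"], "FruitID": fruit_ids[fruit]})
    return fruit_rows, person_rows


def write_csv(path, fieldnames, rows):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def split_file(src_path, fruit_path, person_path):
    """Read the combined CSV and write one file per kind."""
    with open(src_path, newline="") as f:
        fruit_rows, person_rows = split_people_and_fruit(csv.DictReader(f))
    write_csv(fruit_path, ["ID", "Name"], fruit_rows)
    write_csv(person_path, ["Name", "FruitID"], person_rows)
```

Calling `split_file("food.csv", "fruit.csv", "person.csv")` would produce one file per kind, each of which could then be uploaded with its own `--kind`, fruit first. The bulkloader.yaml would still need adjusting so that the fruit kind takes its key from the `ID` column and the person kind builds its reference from `FruitID`; that mapping is not shown here.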