Search code examples
pythonjsonpandasgoogle-app-enginegoogle-cloud-platform

converting google datastore query result to pandas dataframe in python


I need to convert a Google Cloud Datastore query result to a dataframe, to create a chart from the retrieved data. The query:

def fetch_times(limit):
    start_date = '2019-10-08'
    end_date = '2019-10-19'
    query = datastore_client.query(kind='ParticleEvent')
    query.add_filter(
        'published_at', '>', start_date)
    query.add_filter(
        'published_at', '<', end_date)
    query.order = ['-published_at']
    times = query.fetch(limit=limit)
    return times

creates a json like string of the results for each entity returned by the query:

  • Entity('ParticleEvent', 5942717456580608) {'gc_pub_sub_id': '438169950283983', 'data': '605', 'event': 'light intensity', 'published_at': '2019-10-11T14:37:45.407Z', 'device_id': 'e00fce6847be7713698287a1'}>

Thought I found something that would translate to json which I could convert to dataframe, but get an error that the properties attribute does not exist:

def to_json(gql_object):
    result = []
    for item in gql_object:
        result.append(dict([(p, getattr(item, p)) for p in item.properties()]))
    return json.dumps(result, cls=JSONEncoder)

Is there a way to iterate through the query results to get them into a dataframe either directly to a dataframe or by converting to json then to dataframe?


Solution

  • Datastore entities can be treated as Python base dictionaries! So you should be able to do something as simple as...

    df = pd.DataFrame(datastore_entities)
    

    ...and pandas will do all the rest.

    If you needed to convert the entity key, or any of its attributes to a column as well, you can pack them into the dictionary separately:

    for e in entities:
        e['entity_key'] = e.key
        e['entity_key_name'] = e.key.name  # for example
    
    df = pd.DataFrame(entities)