Search code examples
pythonarraysjsongoogle-cloud-platformgoogle-cloud-spanner

Spanner SELECT statement as json array in Python


I am trying to run a select statement in spanner which returns multiple rows and put that result into json format.

Here is the code:

def fetchmessages(self, pushid):
            rc = ResultCode.SUCCESS
            messages = ""

            def selectmessages(transaction):
                    messages = transaction.execute_update(
                    "SELECT Message.message_text FROM Message LEFT JOIN MessageStatus "
                    "ON (Message.message_id = MessageStatus.message_id) WHERE "
                    "MessageStatus.push_id = @id",
                    params={'id' : (pushid)},
                    param_types={'id' : param_types.STRING}
            )

            try:
                    self.client.run_in_transaction(selectmessages)
            except Exception as fetchexception:
                    rc = ResultCode.ERR_NO_MSG_FOUND
                    self.logger.debug(fetchexception)
                    pass

            if ResultCode.SUCCESS:
                    output = "{ 'pushid':'" + pushid + "', 'messages':" + messages + ", 'resultcode':" + str(rc.value) + "}"

            return output

The calling function just runs the output through jsonify and prints it out. The above code prints out:

"{ 'pushid':'cdd92f4ce847efa5c7f', 'messages':, 'resultcode':1}"

I want it to print out something like this:

"{ 'pushid':'cdd92f4ce847efa5c7f', 'Messages': [{'message': 'foo bar'}, { 'message': 'core dump'}], 'resultcode': 1}"

How can I do this ?


Solution

    1. There's an issue in your JOIN statement:

      • each push_id contains several message_id, so both tables should be joined on push_id, not message_id
    2. Potential relationship problem between tables Message and MessageStatus:

      • push_id should be the primary key of MessageStatus, and MessageStatus the parent of Message, check if DDL is correctly set with this example:

    DDL:

    CREATE TABLE MessageStatus (
    push_id string(50) NOT NULL
    ) PRIMARY KEY (push_id);
    
    
    CREATE TABLE Message (
    push_id string(50), NOT NULL,
    message_id string(50) NOT NULL,
    message_text string(50)
    ) PRIMARY KEY (push_id, message_id),
    INTERLEAVE IN PARENT MessageStatus ON DELETE CASCADE;
    

    Query script:

    from google.cloud import spanner
    
    def fetch_messages(instance_id, database_id):
        spanner_client = spanner.Client()
        instance = spanner_client.instance(instance_id)
        database = instance.database(database_id)
    
        with database.snapshot() as snapshot:
            results = snapshot.execute_sql(
                'SELECT Message.push_id, Message.message_id, Message.message_text FROM Message LEFT JOIN MessageStatus '
                'ON MessageStatus.push_id = Message.push_id WHERE MessageStatus.push_id="JDHK65FTRD832KJ"')
            for row in results:
                print(u'push_id: {}, message_id: {}, message_text: {}'.format(*row))
    
    
    fetch_messages([INSTANCE_ID], [DATABASE_ID])
    

    Result:

    push_id: JDHK65FTRD832KJ, message_id: 1, message_text: First
    push_id: JDHK65FTRD832KJ, message_id: 2, message_text: Second
    push_id: JDHK65FTRD832KJ, message_id: 3, message_text: Third
    

    Resources on GCP: