Started receiving this error when my cloud function is under heavy load, being triggered more than 5 times/sec.
Error: 9 FAILED_PRECONDITION: This transaction has been invalidated by
a later transaction in the same session.
I'm running a transaction both reading and writing with nodejs client.
The documentation says:
Operation was rejected because the system is not in a state required
for the operation's execution.
The transaction finnishes, and saves the data to Spanner, so transaction is not rejected. Anyone has a clue what this means?
Here's my code:
'use strict';
const { Spanner } = require('@google-cloud/spanner');
const constants = require('./constants');
const tableNames = constants.tableNames;
const spanner = new Spanner({ projectId: constants.projectId });
exports.spannerSync = async (data, context) => {
const instance = spanner.instance(constants.instanceId);
const database = instance.database(constants.databaseId);
try {
let profileExists;
const [qOneRows] = await exists(attrA, attrB, database);
if (qOneRows.length !== 0 && qOneRows !== undefined) {
profileExists = true;
}
upsertNewProfile(attrA, attrB, profileExists, database);
} catch (error) {
console.log(error);
}
};
function exists(attrA, attrB, database) {
const query = {
sql: {
DML_PROFILE_EXISTS
},
params: {
attr: attrA,
attrB: attrB
}
};
return database.run(query);
}
function deleteProfile(attrA, attrB, transaction) {
const query = {
sql: DML_DELETE_PROFILE,
params: {
attr: attrA,
attrB: attrB
}
};
return transaction.runUpdate(query);
}
function existsInTableB(attrB, attrA, database) {
const query = {
sql: EXISTS_DML_QUERY,
params: {
attrA: attrA,
attrB: attrB
}
}
return database.run(query);
}
async function upsertNewProfile(
attrA, attrB,
profileExists,
database
) {
let inActive = attrA.status;
database.runTransaction(async (err, transaction) => {
if (err) {
console.error('Error running transaction: ' + err);
return;
}
if (profileExists) {
try {
const deleted = await deleteProfile(attrA, attrB,
transaction);
} catch (error) {
console.log("ERROR " + error);
}
}
transaction.upsert(tableNames.A, attrA);
transaction.upsert(tableNames.B, attrB);
transaction.upsert(tableNames.C, attrB.attrA);
transaction.upsert(tableNames.D, attrB.attrB);
transaction.upsert(tableNames.E, attrB.attrC);
transaction.upsert(tableNames.F, attrB.attrD);
transaction.upsert(tableNames.G, attrB.attrE);
transaction.upsert(tableNames.H, attrB.attrF);
transaction.upsert(tableNames.I, attrB.attrG);
transaction.upsert(tableNames.J, attrB.attrH);
transaction.upsert(tableNames.F, attrB.attrI);
if (inActive) {
let [pidExistInBL] = await existsInTableB(attrA, attrB, database);
if (pidExistInBL.length == 0) {
transaction.upsert(tableNames.B, getObject(attrA, attrB));
}
}
try {
transaction.commit((error) => {
if (!error) {
console.log("Tables where successfully updated");
} else {
console.log("ERROR " + error);
}
});
} catch (error) {
console.log("ERROR " + error);
} finally {
transaction.end();
}
});
}
Sorry that try/catch got lost when writing the code to SO. It's there in the code and I catch the error, but am not rolling back the transaction if deleteProfile fails.
There are also some other details I didn't mention 😊. There's another error that appeared as well, saying that I've reached my quota limit of cpu for cloud functions, and the function is set to retry, (I've since extended the quota limit) and the problem went away.
So what I suspect is happening is that when the cloud function reaches the quota limit, and deleteProfile fails, the functions retries over and over. Because the number of executions and Spanner sessions skyrocket when reaching the quota limit. Still a bit confused, if it's retrying and opening more sessions and then the transaction in every cloud function is retying the query, would that be a reasonable explanation for seeing spikes in Spanner CPU-utilization at the same time?
Then I should also apply a roll-back of the transaction in the catch clause around deleteProfile right?
Cloud Spanner can only execute one transaction at a time on a session. The error message you are getting indicates that you have started a second transaction on a session that already had an active transaction.
The error message is returned when you try to execute a statement/query on the first transaction that is no longer valid. The second transaction will normally succeed.
It sounds like you are running into some kind of race condition / session leak problem. Do you have a code example or some more details of the code that produces this error?
EDIT after code sample
I wonder whether this problem might occur if the
const deleted = await deleteProfile(attrA, attrB, transaction)
fails, as that would terminate the transaction without explicitly ending the transaction. If it is a retryable error, a new transaction will automatically be retried on the same session. As the initial transaction has not explicitly been ended, that transaction could generate the error message you describe, while the retryed transaction would succeed.
You could try to put that statement in a similar try-catch-block as your commit and see if that solves it.