node.js, google-cloud-platform, google-cloud-pubsub

Google Cloud Function doesn't publish on PubSub, Timeout exceeded


On GCP (Google Cloud Platform), I have a database hosted on Cloud SQL.

I have two Cloud functions:

  1. The first queries the database (the query can return more than 1000 rows), then publishes each row to Pub/Sub.
  2. The second does web scraping with Puppeteer.

First cloud function's code:

//[Requirement]
const mysql = require('mysql')
const {SecretManagerServiceClient} = require('@google-cloud/secret-manager')
const ProjectID = process.env.secretID
const SqlPass = `projects/xxx`
const client = new SecretManagerServiceClient()
const {PubSub} = require('@google-cloud/pubsub');
const pubSubClient = new PubSub();
const topicName = "xxxxxx";
//[/Requirement]

exports.LaunchAudit = async () => {
    const dbSocketPath = "/cloudsql"
    const DB_USER = "xxx"
    const DB_PASS = await getSecret()
    const DB_NAME = "xxx"
    const CLOUD_SQL_CONNECTION_NAME = "xxx"
  
    //[SQL CONNEXION]
    let pool = mysql.createPool({
      connectionLimit: 1,
      socketPath: `${dbSocketPath}/${CLOUD_SQL_CONNECTION_NAME}`,
      user: DB_USER,
      password: DB_PASS,
      connectTimeout: 500,
      database: DB_NAME
    })
    //[/SQL CONNEXION]
    //set the request
    let sql = `select * from * where *;`
      //make the setted request 
    await pool.query(sql, async (e,results) => {
      //if there is an error send it
        if(e){
          throw e
        }
        //for each result of the query, log it and publish on PubSub ("Audit-property" topic)
        results.forEach(async element => {
          console.log(JSON.stringify(element))
          await msgPubSub(JSON.stringify(element))
        })
    })    
}

async function msgPubSub(data){
  //console.log(data)
  const messageBuffer = Buffer.from(data)
  try {
    const topicPublisher = await pubSubClient.topic(topicName).publish(messageBuffer)
    console.log("Message id: " + topicPublisher)
  } catch (error) {
    console.error(`Error while publishing message: ${error.message}`)
  }
}

Problems:

  1. First, when it works, it takes a long time to publish the first message to the Pub/Sub topic, something like 6 minutes. Why is there this delay?

  2. Second, when I run a large query (something like 500+ results), I get a timeout error:

    Total timeout of API google.pubsub.v1.Publisher exceeded 600000 milliseconds before any response was received.
    

I've tried publishing batched messages, adding memory to the Cloud Functions, and using google-gax, but got the same result.

I'm using Node.js v10.

Second cloud function's message part code:

exports.MainAudit =  async message => {
    const property = Buffer.from(message.data, 'base64').toString()
    const pProperty = JSON.parse(property)
    console.log(property)
}

package.json dependencies:

  "dependencies": {
    "@google-cloud/pubsub": "^2.6.0",
    "@google-cloud/secret-manager": "^3.2.0",
    "google-gax": "^2.9.2",
    "mysql": "^2.18.1",
    "node-fetch": "^2.6.1"
  }

Log + Timestamp: log


Solution

  • As the code is now, you are creating a new instance of a publisher for each message you publish. This is because pubSubClient.topic(topicName) creates an instance for publishing to the topic. Therefore, you are paying the overhead of establishing a connection for each message you send. Instead, you'd want to create that object a single time and reuse it:

    const pubSubClient = new PubSub();
    const topicName = "xxxxxx";
    const topicPublisher = pubSubClient.topic(topicName)
    

    However, this still leaves an inefficiency in your application: because of the await on the publish call and on the call to msgPubSub, you are waiting for each message to publish before starting the next one. The Pub/Sub client library can batch messages together for more efficient sending, but to take advantage of that you need to allow multiple publish calls to be outstanding at once. Collect the promises returned by publish and await a single Promise.all over that list.
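
The two suggestions above can be sketched roughly as follows. This is a minimal illustration rather than a drop-in fix: `publishAll` and `queryRows` are hypothetical helper names, and `topic` is assumed to be the object returned by `pubSubClient.topic(topicName)`, created once at module scope. The sketch uses `map` instead of `forEach` because `forEach` ignores the promises returned by its callback. Wrapping the callback-style `pool.query` in a promise goes beyond what the answer states; it is included on the assumption that the function should not return until every publish has settled.

```javascript
// Hypothetical helper: start every publish without awaiting in between,
// so the client library is free to batch the outstanding messages.
// `topic` is assumed to expose publish(Buffer) returning a Promise of a message ID.
async function publishAll(topic, rows) {
  const pending = rows.map(row =>
    topic.publish(Buffer.from(JSON.stringify(row)))
  );
  // Resolves with the list of message IDs, or rejects on the first failure.
  return Promise.all(pending);
}

// Hypothetical helper: adapt the callback-style pool.query of the mysql
// package to a promise so it can be awaited.
function queryRows(pool, sql) {
  return new Promise((resolve, reject) => {
    pool.query(sql, (err, results) => (err ? reject(err) : resolve(results)));
  });
}

// Possible wiring inside the Cloud Function (not executed here):
//
//   const topicPublisher = pubSubClient.topic(topicName); // created once
//   exports.LaunchAudit = async () => {
//     const rows = await queryRows(pool, sql);
//     const ids = await publishAll(topicPublisher, rows);
//     console.log(`Published ${ids.length} messages`);
//   };
```

Because the helpers take the topic and pool as arguments, they can be exercised with simple stand-in objects before being pointed at the real Pub/Sub client and Cloud SQL pool.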