I'm using the Firebase Admin SDK inside a Firebase Function to list the contents of a directory on Cloud Storage. However, my function quite often takes more than 5 seconds to respond. At first I thought this was entirely due to cold starts of the function itself, and I've tried several things to prevent them, such as:
- setting minInstances to 1 or more
- changing the region (from us-central1 to europe-west3)
However, none of the above worked, and I still get slow response times when the function is called for the first time. After adding some logging inside the function, it turns out that the Admin SDK takes over a second to get a single value from Realtime Database, and it usually takes over 5 seconds to run the getFiles command on Cloud Storage (the directory usually contains only a single file).
For example, for the code below I get the following console output (first a cold invocation, then a warm one):
listVideos: coldstart true
listVideos: duration 1: 0ms
listVideos: duration 2: 1302ms (realtime database)
listVideos: duration 3: 6505ms (getFiles on cloud storage)
listVideos: coldstart false
listVideos: duration 1: 0ms
listVideos: duration 2: 96ms (realtime database)
listVideos: duration 3: 199ms (getFiles on cloud storage)
My function looks like this:
    import * as functions from "firebase-functions";
    import * as admin from "firebase-admin";

    admin.initializeApp();

    // Module-level flag: true only for the first invocation on a fresh instance.
    let coldStart = true;

    exports.listVideos = functions.region("europe-west1").runWith({
      memory: "128MB",
      minInstances: 1,
    }).https.onCall(async (data, context) => {
      console.log("coldstart", coldStart);
      coldStart = false;
      const t1 = new Date().getTime();
      if (context.auth) {
        const authUid = context.auth.uid;
        console.log(`duration 1: ${new Date().getTime() - t1}ms`);
        // Single read from Realtime Database.
        const level = (await admin.database().ref(`users/${authUid}/`).once("value")).val();
        console.log(`duration 2: ${new Date().getTime() - t1}ms (realtime database)`);
        // List the objects whose names start with the user's video prefix.
        const [files] = await admin.storage().bucket()
          .getFiles({
            prefix: `${authUid}/video`,
          });
        console.log(`duration 3: ${new Date().getTime() - t1}ms (getFiles on cloud storage)`);
        return {status: "success", files: files};
      } else {
        return {status: "error - not authenticated"};
      }
    });
I know I can't expect 0ms latency, but for a simple getFiles call I'd expect something under 1 second, just like when the SDK is "warm" (considering that my whole bucket has fewer than 1000 files and the directory that I'm listing has only 1 file in it).
Cloud Storage isn't a database and is not optimized for queries (getFiles is effectively a query against the entire bucket using the prefix of the names of objects). It's optimized for storing and retrieving blobs of data at massive scale when you already know the name of the object.
If you want to list files quickly, consider storing the metadata of the files in a database that is optimized for the type of query that you want to perform, and link those records to your storage as needed. This is a fairly common practice in GCP projects.
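As a rough illustration of that practice, here is a minimal sketch of a per-user metadata index. Everything in it is an assumption made for the example: the `VideoRecord` shape, the `VideoIndex` class, and the IDs are hypothetical, and an in-memory `Map` stands in for the real database so the snippet runs on its own. The point is only that listing becomes a keyed lookup on the user's ID rather than a prefix listing on the bucket.

```typescript
// Hypothetical metadata record for one uploaded video. The field names are
// illustrative assumptions, not part of any Firebase or Cloud Storage API.
interface VideoRecord {
  storagePath: string; // full object name in the bucket
  sizeBytes: number;
  uploadedAt: number; // epoch milliseconds
}

// In-memory stand-in for a database keyed by user ID, then by video ID.
// A real database client would be asynchronous; this sketch is synchronous
// only to stay self-contained.
class VideoIndex {
  private readonly byUser = new Map<string, Map<string, VideoRecord>>();

  // Record a video at the point where the upload is handled.
  register(uid: string, videoId: string, record: VideoRecord): void {
    let videos = this.byUser.get(uid);
    if (videos === undefined) {
      videos = new Map<string, VideoRecord>();
      this.byUser.set(uid, videos);
    }
    videos.set(videoId, record);
  }

  // Remove the record when the object is deleted, so the index stays in sync.
  unregister(uid: string, videoId: string): void {
    this.byUser.get(uid)?.delete(videoId);
  }

  // Listing is a lookup by user ID; the bucket is not listed at all.
  list(uid: string): VideoRecord[] {
    const videos = this.byUser.get(uid);
    return videos === undefined ? [] : Array.from(videos.values());
  }
}

// Usage with made-up IDs and paths.
const index = new VideoIndex();
index.register("user123", "v1", {
  storagePath: "user123/video/clip.mp4",
  sizeBytes: 1048576,
  uploadedAt: Date.now(),
});
console.log(index.list("user123").map((r) => r.storagePath));
```

In the function from the question, the same idea would presumably map onto Realtime Database calls such as `admin.database().ref(path).set(record)` when a video is uploaded and `ref(path).once("value")` when listing, with the stored `storagePath` used to fetch the object by name when needed. The database path layout is left open here because it depends on the rest of the data model.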