We have a bucket containing more than 40 million objects. We would like to read each object, extract particular data from a column, and store the results as separate files.
How can we list all of these objects and process each one? Can we get the list of objects through Pub/Sub?
You haven't given many details, so please specify the following: 1. Which language you are using. 2. What type of data you want to extract, and where and how you want to store it (including any specific file format).
Meanwhile, here is a simple code snippet.
const functions = require('firebase-functions');
const { Storage } = require('@google-cloud/storage');

const storage = new Storage();

exports.iterateThroughFiles = functions.https.onRequest(async (req, res) => {
  try {
    const bucketName = 'your-bucket-name';
    const bucket = storage.bucket(bucketName);
    // Note: without options, getFiles() collects the whole listing into one array.
    const [files] = await bucket.getFiles();
    files.forEach((file) => {
      // Perform operations on each file
      console.log(`File name: ${file.name}`);
    });
    res.status(200).send('Files iteration completed successfully.');
  } catch (error) {
    console.error('Error iterating through files:', error);
    res.status(500).send('An error occurred while iterating through files.');
  }
});
If this is all you need, then fine, but for a more specific approach I would need more clarification.
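With 40+ million objects, holding the whole listing in memory at once is unlikely to work, so here is a minimal sketch of walking the bucket one page at a time. It assumes the Node.js Cloud Storage client, where passing autoPaginate: false to getFiles returns a single page plus a query for the next one. The helper name forEachPage and the handlePage callback are hypothetical, and the page size is only an example.

```javascript
// Sketch: visit a bucket page by page instead of loading every object at once.
// `bucket` is expected to behave like a @google-cloud/storage Bucket.
// forEachPage and handlePage are hypothetical names used for illustration.
async function forEachPage(bucket, pageSize, handlePage) {
  let query = { autoPaginate: false, maxResults: pageSize };
  let total = 0;
  while (query) {
    // With autoPaginate disabled, getFiles resolves to [files, nextQuery, apiResponse].
    const [files, nextQuery] = await bucket.getFiles(query);
    await handlePage(files);
    total += files.length;
    // nextQuery is falsy after the last page; keep auto-pagination switched off.
    query = nextQuery ? { ...nextQuery, autoPaginate: false } : null;
  }
  return total;
}

// Possible usage inside an async function:
//   await forEachPage(storage.bucket('your-bucket-name'), 1000, async (files) => {
//     files.forEach((file) => console.log(`File name: ${file.name}`));
//   });
```

Because the bucket is passed in as an argument, the same loop can be tried against a small test bucket before pointing it at the full one.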
Of course you can use Pub/Sub, but
if the task is long-running, I would suggest using App Engine or Compute Engine, since a Firebase function is not suitable: it would exceed the maximum execution time limit.
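As for getting the list into Pub/Sub: Cloud Storage's own Pub/Sub notifications fire on object changes, so they would not by themselves enumerate objects that already exist. One option is to publish the object names yourself and let workers subscribe to them. The sketch below assumes the Node.js Pub/Sub client and its topic.publishMessage call. The helper name publishNames, the batch size, and the topic name in the comment are hypothetical.

```javascript
// Sketch: publish object names to a Pub/Sub topic in batches.
// `topic` is expected to behave like a @google-cloud/pubsub Topic.
// publishNames is a hypothetical helper name used for illustration.
async function publishNames(topic, names, batchSize) {
  const messageIds = [];
  for (let i = 0; i < names.length; i += batchSize) {
    const batch = names.slice(i, i + batchSize);
    // Pub/Sub payloads are bytes, so each batch is serialized as JSON.
    const data = Buffer.from(JSON.stringify(batch));
    messageIds.push(await topic.publishMessage({ data }));
  }
  return messageIds;
}

// Possible wiring inside an async function (the topic name is hypothetical):
//   const { PubSub } = require('@google-cloud/pubsub');
//   const topic = new PubSub().topic('object-names');
//   await publishNames(topic, files.map((file) => file.name), 500);
```

Each message then carries a batch of names, so a subscriber running on App Engine or Compute Engine can read those objects, extract the column, and write the output files without being bound by the function timeout.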