I have a folder of Parquet files.
How can I read them all and combine them into one big text file?
I am using the parquetjs library, which lets me read a single file like this:
const parquet = require('parquetjs');

(async () => {
  // Create a new ParquetReader that reads from 'fruits.parquet'
  let reader = await parquet.ParquetReader.openFile('fruits.parquet');
  // Create a new cursor
  let cursor = reader.getCursor();
  // Read all records from the file and print them
  let record = null;
  while ((record = await cursor.next())) {
    console.log(record);
  }
  // Release the file handle once all records have been read
  await reader.close();
})();
I need help reading several files at once and combining them.
You can try the following:
- Convert the code into an async function that takes a filename parameter. Make the function return the records.
- Put the filenames in an array.
- Use Array.map to transform the filename array into a Promise array.
- Use Promise.all to wait for all files to be read.
- Use String.join to combine all the records into one string.

Convert the code into an async function that takes a filename parameter:
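const parquet = require('parquetjs');

// Read every record from one Parquet file and return them as text,
// one JSON-encoded record per line.
const readFile = async (filename) => {
  const reader = await parquet.ParquetReader.openFile(filename);
  const cursor = reader.getCursor();
  const lines = [];
  let record = null;
  while ((record = await cursor.next())) {
    // Records are plain objects, so serialize them explicitly;
    // concatenating them onto a string would only yield "[object Object]".
    lines.push(JSON.stringify(record));
  }
  await reader.close();
  return lines.join('\n');
};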
const filenames = ['f1.parquet', 'f2.parquet', 'f3.parquet'];
// Start reading every file; Promise.all resolves with the results
// in the same order as the filenames array
const readPromises = filenames.map(f => readFile(f));
const allPromises = Promise.all(readPromises);
// Read and combine
allPromises
  .then(contentsArray => contentsArray.join('\n'))
  .then(joinedContent => console.log(joinedContent))
  .catch(console.error);
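
The question also mentions a whole folder and a text file as output, which the snippet above leaves out. Below is a minimal sketch of those two steps, assuming Node's built-in fs/promises and path modules. The names listParquetFiles and combineToTextFile are hypothetical helpers made up for this illustration, and readOne stands for any async function with the same shape as readFile above (filename in, string out):

```javascript
const fs = require('fs/promises');
const path = require('path');

// List the .parquet files directly inside `dir`, sorted by name.
// (Hypothetical helper; subfolders are not searched.)
async function listParquetFiles(dir) {
  const entries = await fs.readdir(dir, { withFileTypes: true });
  return entries
    .filter((e) => e.isFile() && path.extname(e.name).toLowerCase() === '.parquet')
    .map((e) => path.join(dir, e.name))
    .sort();
}

// Read every file with `readOne`, join the results with newlines,
// and write them to `outPath`. Resolves with the number of files combined.
async function combineToTextFile(dir, outPath, readOne) {
  const filenames = await listParquetFiles(dir);
  const contents = await Promise.all(filenames.map((f) => readOne(f)));
  await fs.writeFile(outPath, contents.join('\n'), 'utf8');
  return filenames.length;
}
```

It could then be called as combineToTextFile('./data', 'combined.txt', readFile).catch(console.error), where './data' and 'combined.txt' are placeholder paths. Note that this holds all the content in memory before writing, so for very large folders it may be better to process the files one at a time and append to the output instead.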