I would like to write a JSON object to S3 in Parquet format from an AWS Lambda function (Python).
However, I cannot connect the fastparquet library with boto3: fastparquet has a method that writes into a file, while boto3 expects an object (body) to put into the S3 bucket.
Any suggestions?
fastparquet example
fastparquet.write('test.parquet', df, compression='GZIP', file_scheme='hive')
Boto3 example
client = authenticate_s3()
response = client.put_object(Body=Body, Bucket=Bucket, Key=Key)
Here, Body would correspond to the Parquet content, which would allow writing it to S3.
You can write any dataframe directly to S3 by using the open_with argument of the write method (see fastparquet's docs):
import s3fs
from fastparquet import write

# s3fs supplies the file-open function that fastparquet will use,
# so the "path" below is interpreted as bucket-name/key on S3
s3 = s3fs.S3FileSystem()
myopen = s3.open

write(
    'bucket-name/filename.parq.gzip',
    frame,
    compression='GZIP',
    open_with=myopen,
)