Tags: amazon-s3, apache-kafka, apache-kafka-connect, s3-kafka-connector

Partitioning with key in Kafka Connect S3 sink


Can we partition the output of the S3 sink connector by the record key? How can we configure the connector to keep only the latest 10 records for each key, or only the last 10 minutes of data? Or can we partition by both key and time period?


Solution

You'd need to set store.kafka.keys=true for the S3 sink to store record keys at all (it does not store them by default), but the keys will be written to their own files, separate from the values, within whatever partitioner you've configured.
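A minimal sketch of the relevant connector config, assuming the Confluent S3 sink with JSON output (the topic, bucket, and region values here are placeholders):

    {
      "name": "s3-sink-with-keys",
      "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "topics": "my-topic",
        "s3.bucket.name": "my-bucket",
        "s3.region": "us-east-1",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": "100",
        "store.kafka.keys": "true",
        "keys.format.class": "io.confluent.connect.s3.format.json.JsonFormat"
      }
    }

With this, each flushed value file gets a companion key file written alongside it in the same partition path.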

Otherwise, the FieldPartitioner only looks at the record's value, so you'd need an SMT to copy the record key into the value in order to partition on it; see the sketch below.
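Apache Kafka's built-in transforms don't include a key-to-value copy, so this would be a small custom SMT. A minimal sketch, assuming schemaful (Struct) values, a primitive key, and a hypothetical com.example package:

    package com.example.transforms;

    import java.util.Map;

    import org.apache.kafka.common.config.ConfigDef;
    import org.apache.kafka.connect.connector.ConnectRecord;
    import org.apache.kafka.connect.data.Field;
    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.data.SchemaBuilder;
    import org.apache.kafka.connect.data.Struct;
    import org.apache.kafka.connect.transforms.Transformation;

    // Hypothetical SMT that copies the record key into a "key" field of
    // the value struct so that FieldPartitioner can partition on it.
    public class KeyToValue<R extends ConnectRecord<R>> implements Transformation<R> {

        @Override
        public R apply(R record) {
            Schema valueSchema = record.valueSchema();
            Struct value = (Struct) record.value();

            // Rebuild the value schema with one extra "key" field.
            SchemaBuilder builder = SchemaBuilder.struct().name(valueSchema.name());
            for (Field f : valueSchema.fields()) {
                builder.field(f.name(), f.schema());
            }
            Schema newSchema = builder.field("key", record.keySchema()).build();

            // Copy the existing fields, then append the key.
            Struct newValue = new Struct(newSchema);
            for (Field f : valueSchema.fields()) {
                newValue.put(f.name(), value.get(f));
            }
            newValue.put("key", record.key());

            return record.newRecord(record.topic(), record.kafkaPartition(),
                    record.keySchema(), record.key(), newSchema, newValue,
                    record.timestamp());
        }

        @Override
        public ConfigDef config() {
            return new ConfigDef();
        }

        @Override
        public void configure(Map<String, ?> configs) {
        }

        @Override
        public void close() {
        }
    }

Once it's packaged onto the plugin path, you'd enable it with transforms=keyToValue and transforms.keyToValue.type=com.example.transforms.KeyToValue, then set partitioner.class=io.confluent.connect.storage.partitioner.FieldPartitioner and partition.field.name=key.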

Last I checked, there is still an open PR on GitHub for a combined Field and Time partitioner.


The S3 sink doesn't window or compact any data; it dumps and stores everything. You'll need an external process, such as a Lambda function, to clean up data over time.
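For purely time-based retention, a bucket lifecycle rule is the simplest external cleanup. A minimal sketch, assuming the connector's default topics/<topic>/ prefix; note that lifecycle expiry is day-granular, so a 10-minute window, or keeping only the latest 10 records per key, would still need custom code such as a scheduled Lambda:

    {
      "Rules": [
        {
          "ID": "expire-s3-sink-output",
          "Filter": { "Prefix": "topics/my-topic/" },
          "Status": "Enabled",
          "Expiration": { "Days": 1 }
        }
      ]
    }

You'd apply it with aws s3api put-bucket-lifecycle-configuration --bucket my-bucket --lifecycle-configuration file://lifecycle.json.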