I am running Kafka Streams 3.1.0 on an AWS OCP cluster, and I am facing this error when the pod restarts:
10:33:18,529 [INFO ] Loaded Kafka Streams properties {topology.optimization=all, processing.guarantee=at_least_once, bootstrap.servers=PLAINTEXT://app-kafka-headless.app.svc.cluster.local:9092, state.dir=/var/data/state-store, metrics.recording.level=INFO, consumer.auto.offset.reset=earliest, cache.max.bytes.buffering=10485760, producer.compression.type=lz4, num.stream.threads=3, application.id=AppProcessor}
10:33:18,572 [ERROR] Error changing permissions for the directory /var/data/state-store
java.nio.file.FileSystemException: /var/data/state-store: Operation not permitted
at java.base/sun.nio.fs.UnixException.translateToIOException(Unknown Source)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(Unknown Source)
at java.base/sun.nio.fs.UnixFileAttributeViews$Posix.setMode(Unknown Source)
at java.base/sun.nio.fs.UnixFileAttributeViews$Posix.setPermissions(Unknown Source)
at java.base/java.nio.file.Files.setPosixFilePermissions(Unknown Source)
at org.apache.kafka.streams.processor.internals.StateDirectory.configurePermissions(StateDirectory.java:154)
at org.apache.kafka.streams.processor.internals.StateDirectory.<init>(StateDirectory.java:144)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:867)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:851)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:821)
at org.apache.kafka.streams.KafkaStreams.<init>(KafkaStreams.java:733)
at com.xyz.app.kafka.streams.AbstractProcessing.run(AbstractProcessing.java:54)
at com.xyz.app.kafka.streams.AppProcessor.main(AppProcessor.java:97)
10:33:18,964 [INFO ] Topologies:
Sub-topology: 0
Source: app-stream (topics: [app-app-stream])
--> KSTREAM-AGGREGATE-0000000002
Processor: KSTREAM-AGGREGATE-0000000002 (stores: [KSTREAM-AGGREGATE-STATE-STORE-0000000001])
--> none
<-- app-stream
10:33:18,991 [WARN ] stream-thread [main] Failed to delete state store directory of /var/data/state-store/AppProcessor for it is not empty
On an OCP cluster, the user running the app is assigned by the cluster, and the state store lives on a persistent volume (so the pod can restart with the same state). As a result, the /var/data/state-store/ folder has the permissions drwxrwsr-x. (u:root g:1001030000):
1001030000@app-processor-0:/$ ls -al /var/data/state-store/
total 24
drwxrwsr-x. 4 root 1001030000 4096 Mar 21 10:43 .
drwxr-xr-x. 3 root root 25 Mar 23 11:04 ..
drwxr-x---. 2 1001030000 1001030000 4096 Mar 23 11:04 AppProcessor
drwxrws---. 2 root 1001030000 16384 Mar 21 10:36 lost+found
1001030000@app-processor-0:/$ chmod 750 /var/data/state-store/
chmod: changing permissions of '/var/data/state-store/': Operation not permitted
The relevant parts of the Pod manifest are:
spec:
  containers:
  - name: app-processor
    volumeMounts:
    - mountPath: /var/data/state-store
      name: data
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
  securityContext:
    fsGroup: 1001030000
    runAsUser: 1001030000
    seLinuxOptions:
      level: s0:c32,c19
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-app-processor-0
How should this be handled? Should we use a subPath on the volumeMount?
Thanks for your insights.
As suggested, the fix I found was to set a subPath below the mountPath. Here is the relevant part of the Helm template used:
spec:
  containers:
  - name: app-processor
    volumeMounts:
    - name: data
      mountPath: {{ dir .Values.streams.state_dir | default "/var/data/" }}
      subPath: {{ base .Values.streams.state_dir | default "state-store" }}
Here .Values.streams.state_dir is mapped to the Kafka Streams property state.dir.
Note that this value is mandatory and must be set in the values file.
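To make the split concrete, here is a minimal Java sketch of how a state.dir value such as /var/data/state-store breaks down into a mountPath and a subPath. It only mirrors the idea behind the template's dir and base calls using java.nio.file on a POSIX file system; the class name StateDirSplit and its methods are hypothetical and are not part of Helm or Kafka Streams.

```java
import java.nio.file.Paths;

public class StateDirSplit {

    // Parent directory of state.dir: the part used as mountPath.
    public static String mountPath(String stateDir) {
        return Paths.get(stateDir).getParent().toString();
    }

    // Last path element of state.dir: the part used as subPath.
    public static String subPath(String stateDir) {
        return Paths.get(stateDir).getFileName().toString();
    }

    public static void main(String[] args) {
        // Same value as state.dir in the question (illustrative input).
        String stateDir = "/var/data/state-store";
        System.out.println("mountPath: " + mountPath(stateDir)); // /var/data
        System.out.println("subPath:   " + subPath(stateDir));   // state-store
    }
}
```

The volume is then mounted at the parent directory, and the application only ever sees the subdirectory as its state directory.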
With this setup, the state-store directory is created by the securityContext.runAsUser user instead of root, so the org.apache.kafka.streams.processor.internals.StateDirectory class can set the permissions it requires.
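To check whether a given directory would pass this step before deploying, a small standalone probe can help. The sketch below is not the Kafka Streams implementation. It only makes the same Files.setPosixFilePermissions call that appears in the stack trace, and the rwxr-x--- mode is an assumption based on the drwxr-x--- mode of the AppProcessor directory and the chmod 750 attempt in the question. The class name StateDirPermissionCheck is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

public class StateDirPermissionCheck {

    // Tries to set rwxr-x--- (assumed mode) on the directory.
    // Returns false if the file system is not POSIX or the call is rejected,
    // which is what happens when the process does not own the directory.
    public static boolean tryRestrict(Path dir) {
        if (!dir.getFileSystem().supportedFileAttributeViews().contains("posix")) {
            return false;
        }
        Set<PosixFilePermission> perms = PosixFilePermissions.fromString("rwxr-x---");
        try {
            Files.setPosixFilePermissions(dir, perms);
            return true;
        } catch (IOException e) {
            // "Operation not permitted" surfaces here as a FileSystemException.
            System.err.println("Cannot change permissions of " + dir + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the state directory as the first argument; a temp dir is used otherwise.
        Path dir = args.length > 0
                ? Paths.get(args[0])
                : Files.createTempDirectory("state-store");
        System.out.println("owner: " + Files.getOwner(dir).getName());
        System.out.println("permissions changed: " + tryRestrict(dir));
    }
}
```

Run inside the pod against the mounted path, this should report whether the runAsUser user owns the directory and whether the permission change goes through.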