Tags: docker, kubernetes, apache-kafka, openshift, ksqldb

How to connect your ksqlDB cluster on OpenShift to an on-premise kerberized Kafka cluster


What I want to achieve:
We have an on-premise Kafka cluster. I want to set up ksqlDB in OpenShift and connect it to the brokers of the on-premise cluster.

The problem:
When I try to start the ksqlDB server with the command "/usr/bin/ksql-server-start /etc/ksqldb/ksql-server.properties", I get the following error message:

[2020-05-14 15:47:48,519] ERROR Failed to start KSQL (io.confluent.ksql.rest.server.KsqlServerMain:60)
io.confluent.ksql.util.KsqlServerException: Could not get Kafka cluster configuration!
        at io.confluent.ksql.services.KafkaClusterUtil.getConfig(KafkaClusterUtil.java:90)
        at io.confluent.ksql.security.KsqlAuthorizationValidatorFactory.isKafkaAuthorizerEnabled(KsqlAuthorizationValidatorFactory.java:81)
        at io.confluent.ksql.security.KsqlAuthorizationValidatorFactory.create(KsqlAuthorizationValidatorFactory.java:51)
        at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:624)
        at io.confluent.ksql.rest.server.KsqlRestApplication.buildApplication(KsqlRestApplication.java:544)
        at io.confluent.ksql.rest.server.KsqlServerMain.createExecutable(KsqlServerMain.java:98)
        at io.confluent.ksql.rest.server.KsqlServerMain.main(KsqlServerMain.java:56)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1589471268517) timed out at 1589471268518 after 1 attempt(s)
        at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
        at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
        at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
        at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
        at io.confluent.ksql.services.KafkaClusterUtil.getConfig(KafkaClusterUtil.java:60)
        ... 6 more
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1589471268517) timed out at 1589471268518 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment.

My configuration:
I based my Dockerfile on this image: https://hub.docker.com/r/confluentinc/ksqldb-server; ports 9092, 9093, 8080, 8082, and 443 are open.
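
For context, a minimal local build-and-run of such an image could look like this (image tag and mount path are hypothetical; the start command matches the one above):

# Build the image from the Dockerfile based on confluentinc/ksqldb-server
docker build -t my-ksqldb-server .

# Run it with the server properties mounted and the REST port exposed
docker run --rm -p 8082:8082 \
  -v "$(pwd)/ksql-server.properties:/etc/ksqldb/ksql-server.properties" \
  my-ksqldb-server \
  /usr/bin/ksql-server-start /etc/ksqldb/ksql-server.properties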

My service YAML looks like this:

kind: Service
apiVersion: v1
metadata:
  name: social-media-dev
  namespace: abc
  selfLink: xyz
  uid: xyz
  resourceVersion: '1'
  creationTimestamp: '2020-05-14T09:47:15Z'
  labels:
    app: social-media-dev
  annotations:
    openshift.io/generated-by: OpenShiftNewApp
spec:
  ports:
    - name: social-media-dev
      protocol: TCP
      port: 9092
      targetPort: 9092
      nodePort: 31364
  selector:
    app: social-media-dev
    deploymentconfig: social-media-dev
  clusterIP: XX.XX.XXX.XXX
  type: LoadBalancer
  externalIPs:
    - XXX.XX.XXX.XXX
  sessionAffinity: None
  externalTrafficPolicy: Cluster
status:
  loadBalancer:
    ingress:
      - ip: XX.XX.XXX.XXX
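
As a sanity check on this Service, one can confirm it has backing endpoints and that the ksqlDB REST listener responds once the server is up (the pod name is a placeholder; /info is ksqlDB's server-info endpoint):

# Confirm the Service and its backing endpoints
oc -n abc get svc social-media-dev
oc -n abc get endpoints social-media-dev

# Forward the REST listener port from a running pod and query it
oc -n abc port-forward <pod-name> 8082:8082 &
curl http://localhost:8082/info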

My ksql-server.properties file includes the following information:
listeners=http://0.0.0.0:8082
bootstrap.servers=X.X.X.X:9092,X.X.X.Y:9092,X.X.X.Z:9092
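
Note that the listNodes timeout above comes from the admin-client metadata call, so one way to narrow the problem down is to exercise that same path from inside the pod with Kafka's CLI tools, if they are available in the image (broker address masked as above; client.properties is a hypothetical file holding any security settings):

# Queries broker metadata much like ksqlDB's admin client does
kafka-broker-api-versions --bootstrap-server X.X.X.X:9092

# If the cluster requires SASL/SSL, pass the settings via a config file
kafka-broker-api-versions --bootstrap-server X.X.X.X:9092 \
  --command-config client.properties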

What I have tried so far:

I tried to connect from within my pod to a broker and it worked:
(timeout 1 bash -c '</dev/tcp/X.X.X.X/9092 && echo PORT OPEN || echo PORT CLOSED') 2>/dev/null
result: PORT OPEN

I also played around with the listeners setting, but then the error message just got shorter, containing only "Could not get Kafka cluster configuration!" without the timeout error.

I tried switching the Service type from LoadBalancer to NodePort, but also without success.

Do you have any ideas what I could try next?



Solution

  • UPDATE: With an upgrade to Cloudera CDH6, the Cloudera Kafka cluster now also works with Kafka Streams. Hence I was able to connect from my ksqlDB cluster in OpenShift to the on-premise Kafka cluster.

    I will also describe my final way of connecting to the kerberized Kafka cluster here, as I struggled a lot to get it running:

    1. Getting Kerberos tickets and establishing connections via SSL

    ksql-server.properties (the SASL_SSL part of it):

    security.protocol=SASL_SSL
    sasl.mechanism=GSSAPI
    
    ssl.truststore.location=truststore.jks
    ssl.truststore.password=password
    ssl.truststore.type=JKS
    
    ssl.ca.location=cert
    
    sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="my.keytab" serviceName="kafka" principal="myprincipal";
    sasl.kerberos.service.name=kafka
    
    producer.ssl.endpoint.identification.algorithm=HTTPS
    producer.security.protocol=SASL_SSL
    producer.ssl.truststore.location=truststore.jks
    producer.ssl.truststore.password=password
    producer.sasl.mechanism=GSSAPI
    producer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="my.keytab" serviceName="kafka"  principal="myprincipal";
    
    consumer.ssl.endpoint.identification.algorithm=HTTPS
    consumer.security.protocol=SASL_SSL
    consumer.ssl.truststore.location=truststore.jks
    consumer.ssl.truststore.password=password
    consumer.sasl.mechanism=GSSAPI
    consumer.sasl.jaas.config=com.sun.security.auth.module.Krb5LoginModule required useKeyTab=true storeKey=true keyTab="my.keytab" serviceName="kafka" principal="myprincipal";
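
    Before starting the server, both credential pieces can be verified separately; a minimal sketch assuming the broker CA certificate is available as ca.pem (file names are hypothetical, keytab and principal as above):

    # Import the cluster CA into the truststore referenced above
    keytool -importcert -alias kafka-ca -file ca.pem \
      -keystore truststore.jks -storepass password -noprompt

    # Verify that the keytab actually yields a Kerberos ticket
    kinit -kt my.keytab myprincipal
    klist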
    
    2. Set up the corresponding Sentry rules

    HOST=[HOST]->CLUSTER=kafka-cluster->action=idempotentwrite

    HOST=[HOST]->TRANSACTIONALID=[ID]->action=describe

    HOST=[HOST]->TRANSACTIONALID=[ID]->action=write
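
    With the rules in place, a console producer/consumer round trip makes a simple end-to-end check (topic name and client config file are hypothetical; the config file carries the same SASL_SSL settings as shown above):

    # Produce a test message through the secured cluster
    kafka-console-producer --bootstrap-server X.X.X.X:9092 \
      --producer.config client.properties --topic test-topic

    # Read it back
    kafka-console-consumer --bootstrap-server X.X.X.X:9092 \
      --consumer.config client.properties --topic test-topic --from-beginning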