Search code examples
apache-kafkaapache-kafka-connectksqldb

SQL connector to KSQLDB , how to do it in docker.yml?


I found myself struggling to get documentation updated on how to put SQL server DB to KSQLDB ? I would like to know what would be the easiest way to get the connection toward Azure SQL Server and KSQLDB. I would like to streams CDC tables and send it to an eventhub.

please find below my testing Dockerfile compose.yml:

version: '3.9' # Updated version to ensure compatibility
services:
  zookeeper:

    image: confluentinc/cp-zookeeper:7.4.0

    hostname: zookeeper

    container_name: zookeeper

    ports:

      - "2181:2181"

    environment:

      ZOOKEEPER_CLIENT_PORT: 2181

      ZOOKEEPER_TICK_TIME: 2000

  kafka:

    image: confluentinc/cp-kafka:7.4.0

    hostname: kafka

    container_name: kafka

    depends_on:

      - zookeeper

    ports:

      - "29092:29092"

    environment:

      KAFKA_BROKER_ID: 1

      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'

      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT

      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092

      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1

      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0

      KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1

      KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1

  ksql-server:
    image: confluentinc/ksqldb-server:0.29.0
    hostname: ksqldb-server
    container_name: ksqldb-server
    ports:
      - "8088:8088"
    environment:
      KSQL_LISTENERS: http://0.0.0.0:8088
      KSQL_BOOTSTRAP_SERVERS: kafka:9092
      KSQL_CONNECT_BOOTSTRAP_SERVERS: kafka:9092
      KSQL_CONNECT_PLUGIN_PATH: "/usr/share/kafka/plugins"
      
      KSQL_CONNECT_URL: "jdbc:sqlserver://xxxxx.database.windows.net:4093;database=xxxxx;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;ApplicationIntent=readonly" 
      KSQL_CONNECT_USER: "xxxx"
      KSQL_CONNECT_PASSWORD: "xxxxxx"
      KSQL_KSQL_INTERNAL_TOPIC_REPLICATION_FACTOR: 2

    volumes:
      - "C:/Program\ Files/sqljdbc_12.6.3.0_enu/sqljdbc_12.6/enu/jars:/usr/share/kafka/plugins/mssql"
    depends_on:
      - kafka  # KSQL server depends on the Kafka broker
      - zookeeper # KSQL server also depends on Zookeeper
      - maven  # Add a Maven service dependency

  maven:
    image: maven:3.9.8-eclipse-temurin
    entrypoint: ["sh", "-c", "mvn dependency:copy -DremoteRepositories=https://repo1.maven.org/ -DgroupId=com.microsoft.sqlserver -DartifactId=mssql-jdbc -Dversion=12.6.3.0 -Ddest=/tmp/mssql-jdbc/"]
    volumes:
      - "C:/Program\ Files/sqljdbc_12.6.3.0_enu/sqljdbc_12.6/enu/jars:/usr/share/kafka/plugins/mssql"
  ksqldb-cli:

    image: confluentinc/ksqldb-cli:0.29.0

    container_name: ksqldb-cli

    depends_on:

      - kafka

      - ksql-server


    entrypoint: /bin/sh

    tty: true

Could you please how to get the KSQLDB to connect to Azure SQL Server ? Could you describe how to do it step by step ? How then I use brokers to send to Azure Eventhub ?

-->SQL Server Connectors to my compose.yml docker file
-->Send message to KSQLDB to EventHub


Solution

  • ksqlDB can only query Kafka topics, not databases.

    If you want to send Azure SQL to a Kafka/EventHub topic, you'd use a tool like Debezium or JDBC Source Kafka Connector to connect to the database, and forward events to Kafka, where you would then create a Stream/Table in ksqlDB from those topics.

    The KSQL_CONNECT_URL variable refers to a Kafka Connect server, which is what Debezium is based on, not a database. https://docs.ksqldb.io/en/latest/concepts/connectors/#setup-connect-integration


    To simply stream CDC tables to EventHub, you only need Kafka Connect / Debezium, not ksqlDB