Tags: python, apache-kafka, kafka-consumer-api, kafka-producer-api

How to produce audio (.wav) to Kafka in one message


To apply real-time speech analytics with big data technologies, I'm starting with Kafka. First I convert a .wav file to bytes using the wavio API, then send messages containing [data (a NumPy array), rate (integer), sampwidth (integer)] to Kafka. These messages are then read by a consumer, which converts them back into a .wav file.

The problem: how can I send and receive [data, rate, sampwidth] to and from Kafka in a single message, so that each message represents one .wav file?

For the Producer:

    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers='localhost:9092')
    x = wav2bytes("bush_read")  # returns a tuple (data, rate, sampwidth)
    # Here I'm sending 3 separate messages per .wav file
    producer.send("TestTopic", key=b'data', value=b'%s' % (x[0]))       # data -> NumPy array
    producer.send("TestTopic", key=b'rate', value=b'%d' % (x[1]))       # rate -> int
    producer.send("TestTopic", key=b'sampwidth', value=b'%d' % (x[2]))  # sampwidth -> int
    send("TestTopic","bush_read")

For the consumer:

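    from kafka import KafkaConsumer

    consumer = KafkaConsumer("TestTopic", bootstrap_servers='localhost:9092')
    for message in consumer:
        msg = message     # I want something like this (data, rate, sampwidth are not real attributes)
        file = bytes2wav("name", msg.data, msg.rate, msg.sampwidth)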

Solution

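  • You can send it as JSON (or any other serialization format). Build an object like

    {'data': data, 'rate': rate, 'sampwidth': sampwidth}

    serialize it in the producer, and deserialize it in the consumer. A NumPy array is not JSON-serializable as-is, so convert data first, for example to a list or a base64 string.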
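
The single-message approach could look roughly like the sketch below. It is one possible layout, not the only one: the helper names `pack_wav` and `unpack_wav` and the extra `dtype` and `shape` fields are assumptions added for illustration, and base64 is used because raw sample bytes cannot be placed in JSON directly. The helpers only use the standard library, so they work on the raw buffer rather than on the array itself.

```python
import base64
import json


def pack_wav(raw: bytes, dtype: str, shape, rate: int, sampwidth: int) -> bytes:
    """Bundle the sample buffer and its metadata into one JSON-encoded value."""
    payload = {
        # Raw bytes are not valid JSON, so encode them as base64 text.
        "data": base64.b64encode(raw).decode("ascii"),
        "dtype": dtype,        # e.g. "int16"; needed to rebuild the array
        "shape": list(shape),  # e.g. [n_samples, n_channels]
        "rate": rate,
        "sampwidth": sampwidth,
    }
    return json.dumps(payload).encode("utf-8")


def unpack_wav(value: bytes) -> dict:
    """Inverse of pack_wav: returns the payload with "data" restored to bytes."""
    payload = json.loads(value.decode("utf-8"))
    payload["data"] = base64.b64decode(payload["data"])
    return payload


if __name__ == "__main__":
    # Stand-in for data.tobytes() of a 4-sample mono int16 array.
    raw = bytes(range(8))
    value = pack_wav(raw, "int16", (4, 1), rate=44100, sampwidth=2)
    restored = unpack_wav(value)
    assert restored["data"] == raw
```

On the producer side this would be used as `producer.send("TestTopic", value=pack_wav(data.tobytes(), str(data.dtype), data.shape, rate, sampwidth))`. On the consumer side, `msg = unpack_wav(message.value)` followed by `numpy.frombuffer(msg["data"], dtype=msg["dtype"]).reshape(msg["shape"])` should rebuild the array to pass to `bytes2wav`. Note that base64 grows the payload by about a third, and Kafka's default message size limits are around 1 MB, so longer recordings will likely need those limits raised on both the broker and the producer.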