apache-kafka netflix

What are the messages in Apache Kafka?


I was going through a tutorial on Apache Kafka. It said that Netflix has 4000 brokers across 36 clusters, processing over 700 billion messages per day.

What can these messages refer to in the context of Netflix?


Solution

  • The core abstraction Kafka provides for a stream of records is the topic, and a message is a single record in a topic. You can think of topics as the tables in a database: a database (Kafka) can have multiple tables (topics). As in a database, a topic can hold any kind of record, depending on the use case.

    For Netflix in particular, there might be a topic users that contains the users of the platform:

    {"userId":"1", "firstName":"Giorgos", "lastName":"Myrianthous"}
    

    or a topic movies that contains movies' details:

    {"movieID":"1", "title":"Titanic", "genre":"drama", "rating":"5"}
    

    Other topics might carry data that feeds internal analytics or business intelligence tools, machine learning algorithms (such as recommendation engines) or alerting mechanisms.

    Records within a topic can be serialized in various formats, such as plain strings, JSON or Avro.
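
    To make the idea of a message concrete: Kafka stores each message's key and value as raw bytes, so a producer has to serialize the record before sending it. Below is a minimal sketch of that step using JSON and only the Python standard library. The helper names to_kafka_message and from_kafka_message are hypothetical, and using userId as the message key is an assumption for illustration. Actually sending the bytes to a broker would need a Kafka client library, which is not shown here.

```python
import json
from typing import Tuple


def to_kafka_message(key: str, record: dict) -> Tuple[bytes, bytes]:
    """Serialize a record into the (key, value) byte pair a message carries.

    Hypothetical helper: a real producer would hand these bytes to a client library.
    """
    key_bytes = key.encode("utf-8")
    value_bytes = json.dumps(record).encode("utf-8")
    return key_bytes, value_bytes


def from_kafka_message(value_bytes: bytes) -> dict:
    """Deserialize a message value back into a record, as a consumer would."""
    return json.loads(value_bytes.decode("utf-8"))


# The example record from the hypothetical "users" topic.
user = {"userId": "1", "firstName": "Giorgos", "lastName": "Myrianthous"}

# Assumption: the user ID is used as the message key.
key, value = to_kafka_message(user["userId"], user)

# A consumer reading the topic reverses the serialization.
decoded = from_kafka_message(value)
```

    Swapping JSON for Avro or plain strings would change only the two helper functions; the topic itself does not care which format the bytes encode.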