Flink 'timeWindow' operation not generating output for PopularPlacesFromKafka example file

I'm going through the Flink tutorial materials from dataArtisans, and when I run the sample file PopularPlacesFromKafka.scala, nothing is printed to stdout.

// find popular places
val popularSpots = rides
  // match ride to grid cell and event type (start or end)
  .map(new GridCellMatcher)
  // partition by cell id and event type
  .keyBy( k => k )
  // build sliding window
  .timeWindow(Time.minutes(15), Time.minutes(5))
  // count events in window
  .apply{ (key: (Int, Boolean), window, vals, out: Collector[(Int, Long, Boolean, Int)]) =>
    out.collect( (key._1, window.getEnd, key._2, vals.size) )
  }

// print result on stdout
popularSpots.print()

I've confirmed that data is being pulled from Kafka correctly, and the problem seems to be in the 'timeWindow' operation: with it in place I get no output, but if I remove it I can see the 'keyBy' data being printed. Is there something obvious that I'm missing?


  • In case anyone has this same issue, this was my problem.

    My Kafka topic had multiple partitions, but I was producing all of the test data to a single partition (0). Once I had more than one Kafka consumer, every consumer except the one assigned to partition 0 received no data and therefore sent no watermarks down the operator chain. A downstream operator only advances its event-time clock to the lowest watermark across its inputs, so the window functions stopped emitting data. (That's also why it works fine with ProcessingTime in those situations.) Here's a relevant JIRA about it:
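
To make the watermark stall described in this answer concrete, here is a minimal sketch in plain Scala. It is a simplified model, not Flink's actual implementation: the names WatermarkStall, combinedWatermark and windowCanFire are hypothetical, the timestamps are made up, and an input that has never sent a watermark is modelled as Long.MinValue.

```scala
// Simplified model (not Flink code): an operator with several inputs
// tracks event time as the minimum watermark seen across those inputs.
object WatermarkStall {
  // Assumption: an input that never emitted a watermark sits at Long.MinValue.
  val NoWatermark: Long = Long.MinValue

  // Combined watermark of an operator = minimum over all of its inputs.
  def combinedWatermark(inputs: Seq[Long]): Long = inputs.min

  // An event-time window [start, end) can fire once the watermark
  // reaches its last timestamp, end - 1.
  def windowCanFire(windowEnd: Long, watermark: Long): Boolean =
    watermark >= windowEnd - 1

  def main(args: Array[String]): Unit = {
    val windowEnd = 900000L // hypothetical 15-minute window, in milliseconds

    // Consumer 0 reads partition 0 and has advanced to t = 1000000 ms;
    // consumers 1 and 2 read empty partitions and never send a watermark.
    val stalled = combinedWatermark(Seq(1000000L, NoWatermark, NoWatermark))
    println(s"idle consumers present: fires = ${windowCanFire(windowEnd, stalled)}")
    // prints "idle consumers present: fires = false"

    // With data on every partition, all consumers advance and the window fires.
    val healthy = combinedWatermark(Seq(1000000L, 950000L, 980000L))
    println(s"all consumers active: fires = ${windowCanFire(windowEnd, healthy)}")
    // prints "all consumers active: fires = true"
  }
}
```

In this model the single busy consumer cannot move the window forward on its own, which matches the symptom above: either spread the test data across all partitions or run with no more consumers than there are partitions carrying data.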