
NEsper memory usage of "output" keyword


I have many EPL statements that output once per time period (anywhere from 1 to 24 hours). Here is one of them:

"SELECT MessageID, VName, count(VName) as count FROM DDIEvent(MajorType=4).std:groupwin(VName).win:time(3 hour).win:length(10) group by VName having count(VName) >= 10 output last every 3 hour"

Without the length window, my test case retains around 700K records in 3 hours.
My test case has only 100 distinct VName values. As I understand it, at most 1,000 records (100 groups * 10 per length window) should be kept in memory at any one time. Am I right?

However, my application's memory usage keeps growing until the statement outputs to the listener.
The memory usage is almost the same as in the test without the length window.
After the output is delivered to the listener, memory usage drops significantly.
Then another cycle begins, and memory grows slowly again until the next output 3 hours later.
I have checked the documentation but could not find anything about the memory behavior of the "output" keyword.
Does anyone know the real root cause and how to avoid it? Or is it just a problem with my EPL?

Thank you~


Solution

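  • If you remove "MessageID" from the select clause, the query becomes a regular aggregation query. You could select "last(MessageID)" instead. Because "MessageID" is in the select clause without being aggregated or listed in the group-by clause, the engine delivers a row for each event rather than a row for each aggregation group. With "output last every 3 hour", those per-event rows are presumably held until the output fires, which would explain the memory growth between outputs.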
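
As a sketch of that suggestion, the statement from the question could be rewritten as below. The only change is wrapping the MessageID column in last(...) so that every selected column is either aggregated or part of the group-by clause. The "as MessageID" alias is an assumption added here so the listener still sees the same property name; this has not been run against the asker's event type, so treat it as a starting point rather than a verified fix.

```
SELECT last(MessageID) as MessageID, VName, count(VName) as count
FROM DDIEvent(MajorType=4).std:groupwin(VName).win:time(3 hour).win:length(10)
GROUP BY VName
HAVING count(VName) >= 10
OUTPUT LAST EVERY 3 hour
```

With this form the engine should produce one row per VName group at each output, so for 100 distinct VName values the listener would receive at most 100 rows every 3 hours.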