I have a data source that produces points at a potentially high rate, and I'd like to perform a possibly time-consuming operation on each point; but I would also like the system to degrade gracefully when it becomes overloaded, by dropping excess data points.
As far as I can tell, a gen_event will never skip events. Conceptually, what I would like the gen_event to do is drop all but the latest pending events before running the handlers again.
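To illustrate what I mean, here is a rough sketch as a plain process rather than a gen_event (all the names are made up): it drains its mailbox, keeps only the newest point, and only then runs the expensive operation.
-module(latest_point).  %% name picked just for this sketch
-export([start_link/1]).

%% Spawn a process that, each time it wakes up, keeps only the newest
%% pending point and runs Handler on it.
start_link(Handler) ->
    {ok, spawn_link(fun() -> loop(Handler) end)}.

loop(Handler) ->
    receive
        {point, P} ->
            Latest = flush_older(P),  % drop all but the latest pending point
            Handler(Latest),          % the possibly time-consuming operation
            loop(Handler)
    end.

%% Discard every queued point except the most recent one.
flush_older(Latest) ->
    receive
        {point, P} -> flush_older(P)
    after 0 ->
        Latest
    end.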
Is there a way to do this with standard OTP, or is there a good reason why I should not handle things that way?
So far the best I have is using a gen_server and relying on the timeout to trigger the expensive events:
-module(point_server).  %% module name picked for this example
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

init(_Args) ->
    {ok, Pid} = gen_event:start_link(),
    {ok, {Pid, none}}.

handle_call({add, H, A}, _From, {Pid, Data}) ->
    {reply, gen_event:add_handler(Pid, H, A), {Pid, Data}}.

handle_cast(Data, {Pid, _OldData}) ->
    %% Remember only the newest point and ask for a 0 timeout; it only
    %% fires once the mailbox is empty, so older pending points are
    %% simply overwritten and dropped.
    {noreply, {Pid, Data}, 0}.

handle_info(timeout, {Pid, Data}) ->
    gen_event:sync_notify(Pid, Data),
    {noreply, {Pid, Data}}.
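For context, I use it roughly like this (the module name point_server and the handler my_handler are placeholders):
%% Rough usage sketch:
{ok, Server} = gen_server:start_link(point_server, [], []),
ok = gen_server:call(Server, {add, my_handler, []}),
%% A burst of casts: each cast overwrites the pending Data, and sync_notify
%% only runs once the mailbox is momentarily empty, so intermediate points
%% tend to be dropped.
[gen_server:cast(Server, {point, N}) || N <- lists:seq(1, 1000)].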
Is this approach correct? (especially with respect to supervision?)
Is there a way to do this with standard OTP?
No.
Is there a good reason why I should not handle things that way?
No; timing out early can increase the performance of the entire system. Read about how here.
Is this approach correct? (especially with respect to supervision?)
No idea, you haven't provided the supervision code.
As a bit of extra information regarding your first question:
If you can use third-party libraries outside of OTP, there are a few out there that can add preemptive timeouts, which is what you are describing.
There are two that I am familiar with: the first is dispcount, and the second is chick (I'm the author of chick; I'll try not to advertise the project here).
Dispcount works really well for single resources that can only run a limited number of jobs at the same time, and it does no queuing. You can read about it here (warning: lots of really interesting information!).
Dispcount didn't work for me because I would have had to spawn 4000+ pools of processes to handle the number of different queues inside my app. I wrote chick because I needed a way to dynamically increase and decrease my queue length, and to queue up some requests while denying others, without having to spawn 4000+ pools of processes.
If I were you I would try out dispcount first (as most solutions do not need chick), and then, if you need something a bit more dynamic than a pool that can serve a fixed number of requests, try out chick.
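To make the distinction concrete, here is a generic sketch of the load-shedding pattern such pools implement. This is not dispcount's or chick's actual API, and every name in it is made up: a fixed number of slots, and callers that immediately get {error, busy} instead of being queued.
-module(slot_pool).
-export([start_link/1, checkout/1, checkin/1]).

%% A pool with a fixed number of slots and no queue.
start_link(Slots) when Slots > 0 ->
    {ok, spawn_link(fun() -> loop(Slots) end)}.

checkout(Pool) ->
    Ref = make_ref(),
    Pool ! {checkout, self(), Ref},
    receive {Ref, Reply} -> Reply end.

checkin(Pool) ->
    Pool ! checkin,
    ok.

loop(Free) ->
    receive
        {checkout, From, Ref} when Free > 0 ->
            From ! {Ref, ok},
            loop(Free - 1);
        {checkout, From, Ref} ->
            From ! {Ref, {error, busy}},  % shed load instead of queuing
            loop(Free);
        checkin ->
            loop(Free + 1)
    end.
The real libraries are far more sophisticated, but this deny-instead-of-queue behaviour is the core of the graceful degradation being asked about.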