Tags: aggregate, rx-java, observable, dispatch

Aggregate resource requests & dispatch responses to each subscriber


I'm fairly new to RxJava and struggling with a use case that seems quite common to me:

Gather multiple requests from different parts of the application, aggregate them, make a single resource call and dispatch the results to each subscriber.

I've tried a lot of different approaches, using subjects, connectable observables, deferred observables... none did the trick so far.

I was quite optimistic about this approach, but it turns out it fails just like the others:

    //(...)
    static HashMap<String, String> requests = new HashMap<>();
    //(...)

    @Test
    public void myTest() throws InterruptedException {
        TestScheduler scheduler = new TestScheduler();
        Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
                .doOnSubscribe(() -> System.out.println("new subscriber!"))
                .doOnUnsubscribe(() -> System.out.println("unsubscribed"))
                .filter(l -> !requests.isEmpty())
                .doOnNext(aLong -> System.out.println(requests.size() + " requests to send"))
                .flatMap(aLong -> {
                    System.out.println("requests " + requests);
                    return Observable.from(requests.keySet()).take(10).distinct().toList();
                })
                .doOnNext(strings -> System.out.println("calling aggregate for " + strings + " (from " + requests + ")"))
                .flatMap(Observable::from)
                .doOnNext(s -> {
                    System.out.println("----");
                    System.out.println("removing " + s);
                    requests.remove(s);
                })
                .doOnNext(s -> System.out.println("remaining " + requests));

        TestSubscriber<String> ts1 = new TestSubscriber<>();
        TestSubscriber<String> ts2 = new TestSubscriber<>();
        TestSubscriber<String> ts3 = new TestSubscriber<>();
        TestSubscriber<String> ts4 = new TestSubscriber<>();

        Observable<String> defer = buildObservable(interval, "1");
        defer.subscribe(ts1);
        Observable<String> defer2 = buildObservable(interval, "2");
        defer2.subscribe(ts2);
        Observable<String> defer3 = buildObservable(interval, "3");
        defer3.subscribe(ts3);
        scheduler.advanceTimeBy(200, TimeUnit.MILLISECONDS);
        Observable<String> defer4 = buildObservable(interval, "4");
        defer4.subscribe(ts4);

        scheduler.advanceTimeBy(100, TimeUnit.MILLISECONDS);
        ts1.awaitTerminalEvent(1, TimeUnit.SECONDS);
        ts2.awaitTerminalEvent(1, TimeUnit.SECONDS);
        ts3.awaitTerminalEvent(1, TimeUnit.SECONDS);
        ts4.awaitTerminalEvent(1, TimeUnit.SECONDS);

        ts1.assertValue("1");
        ts2.assertValue("2"); //fails (test stops here)
        ts3.assertValue("3"); //fails
        ts4.assertValue("4"); //fails


    }

    public Observable<String> buildObservable(Observable<String> interval, String key) {

        return  Observable.defer(() -> {
                            System.out.printf("creating observable for key " + key);
                            return Observable.create(subscriber -> {
                                requests.put(key, "xxx");
                                interval.doOnNext(s -> System.out.println("filtering : key/val  " + key + "/" + s))
                                        .filter(s1 -> s1.equals(key))
                                        .doOnError(subscriber::onError)
                                        .subscribe(s -> {
                                            System.out.println("intern " + s);
                                            subscriber.onNext(s);
                                            subscriber.onCompleted();
                                            subscriber.unsubscribe();
                                        });
                            });
                        }
                )
                ;
    }

Output :

creating observable for key 1new subscriber!
creating observable for key 2new subscriber!
creating observable for key 3new subscriber!
3 requests to send
requests {3=xxx, 2=xxx, 1=xxx}
calling aggregate for [3, 2, 1] (from {3=xxx, 2=xxx, 1=xxx})
----
removing 3
remaining {2=xxx, 1=xxx}
filtering : key/val  1/3
----
removing 2
remaining {1=xxx}
filtering : key/val  1/2
----
removing 1
remaining {}
filtering : key/val  1/1
intern 1
creating observable for key 4new subscriber!
1 requests to send
requests {4=xxx}
calling aggregate for [4] (from {4=xxx})
----
removing 4
remaining {}
filtering : key/val  1/4

The test fails at the second assertion (ts2 does not receive "2"). It turns out the pseudo-aggregation works as expected, but the values are not dispatched to the corresponding subscribers (only the first subscriber receives its value).

Any idea why?

Also, I feel like I'm missing the obvious here. If you think of a better approach, I'm more than willing to hear about it.

EDIT : Adding some context regarding what I want to achieve.

I have a REST API exposing data via multiple endpoints (eg. user/{userid}). This API also makes it possible to aggregate requests (eg. user/user1 & user/user2) and get the corresponding data in one single http request instead of two.

My goal is to be able to automatically aggregate the requests made from different parts of my application in a given time frame (say 10ms) with a max batch size (say 10), make an aggregate http request, then dispatch the results to the corresponding subscribers.

Something like this :

// NOTE: those calls can be fired from anywhere in the app, and randomly combined. The timing and order is completely unpredictable

//ts : 0ms
api.call(userProfileRequest1).subscribe(this::show); 
api.call(userProfileRequest2).subscribe(this::show);

//--> after 10ms, should fire one single http aggregate request with those 2 calls, map the response items & send them to the corresponding subscribers (that will show the right user profile)

//ts : 20ms
api.call(userProfileRequest3).subscribe(this::show); 
api.call(userProfileRequest4).subscribe(this::show);
api.call(userProfileRequest5).subscribe(this::show); 
api.call(userProfileRequest6).subscribe(this::show);
api.call(userProfileRequest7).subscribe(this::show); 
api.call(userProfileRequest8).subscribe(this::show);
api.call(userProfileRequest9).subscribe(this::show); 
api.call(userProfileRequest10).subscribe(this::show);
api.call(userProfileRequest11).subscribe(this::show); 
api.call(userProfileRequest12).subscribe(this::show);

//--> should fire a single http aggregate request RIGHT AWAY (we hit the max batch size) with the 10 items, map the response items & send them to the corresponding subscribers (that will show the right user profile)   

The test code I wrote (with just strings) and pasted at the top of this question is meant to be a proof of concept for my final implementation.


Solution

  • Your Observable is not well constructed

        public Observable<String> buildObservable(Observable<String> interval, String key) {

            return interval.doOnSubscribe(() -> System.out.printf("creating observable for key " + key))
                           .doOnSubscribe(() -> requests.put(key, "xxx"))
                           .doOnNext(s -> System.out.println("filtering : key/val  " + key + "/" + s))
                           .filter(s1 -> s1.equals(key));
        }
    

    Subscribing inside a subscriber is often a sign of bad design.

    I'm not sure I understand what you want to achieve, but I think my code should be pretty close to yours.

    Please note that, for all side effects, I use the doOn* methods (like doOnNext, doOnSubscribe) to make it explicit that I want to perform a side effect.

    I replaced your defer call by returning the interval directly: since you want to emit all interval events from the custom observable built in your defer call, returning the interval observable itself is better.

    Please note that you are filtering your interval Observable:

    Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
                .filter(l -> !requests.isEmpty()).
                // ... 
    

    So the interval only emits downstream while the requests map contains something; as soon as the map is emptied, it stops emitting.

    I don't understand what you want to achieve with the requests map, but please note that you may want to avoid side effects, and updating this map is clearly a side effect.
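To illustrate why mutating the shared map is risky here: Observable.interval is cold, so each subscribe() starts its own copy of the whole chain, and every copy reads and empties the same static map. The following is only a plain-Java model of that behaviour, not RxJava code; the class and method names (SharedMapModel, tick, run) are made up for the illustration, and it assumes the three chains tick one after another, as they do on a TestScheduler.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model (no RxJava): every "subscription" runs its own copy of the
// chain, but all copies mutate the same static map.
class SharedMapModel {
    static final Map<String, String> requests = new LinkedHashMap<>();

    // One tick of one subscription's private chain; returns what that subscriber receives.
    static List<String> tick(String subscriberKey) {
        List<String> received = new ArrayList<>();
        if (requests.isEmpty()) {
            return received; // models filter(l -> !requests.isEmpty())
        }
        for (String key : new ArrayList<>(requests.keySet())) {
            requests.remove(key); // models doOnNext(s -> requests.remove(s))
            if (key.equals(subscriberKey)) { // models filter(s1 -> s1.equals(key))
                received.add(key);
            }
        }
        return received;
    }

    static String run() {
        requests.clear();
        requests.put("1", "xxx");
        requests.put("2", "xxx");
        requests.put("3", "xxx");
        // The three chains tick one after another at the same virtual time.
        return "1=" + tick("1") + " 2=" + tick("2") + " 3=" + tick("3");
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints: 1=[1] 2=[] 3=[]
    }
}
```

The first chain drains the map, so the chains belonging to subscribers 2 and 3 find it empty and never see their own key. This matches the output in the question, where only "filtering : key/val 1/..." lines appear.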

    Update regarding comments

    You may want to use the buffer operator to aggregate requests, and then perform the requests in bulk:

        PublishSubject<String> subject = PublishSubject.create();
    
    
        TestScheduler scheduler = new TestScheduler();
    
        Observable<Pair> broker = subject.buffer(100, TimeUnit.MILLISECONDS, 10, scheduler)
                                         .flatMapIterable(list -> list) // you can bulk calls here
                                         .flatMap(id -> Observable.fromCallable(() -> api.call(id)).map(response -> Pair.of(id, response)))
                                         .share(); // one shared pipeline, so api.call runs once per id instead of once per subscriber
    
        TestSubscriber<Object> ts1 = new TestSubscriber<>();
        TestSubscriber<Object> ts2 = new TestSubscriber<>();
        TestSubscriber<Object> ts3 = new TestSubscriber<>();
        TestSubscriber<Object> ts4 = new TestSubscriber<>();
    
        broker.filter(pair -> pair.id.equals("1")).take(1).map(pair -> pair.response).subscribe(ts1);
        broker.filter(pair -> pair.id.equals("2")).take(1).map(pair -> pair.response).subscribe(ts2);
        broker.filter(pair -> pair.id.equals("3")).take(1).map(pair -> pair.response).subscribe(ts3);
        broker.filter(pair -> pair.id.equals("4")).take(1).map(pair -> pair.response).subscribe(ts4);
    
        subject.onNext("1");
        subject.onNext("2");
        subject.onNext("3");
    
        scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
    
        ts1.assertValue("resp1");
        ts2.assertValue("resp2");
        ts3.assertValue("resp3");
        ts4.assertNotCompleted();
    
        subject.onNext("4");
        scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
        ts4.assertValue("resp4");
        ts4.assertCompleted();
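If you would rather hand each caller its own result instead of filtering a shared stream, the same idea (a time window plus a maximum batch size) can be sketched without RxJava, using only the JDK. This is a rough sketch under assumptions, not a definitive implementation: RequestBatcher, bulkCall and demo are hypothetical names, bulkCall stands in for the aggregate HTTP request, and the batch size of 2 and the 200 ms window are chosen only to keep the demo short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical collapser: one bulk call per time window, or as soon as a batch is full.
class RequestBatcher<K, V> {
    private final Function<List<K>, Map<K, V>> bulkCall; // stands in for the aggregate HTTP request
    private final int maxBatchSize;
    private final long windowMillis;
    private final ScheduledExecutorService timer;
    private final Map<K, CompletableFuture<V>> pending = new LinkedHashMap<>();
    private ScheduledFuture<?> scheduledFlush;

    RequestBatcher(Function<List<K>, Map<K, V>> bulkCall, int maxBatchSize,
                   long windowMillis, ScheduledExecutorService timer) {
        this.bulkCall = bulkCall;
        this.maxBatchSize = maxBatchSize;
        this.windowMillis = windowMillis;
        this.timer = timer;
    }

    // Registers a request; callers asking for the same key share one future.
    CompletableFuture<V> call(K key) {
        CompletableFuture<V> future;
        boolean full;
        synchronized (this) {
            future = pending.computeIfAbsent(key, k -> new CompletableFuture<>());
            full = pending.size() >= maxBatchSize;
            if (!full && scheduledFlush == null) {
                // First request of a new window: start the countdown.
                scheduledFlush = timer.schedule(this::flush, windowMillis, TimeUnit.MILLISECONDS);
            }
        }
        if (full) {
            flush(); // max batch size reached: send right away
        }
        return future;
    }

    // Sends everything pending in one bulk call and dispatches each result to its future.
    void flush() {
        Map<K, CompletableFuture<V>> batch;
        synchronized (this) {
            if (scheduledFlush != null) {
                scheduledFlush.cancel(false);
                scheduledFlush = null;
            }
            if (pending.isEmpty()) {
                return;
            }
            batch = new LinkedHashMap<>(pending);
            pending.clear();
        }
        try {
            Map<K, V> responses = bulkCall.apply(new ArrayList<>(batch.keySet()));
            // A key missing from the response completes its future with null.
            batch.forEach((key, future) -> future.complete(responses.get(key)));
        } catch (RuntimeException e) {
            batch.values().forEach(future -> future.completeExceptionally(e));
        }
    }

    // Small scenario: returns the three responses followed by the key lists of each bulk call.
    static String demo() {
        List<List<String>> bulkCalls = new CopyOnWriteArrayList<>();
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // do not keep the JVM alive
            return t;
        });
        RequestBatcher<String, String> api = new RequestBatcher<String, String>(keys -> {
            bulkCalls.add(keys);
            Map<String, String> out = new HashMap<>();
            for (String k : keys) {
                out.put(k, "resp" + k); // fake response, in the style of the test above
            }
            return out;
        }, 2, 200, timer);

        CompletableFuture<String> r1 = api.call("1");
        CompletableFuture<String> r2 = api.call("2"); // batch of 2 is full: sent immediately
        CompletableFuture<String> r3 = api.call("3"); // sent when the 200 ms window elapses
        String result = r1.join() + " " + r2.join() + " " + r3.join() + " " + bulkCalls;
        timer.shutdownNow();
        return result;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: resp1 resp2 resp3 [[1, 2], [3]]
    }
}
```

The bulk call runs outside the lock so that new requests can keep queuing while a batch is in flight. A full batch is flushed on the caller's thread, which keeps the sketch short; in a real client you would probably hand that off to an I/O executor.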
    

    If you want to perform network request collapsing, you may want to check Hystrix: https://github.com/Netflix/Hystrix