publishOn vs subscribeOn in Project Reactor 3
It took me some time to understand this, maybe because publishOn is usually explained before subscribeOn. Here's a hopefully simpler layman's explanation.
subscribeOn means running the initial source emission, e.g. subscribe(), onSubscribe() and request(), on a worker of the specified scheduler (another thread). The same goes for any subsequent operations such as onNext/onError/onComplete, map, etc. This happens no matter where subscribeOn() is placed in the chain.
If you don't call publishOn anywhere in the fluent chain, that's it: everything runs on that thread.
But as soon as you call publishOn(), say in the middle of the chain, every subsequent operator runs on a worker of the scheduler supplied to that publishOn().
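Here's an example:
import java.util.function.Consumer;
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

// Prints each element together with the name of the thread that handled it.
Consumer<Integer> consumer = s -> System.out.println(s + " : " + Thread.currentThread().getName());
// Note: Schedulers.newElastic is deprecated in later Reactor 3 releases in favor of newBoundedElastic.
Flux.range(1, 5)
    .doOnNext(consumer)
    .map(i -> {
        System.out.println("Inside map the thread is " + Thread.currentThread().getName());
        return i * 10;
    })
    .publishOn(Schedulers.newElastic("First_PublishOn()_thread"))
    .doOnNext(consumer)
    .publishOn(Schedulers.newElastic("Second_PublishOn()_thread"))
    .doOnNext(consumer)
    .subscribeOn(Schedulers.newElastic("subscribeOn_thread"))
    .subscribe();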
The result would be something like this (the thread numbers and the interleaving can differ between runs):
1 : subscribeOn_thread-4
Inside map the thread is subscribeOn_thread-4
2 : subscribeOn_thread-4
Inside map the thread is subscribeOn_thread-4
10 : First_PublishOn()_thread-6
3 : subscribeOn_thread-4
Inside map the thread is subscribeOn_thread-4
20 : First_PublishOn()_thread-6
4 : subscribeOn_thread-4
10 : Second_PublishOn()_thread-5
30 : First_PublishOn()_thread-6
20 : Second_PublishOn()_thread-5
Inside map the thread is subscribeOn_thread-4
30 : Second_PublishOn()_thread-5
5 : subscribeOn_thread-4
40 : First_PublishOn()_thread-6
Inside map the thread is subscribeOn_thread-4
40 : Second_PublishOn()_thread-5
50 : First_PublishOn()_thread-6
50 : Second_PublishOn()_thread-5
As you can see, the first doOnNext() and the following map() run on the thread called subscribeOn_thread. That holds until a publishOn() is called; from then on, every subsequent call runs on the scheduler supplied to that publishOn(), and this again continues until another publishOn() is called.
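To check the claim that the position of subscribeOn() does not matter, here is a minimal sketch (not part of the original example). The class SubscribeOnPosition, the methods threadName and mapThreads, and the scheduler name demo-subscribeOn are all made up for illustration. It assumes reactor-core is on the classpath and uses Schedulers.newSingle only so that the scheduler is easy to dispose afterwards.

```java
import java.util.List;
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Scheduler;
import reactor.core.scheduler.Schedulers;

public class SubscribeOnPosition {

    // Hypothetical helper: ignores the element and reports the current thread.
    private static String threadName(Integer ignored) {
        return Thread.currentThread().getName();
    }

    // Collects the thread names seen inside map(), with subscribeOn()
    // placed either before or after the map() call.
    public static List<String> mapThreads(boolean subscribeOnBeforeMap) {
        Scheduler scheduler = Schedulers.newSingle("demo-subscribeOn");
        try {
            Flux<String> chain;
            if (subscribeOnBeforeMap) {
                chain = Flux.range(1, 3)
                        .subscribeOn(scheduler)
                        .map(SubscribeOnPosition::threadName);
            } else {
                chain = Flux.range(1, 3)
                        .map(SubscribeOnPosition::threadName)
                        .subscribeOn(scheduler);
            }
            // block() is acceptable here because the caller is a plain, blocking thread.
            return chain.collectList().block();
        } finally {
            // newSingle creates non-daemon threads, so release them explicitly.
            scheduler.dispose();
        }
    }

    public static void main(String[] args) {
        System.out.println("subscribeOn before map: " + mapThreads(true));
        System.out.println("subscribeOn after map:  " + mapThreads(false));
    }
}
```

In both placements every recorded name should start with demo-subscribeOn, since there is no publishOn() in the chain to move the signals elsewhere.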
The following is an excerpt from the excellent blog post https://spring.io/blog/2019/12/13/flight-of-the-flux-3-hopping-threads-and-schedulers:
publishOn
This is the basic operator you need when you want to hop threads. Incoming signals from its source are published on the given Scheduler, effectively switching threads to one of that scheduler’s workers.
This is valid for the onNext, onComplete and onError signals. That is, signals that flow from an upstream source to a downstream subscriber.
So in essence, every processing step that appears below this operator will execute on the new Scheduler s, until another operator switches again (e.g. another publishOn).
Flux.fromIterable(firstListOfUrls) // contains A, B and C
    .publishOn(Schedulers.boundedElastic())
    .map(url -> blockingWebClient.get(url))
    .subscribe(body -> System.out.println(Thread.currentThread().getName() + " from first list, got " + body));
Flux.fromIterable(secondListOfUrls) // contains D and E
    .publishOn(Schedulers.boundedElastic())
    .map(url -> blockingWebClient.get(url))
    .subscribe(body -> System.out.println(Thread.currentThread().getName() + " from second list, got " + body));
Output
boundedElastic-1 from first list, got A
boundedElastic-2 from second list, got D
boundedElastic-1 from first list, got B
boundedElastic-2 from second list, got E
boundedElastic-1 from first list, got C
subscribeOn
This operator changes where the subscribe method is executed. And since the subscribe signal flows upward, it directly influences where the source Flux subscribes and starts generating data.
As a consequence, it can seem to act on the parts of the reactive chain of operators upward and downward (as long as there is no publishOn thrown in the mix):
final Flux<String> fetchUrls(List<String> urls) {
    return Flux.fromIterable(urls)
        .map(url -> blockingWebClient.get(url));
}
// sample code:
fetchUrls(firstListOfUrls) // contains A, B and C
    .subscribeOn(Schedulers.boundedElastic())
    .subscribe(body -> System.out.println(Thread.currentThread().getName() + " from first list, got " + body));
fetchUrls(secondListOfUrls) // contains D and E
    .subscribeOn(Schedulers.boundedElastic())
    .subscribe(body -> System.out.println(Thread.currentThread().getName() + " from second list, got " + body));
Output
boundedElastic-1 from first list, got A
boundedElastic-2 from second list, got D
boundedElastic-1 from first list, got B
boundedElastic-2 from second list, got E
boundedElastic-1 from first list, got C
Here is a short piece of documentation that I found:
publishOn applies in the same way as any other operator, in the middle of the subscriber chain. It takes signals from upstream and replays them downstream while executing the callback on a worker from the associated Scheduler. Consequently, it affects where the subsequent operators will execute (until another publishOn is chained in).
subscribeOn applies to the subscription process, when that backward chain is constructed. As a consequence, no matter where you place the subscribeOn in the chain, it always affects the context of the source emission. However, this does not affect the behavior of subsequent calls to publishOn. They still switch the execution context for the part of the chain after them.
and
publishOn forces the next operator (and possibly subsequent operators after the next one) to run on a different thread. Similarly, subscribeOn forces the previous operator (and possibly operators prior to the previous one) to run on a different thread.
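To see that publishOn only affects what comes after it, here is a small sketch of a chain with publishOn and no subscribeOn at all. The class PublishOnOnly, the method threadsAroundPublishOn and the scheduler name demo-publishOn are hypothetical names chosen for this illustration, and reactor-core is assumed to be on the classpath. The expectation, under those assumptions, is that the step above publishOn runs on whichever thread subscribed (here the caller, via block()), while the step below it runs on the scheduler's worker.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Scheduler;
import reactor.core.scheduler.Schedulers;

public class PublishOnOnly {

    // Returns { thread seen above publishOn, thread seen below publishOn }.
    public static String[] threadsAroundPublishOn() {
        Scheduler scheduler = Schedulers.newSingle("demo-publishOn");
        // Written from the emitting thread, read by the caller afterwards.
        List<String> above = new CopyOnWriteArrayList<>();
        try {
            List<String> below = Flux.range(1, 3)
                    // Above publishOn: no scheduler has been applied yet.
                    .doOnNext(i -> above.add(Thread.currentThread().getName()))
                    .publishOn(scheduler)
                    // Below publishOn: signals arrive on the scheduler's worker.
                    .map(i -> Thread.currentThread().getName())
                    .collectList()
                    .block();
            return new String[] { above.get(0), below.get(0) };
        } finally {
            // newSingle creates non-daemon threads, so release them explicitly.
            scheduler.dispose();
        }
    }

    public static void main(String[] args) {
        String[] threads = threadsAroundPublishOn();
        System.out.println("above publishOn: " + threads[0]);
        System.out.println("below publishOn: " + threads[1]);
    }
}
```

Adding a subscribeOn() anywhere in this chain would move the "above" step onto that scheduler instead of the calling thread, while the "below" step would stay on the publishOn scheduler.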