Spring webFlux differrences when Netty vs Tomcat is used under the hood

Currently there are 2 basic concepts to handle parallel access to a web-server with various advantages and disadvantages:

  1. Blocking
  2. Non-Blocking

Blocking Web-Servers

The first concept of blocking, multi-threaded server has a finite set amount of threads in a pool. Every request will get assigned to specific thread and this thread will be assigned until the request has been fully served. This is basically the same as how a the checkout queues in a super market works, a customer at a time with possible parallel lines. In most circumstances a request in a web server will be cpu-idle for the majority of the time while processing the request. This is due the fact that it has to wait for I/O: read the socket, write to the db (which is also basically IO) and read the result and write to the socket. Additionally using/creating a bunch of threads is slow (context switching) and requires a lot of memory. Therefore this concept often does not use the hardware resources it has very efficiently and has a hard limit on how many clients can be served in parallel. This property is misused in so called starvation attacks, e.g. the slow loris, an attack where usually a single client can DOS a big multi-threaded web-server with little effort.

Summary

  • (+) simpler code
  • (-) hard limit of parallel clients
  • (-) requires more memory
  • (-) inefficient use of hardware for usual web-server work
  • (-) easy to DOS

Most "conventional" web server work that way, e.g. older tomcat, Apache Webserver, and everything Servlet older than 3 or 3.1 etc.

Non-Blocking Web-Servers

In contrast a non-blocking web-server can serve multiple clients with only a single thread. That is because it uses the non-blocking kernel I/O features. These are just kernel calls which immediately return and call back when something can be written or read, making the cpu free to do other work instead. Reusing our supermarket metaphor, this would be like, when a cashier needs his supervisor to solve a problem, he does not wait and block the whole lane, but starts to check out the next customer until the supervisor arrives and solves the problem of the first customer.

This is often done in an event loop or higher abstractions as green-threads or fibers. In essence such servers can't really process anything concurrently (of course you can have multiple non-blocking threads), but they are able to serve thousands of clients in parallel because the memory consumption will not scale as drastically as with the multi-thread concept (read: there is no hard limit on max parallel clients). Also there is no thread context-switching. The downside is, that non-blocking code is often more complex to read and write (e.g. callback-hell) and doesn't prefrom well in situations where a request does a lot of cpu-expensive work.

Summary

  • (-) more complex code
  • (-) performance worse with cpu intensive tasks
  • (+) uses resources much more efficiently as web server
  • (+) many more parallel clients with no hard-limit (except max memory)

Most modern "fast" web-servers and framework facilitate non-blocking concepts: Netty, Vert.x, Webflux, nginx, servlet 3.1+, Node, Go Webservers.

As a side note, looking at this benchmark page you will see that most of the fastest web-servers are usually non-blocking ones: https://www.techempower.com/benchmarks/


See also

  • https://stackoverflow.com/a/21155697/774398
  • https://www.quora.com/What-exactly-does-it-mean-for-a-web-server-to-be-blocking-versus-non-blocking

When using Servlet 2.5, Servlet containers will assign a request to a thread until that request has been fully processed.

When using Servlet 3.0 async processing, the server can dispatch the request processing in a separate thread pool while the request is being processed by the application. However, when it comes to I/O, work always happens on a server thread and it is always blocking. This means that a "slow client" can monopolize a server thread, since the server is blocked while reading/writing to that client with a poor network connection.

With Servlet 3.1, async I/O is allowed and in that case the "one request/thread" model isn't anymore. At any point a bit request processing can be scheduled on a different thread managed by the server.

Servlet 3.1+ containers offer all those possibilities with the Servlet API. It's up to the application to leverage async processing, or non-blocking I/O. In the case of non-blocking I/O, the paradigm change is important and it's really challenging to use.

With Spring WebFlux - Tomcat, Jetty and Netty don't have the exact same runtime model, but they all support reactive backpressure and non-blocking I/O.