What is P99 latency?

It's 99th percentile. It means that 99% of the requests should be faster than given latency. In other words only 1% of the requests are allowed to be slower.


Imagine that you are collecting performance data of your service and the below table is the collection of results (the latency values are fictional to illustrate the idea).

Latency    Number of requests
1s         5
2s         5
3s         10
4s         40
5s         20
6s         15
7s         4
8s         1

The P99 latency of your service is 7s. Only 1% of the requests take longer than that. So, if you can decrease the P99 latency of your service, you increase its performance.


We can explain it through an analogy, if 100 students are running a race then 99 students should complete the race in "latency" time.