The test does compare p95 and p99 as well; I just didn’t plot those series on the charts. You can find the percentiles in the raw output.
I focused on the max. pause alone mainly to show that its duration is O(liveObjectCount) for .NET and O(1) for Go. In other words, the question is whether the pause scales with your working set size or not.
As for other kinds of pauses and possible errors: some of these machines had lots of CPU cores, so it’s highly unlikely that a prolonged pause happening on all application threads simultaneously is caused by the OS or anything other than the GC, assuming no other CPU-consuming tasks are running. Note that we treat something as a GC pause only if all the threads are stopped. To check that, we capture the pause coverage (a set of intervals recognized as pauses) in every thread and intersect them all.
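
To make the intersection step concrete, here is a minimal sketch in Python. It is not the benchmark's actual code: the names `intersect_two` and `global_pauses` are made up for illustration, and it assumes each thread's pause intervals are given as `(start, end)` pairs that are already sorted and non-overlapping.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) of a pause seen by one thread


def intersect_two(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Intersect two sorted, non-overlapping interval lists."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        # The overlap of the two current intervals, if there is any.
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def global_pauses(per_thread: List[List[Interval]]) -> List[Interval]:
    """Keep only the intervals during which every thread was paused."""
    if not per_thread:
        return []
    common = per_thread[0]
    for intervals in per_thread[1:]:
        common = intersect_two(common, intervals)
    return common


# Hypothetical pause coverage for three threads (arbitrary time units).
thread_a = [(10, 25), (40, 70)]
thread_b = [(12, 30), (45, 60), (65, 80)]
thread_c = [(0, 20), (50, 90)]

pauses = global_pauses([thread_a, thread_b, thread_c])
longest = max(end - start for start, end in pauses)
```

With these made-up numbers, only the stretches where all three threads are stopped survive, so a stall that hits a single thread (say, a scheduling hiccup on one core) drops out and doesn't count toward the max. pause.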