Currently the MetricsEventSource uses locks or interlocked instructions to synchronize multiple threads recording data points, and this introduces contention. The goal is to eliminate the overhead of this contention as much as possible, ideally to the point that multi-threaded and single-threaded scenarios look identical. It's likely that reducing contention will require increasing memory footprint, decreasing measurement precision, or both. Depending on the difference in behavior we can decide whether it is acceptable to make this the new default or whether we need settings that folks opt into.
This issue shows the problem for counters. Some work was done in .NET 8 to improve it, but I think we are still wasting 7/8ths of each cache line, so there is definite room to do better within the same memory footprint.
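For illustration, here is a minimal sketch of the kind of cache-line-padded striping involved (the `PaddedSlot`/`StripedCounter` names are hypothetical, not the runtime's actual types): each thread writes its own 64-byte slot so writers don't contend or false-share, but only 8 of every 64 bytes hold data, which is the 7/8ths waste mentioned above.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical sketch, not the runtime's actual implementation.
// Each slot is padded to a full 64-byte cache line so two threads
// never write the same line; note only 8 of the 64 bytes hold data.
[StructLayout(LayoutKind.Explicit, Size = 64)]
struct PaddedSlot
{
    [FieldOffset(0)] public long Value; // remaining 56 bytes are padding
}

sealed class StripedCounter
{
    private readonly PaddedSlot[] _slots;

    public StripedCounter(int stripes) => _slots = new PaddedSlot[stripes];

    public void Add(long amount)
    {
        // Hash the thread onto a stripe; uncontended in the common case.
        int i = Environment.CurrentManagedThreadId % _slots.Length;
        Interlocked.Add(ref _slots[i].Value, amount);
    }

    public long Read()
    {
        long sum = 0;
        for (int i = 0; i < _slots.Length; i++)
            sum += Volatile.Read(ref _slots[i].Value);
        return sum;
    }
}
```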
For histograms, quick benchmarking with 10 threads each issuing 100M updates shows this performance:
- no listener: 1 sec
- dotnet-counters listening: 40 sec
- dotnet-counters listening + histogram dimension keyed on current thread id: 6 sec
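A rough sketch of what that benchmark shape might look like (assumed names and structure, not the original harness; dotnet-counters is attached externally for the listening variants):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Diagnostics.Metrics;
using System.Threading;

// Assumed harness: 10 threads each record 100M values into one histogram.
var meter = new Meter("ContentionBench");
var histogram = meter.CreateHistogram<double>("latency");

const int Threads = 10;
const int UpdatesPerThread = 100_000_000;

var sw = Stopwatch.StartNew();
var workers = new Thread[Threads];
for (int t = 0; t < Threads; t++)
{
    workers[t] = new Thread(() =>
    {
        // Third variant above: tagging with the current thread id gives each
        // thread its own aggregator, which sidesteps the shared lock.
        var tid = new KeyValuePair<string, object?>("tid", Environment.CurrentManagedThreadId);
        for (int i = 0; i < UpdatesPerThread; i++)
            histogram.Record(i & 1023, tid);
    });
    workers[t].Start();
}
foreach (var w in workers) w.Join();
Console.WriteLine($"Elapsed: {sw.Elapsed}");
```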
Adding a thread id dimension causes the aggregator to track a unique histogram per thread, which eliminates the lock contention and reveals how sizable the perf overhead is. To resolve this I'm thinking we should move to per-thread histogram buckets, but we probably need some compensating changes to ensure memory usage doesn't go really high. Right now histograms use a minimum of 32KB each for the first-level bucketing array, and I think each second-level array has 2^14 entries (64KB) to maintain a < 1:10,000 error bound. We could split the first level's 4096 entries into two levels of 64 entries each and lower the default error bound to something like 1:100, which should bring the footprint down to a few KB per thread.
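A sketch of what that could look like (my rough reading of the proposal, with assumed constants and names, not a committed design): per-thread state so increments need no locks, the 4096-entry first level split into 64 x 64 lazily-allocated slices, and ~128 linear sub-buckets per power of two for a ~1:128 error bound instead of 2^14 for 1:10,000.

```csharp
#nullable enable
using System;

sealed class ThreadLocalHistogram
{
    const int TopBits = 6;      // 64 top-level entries
    const int MidBits = 6;      // 64 mid-level entries (64 * 64 = 4096 exponents)
    const int SubBuckets = 128; // ~1:128 relative error, vs 2^14 entries for 1:10,000

    // Only the slices a thread actually touches get allocated,
    // so a thread hitting a few exponent ranges stays at a few KB.
    readonly int[][][] _counts = new int[1 << TopBits][][];

    [ThreadStatic] static ThreadLocalHistogram? t_instance;

    public static void Record(double value)
    {
        var h = t_instance ??= new ThreadLocalHistogram(); // one histogram per thread: no locks

        // Decompose the double: the biased exponent picks the coarse bucket,
        // the high mantissa bits pick the linear sub-bucket within that power of two.
        long bits = BitConverter.DoubleToInt64Bits(value);
        int exp = (int)((bits >> 52) & 0x7FF);       // 0..2047
        int top = exp >> MidBits;                    // which 64-entry slice
        int mid = exp & ((1 << MidBits) - 1);
        int sub = (int)((bits >> 45) & (SubBuckets - 1)); // top 7 mantissa bits

        var slice = h._counts[top] ??= new int[1 << MidBits][];
        var buckets = slice[mid] ??= new int[SubBuckets];
        buckets[sub]++;                              // plain increment; thread-owned
    }
}
```

A reader/collector would still need to snapshot and merge the per-thread arrays on each collection interval, which is the usual trade: cheap writes, more expensive reads.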
noahfalk changed the title from "[Feature Request] Faster histogram aggregation for the MetricsEventSource" to "[Feature Request] Faster aggregation for the MetricsEventSource under concurrency" on Jan 11, 2024.