v2.3.0: hdr histograms, performance (and a very small breaking change)
This release is an effort to improve performance by introducing hdr histograms (forked from https://github.com/HdrHistogram/HdrHistogram_c) and optimising internal timer aggregation and histogram machinery.
In production at Badoo we're currently seeing up to 5 million timers/sec (each with 10 to 30 tags) per instance, and some heavily loaded reports (the very non-specific ones that aggregate almost the entire stream) are hitting the 100% cpu mark. This release aims to improve that.
Breaking change
- added the 'timers_skipped_by_bloom' field to the 'active' report. Breaking, since it was added 'in the middle', right after the 'timers_aggregated' field.
Release highlights
- histograms now use hdr_histogram-like machinery internally
- percentiles (at the high end of the histogram interval) become coarser and might shift slightly
- percentiles (at the low end of the histogram interval) become more precise
- histograms will use slightly more memory on average (if your workload is anything like ours)
- performance should improve in most cases
- it's now possible and feasible to have histograms with 1 microsecond resolution (nice if you measure short on-cpu functions, for example); they'll use more cpu (~2x for 1us vs 1ms histograms)
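To illustrate the precision tradeoff mentioned above, here is a minimal sketch of hdr-style bucketing. All parameters and names here are illustrative assumptions (16 linear sub-buckets per power-of-two range, a hypothetical `bucket_width_for` helper), not the actual internals: the point is that sub-bucket width doubles as values grow, so absolute precision gets coarser toward the top of the recorded range while relative precision stays roughly constant.

```cpp
#include <cstdint>

// Illustrative hdr-style bucketing, NOT the actual internals:
// values are grouped into power-of-two ranges, each split into a
// fixed number of linear sub-buckets (16 here). Relative error is
// therefore bounded by ~1/16, but the absolute width of a bucket
// doubles every time the value range doubles.
static const int64_t sub_bucket_count = 16;

// width (in value units) of the sub-bucket that `value` falls into
// (hypothetical helper, for illustration only)
inline int64_t bucket_width_for(int64_t value)
{
    int shift = 0;
    while ((value >> shift) >= sub_bucket_count)
        ++shift;
    return int64_t{1} << shift;
}
```

With microsecond values, `bucket_width_for(10)` is 1us, `bucket_width_for(100)` is 8us, and `bucket_width_for(100000)` is 8192us: a percentile landing near the top of the range is resolved to within ~0.7%, but in much wider absolute steps, which is why high-end percentiles can shift slightly versus the old machinery.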
- performance enhancements
- 'request' reports are now significantly faster (and use less memory) in both aggregation and selects (they were converted to be very similar to timer reports internally). Use case: aggregating stats from nginx (response codes, etc.)
- added bloom filters for individual timers in the packet (fast skip for timers that the report is definitely not interested in)
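The bloom fast-skip idea can be sketched roughly as follows. Everything here is a hypothetical illustration (the `tag_bloom_t` name, the 64-bit filter size, and the hash constants are invented for this example, not the actual implementation): each timer carries a tiny bloom filter over its tag name ids, and a report probes the filter for the tags it aggregates by, skipping the timer entirely when they're definitely absent.

```cpp
#include <cstdint>

// Hypothetical sketch of a per-timer bloom filter; names, sizes and
// hash constants are illustrative only. A report checks its required
// tag ids against the filter and skips the timer without unpacking it
// when they're definitely not there.
struct tag_bloom_t
{
    uint64_t bits = 0;

    // two cheap hash probes per tag id (arbitrary constants, illustration only)
    static uint64_t mask_for(uint32_t tag_id)
    {
        uint64_t h1 = (tag_id * 2654435761u) % 64;
        uint64_t h2 = (tag_id * 40503u + 1) % 64;
        return (uint64_t{1} << h1) | (uint64_t{1} << h2);
    }

    void add(uint32_t tag_id) { bits |= mask_for(tag_id); }

    // false => tag definitely absent, timer can be skipped cheaply
    // true  => tag may be present (rare false positives), do the full check
    bool maybe_contains(uint32_t tag_id) const
    {
        uint64_t const m = mask_for(tag_id);
        return (bits & m) == m;
    }
};
```

The design win is that `maybe_contains` never returns a false negative, so skipping on `false` is always safe; occasional false positives just fall through to the normal (slower) matching path.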
- coordinator thread now uses considerably fewer resources (5M timers/sec are transcoded into the internal format using ~1 cpu core)