[Query Engine] Improve DistinctCountSmartHLL for dictionary-encoded columns #17411
Conversation
Codecov Report (Coverage Diff):
## master #17411 +/- ##
============================================
+ Coverage 63.26% 63.28% +0.02%
Complexity 1474 1474
============================================
Files 3152 3162 +10
Lines 187881 188725 +844
Branches 28765 28881 +116
============================================
+ Hits 118855 119433 +578
- Misses 59810 60039 +229
- Partials 9216 9253 +37
Pull request overview
This PR introduces an adaptive cardinality-aware execution path for DISTINCT_COUNT_SMART_HLL aggregation on dictionary-encoded columns. For high-cardinality scenarios (>100K distinct values), bypassing RoaringBitmap and directly updating HLL reduces server-side CPU time by 4x-10x. A new dictThreshold parameter (default: 100K) controls when to convert from RoaringBitmap to HLL during aggregation instead of waiting until finalization.
Key changes:
- Added adaptive conversion logic that switches from RoaringBitmap to HLL when cardinality exceeds the threshold
- Introduced `dictThreshold` parameter with a 100K default based on benchmark results
- Implemented early conversion checks during aggregation for both group-by and non-group-by queries
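The adaptive switch described above can be sketched roughly as follows. This is an illustrative model only, not Pinot's implementation: `BitSet` stands in for RoaringBitmap, an exact `HashSet` stands in for the (approximate) HLL, and all names here (`AdaptiveSketch`, `add`, the constructor's threshold) are hypothetical.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of cardinality-aware conversion; not Pinot's actual classes.
public class AdaptiveSketch {
  private final int _dictThreshold;  // mirrors the dictThreshold parameter (default 100_000)
  private BitSet _bitmap = new BitSet();  // stand-in for RoaringBitmap
  private Set<Integer> _hll;              // stand-in for HLL; null until converted

  public AdaptiveSketch(int dictThreshold) {
    _dictThreshold = dictThreshold;
  }

  public void add(int dictId) {
    if (_hll != null) {
      _hll.add(dictId);  // already converted: update the sketch directly
      return;
    }
    _bitmap.set(dictId);
    // Early conversion: switch during aggregation instead of waiting for finalization.
    if (_bitmap.cardinality() > _dictThreshold) {
      _hll = new HashSet<>();
      for (int id = _bitmap.nextSetBit(0); id >= 0; id = _bitmap.nextSetBit(id + 1)) {
        _hll.add(id);
      }
      _bitmap = null;  // drop the bitmap; no further per-row bitmap maintenance
    }
  }

  public long cardinality() {
    return _hll != null ? _hll.size() : _bitmap.cardinality();
  }
}
```

Once converted, each row costs only a direct sketch update, which is the source of the CPU savings at high cardinality.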
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| DistinctCountSmartHLLAggregationFunction.java | Added _dictIdCardinalityThreshold field, parameter parsing, and adaptive conversion logic for non-group-by aggregation |
| BaseDistinctCountSmartSketchAggregationFunction.java | Implemented group-by adaptive conversion with modified group tracking and cardinality checks |
| DistinctCountSmartHLLAggregationFunctionTest.java | Added unit tests for parameter parsing, HLL operations, and adaptive conversion behavior |
| BenchmarkDistinctCountHLLThreshold.java | Added JMH benchmark to measure performance across different cardinalities and record counts |
| pinot-perf/pom.xml | Registered new benchmark class in Maven build configuration |
 * - dictThreshold: Threshold for dictionary-encoded columns to trigger early conversion from RoaringBitmap to HLL
 *                  during aggregation. 100_000 by default. Set to Integer.MAX_VALUE to disable and convert only
 *                  at finalization.
Copilot AI · Jan 6, 2026
The documentation should clarify that non-positive values (≤0) are also treated as disabled and converted to Integer.MAX_VALUE, consistent with the implementation in the Parameters class where values ≤0 are set to Integer.MAX_VALUE.
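A minimal sketch of the clamping behavior this comment describes, assuming a hypothetical `parseDictThreshold` helper (the actual parsing lives in the PR's Parameters class, whose code is not shown here):

```java
// Hypothetical helper illustrating the clamping rule: values <= 0 are treated
// as "disabled" by mapping them to Integer.MAX_VALUE, so conversion only
// happens at finalization.
public class DictThresholdParsing {
  static final int DEFAULT_DICT_THRESHOLD = 100_000;

  static int parseDictThreshold(String value) {
    if (value == null) {
      return DEFAULT_DICT_THRESHOLD;  // parameter not supplied: use the 100K default
    }
    int threshold = Integer.parseInt(value.trim());
    // Non-positive values disable early conversion, per the documented behavior.
    return threshold <= 0 ? Integer.MAX_VALUE : threshold;
  }
}
```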
for (int i = 0; i < 1000; i++) {
  hll.offer(i);
}
Long cardinality = Long.valueOf(function.extractFinalResult(hll));
Copilot AI · Jan 6, 2026
Replace Long.valueOf() with direct cast (Long) since extractFinalResult() already returns a Long object. Using Long.valueOf() on an object is unnecessary and could cause a NullPointerException if the result is null.
Suggested change:
- Long cardinality = Long.valueOf(function.extractFinalResult(hll));
+ Long cardinality = (Long) function.extractFinalResult(hll);
function.extractFinalResult(hll) actually returns an int, hence Long.valueOf() is used
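To illustrate the point of the reply above: if the extraction method yields a primitive int, `Long.valueOf(...)` widens and boxes it safely, whereas a `(Long)` cast does not apply to the primitive at all (it would not compile). The `extractInt` method below is a stand-in, not the actual Pinot method:

```java
// Demonstrates why Long.valueOf is appropriate when the source value is an int:
// the int widens to long, then boxes to Long, with no cast or NPE risk.
public class BoxingDemo {
  // Stand-in for an extractFinalResult-style method returning a primitive int.
  static int extractInt() {
    return 1000;
  }

  static Long widen() {
    return Long.valueOf(extractInt());  // int -> long -> Long, always safe
  }
}
```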
xiangfu0 left a comment:
lgtm otherwise
@Override
public void aggregate(int length, AggregationResultHolder aggregationResultHolder,
    Map<ExpressionContext, BlockValSet> blockValSetMap) {
    Map<ExpressionContext, BlockValSet> blockValSetMap) {
revert this
also please add a release note for this enhancement

Also please fix the tests.

Sure will add one
overall lgtm!
One question, how was the default of 100_000 chosen as the threshold? Is that a desirable default for all queries going forward? Are there any scenarios where using the Integer.MAX_VALUE may be preferred? It'll be good to call this out in documentation as a guide for users
Also, can you take care of updating OSS documentation to reflect these changes? Thanks!
Thanks for the review @somandal. On the 100,000 threshold? When would Integer.MAX_VALUE be preferred?
Updating docs based on this optimization: apache/pinot#17411

Docs PR: pinot-contrib/pinot-docs#459
Summary
ISSUE=#17336
For dictionary-encoded columns, DISTINCT_COUNT_SMART_HLL currently uses a RoaringBitmap to deduplicate dictionary IDs before feeding values into HLL. While efficient for low cardinality, this approach becomes CPU-intensive for high cardinality (hundreds of thousands to millions of distinct values), where RoaringBitmap insertions dominate query execution time and negate the benefits of HLL.
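The pre-existing path the summary describes (dedup dictionary IDs in a bitmap during aggregation, feed HLL only at finalization) looks roughly like this sketch. `BitSet` stands in for RoaringBitmap and the HLL offer is modeled as a counter, since only the control flow is the point: every row pays a bitmap insertion, which is what dominates CPU at high cardinality.

```java
import java.util.BitSet;

// Illustrative model of the two-phase bitmap-then-HLL path; not Pinot's code.
public class BitmapThenHllPath {
  static long aggregate(int[] dictIds) {
    BitSet bitmap = new BitSet();
    for (int id : dictIds) {
      bitmap.set(id);  // every row pays a bitmap insertion, even duplicates
    }
    long offered = 0;
    // Finalization: each distinct dictId would be offered to HLL exactly once.
    for (int id = bitmap.nextSetBit(0); id >= 0; id = bitmap.nextSetBit(id + 1)) {
      offered++;
    }
    return offered;
  }
}
```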
Proposal
Introduce a cardinality-aware execution path for DISTINCT_COUNT_SMART_HLL:
Observed improvements
Testing Done
Added JMH benchmark covering:
This JMH benchmark isolates server-side aggregation cost for the DistinctCountHLLAggregationFunction under controlled parameters; each variation was run for 10 minutes:
- recordCount: {100K, 500K, 1M, 5M, 10M, 25M}
- cardinalityRatioPercent: {1, 10, 30, 50, 80, 100} → generates records with the configured cardinality
- useRoaringBitMap/HLL: controls whether the run uses the RoaringBitmap path or the direct HLL path

DictIds are pre-generated so benchmark timing includes only aggregation, not data generation.
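Under the parameters above, the pre-generation step could look like the following sketch. The method name and the derivation of cardinality from recordCount and cardinalityRatioPercent are assumptions for illustration, not the benchmark's actual code:

```java
import java.util.Random;

// Hypothetical data pre-generation: dictIds are built before the timed loop so
// the benchmark measures aggregation only, not data generation.
public class BenchmarkDataGen {
  static int[] generateDictIds(int recordCount, int cardinalityRatioPercent, long seed) {
    // e.g. recordCount=1_000_000, ratio=30 -> ~300_000 distinct dictIds
    int cardinality = Math.max(1, (int) ((long) recordCount * cardinalityRatioPercent / 100));
    Random random = new Random(seed);  // fixed seed for reproducible runs
    int[] dictIds = new int[recordCount];
    for (int i = 0; i < recordCount; i++) {
      dictIds[i] = random.nextInt(cardinality);
    }
    return dictIds;
  }
}
```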
Sample plots: [benchmark latency plot images not captured]

Flame graph after optimization, showing Aggregate no longer dominates CPU: [image not captured]
Benchmark Results (Average Latency, ms/op)
Record Count = 100,000
Record Count = 500,000
Record Count = 1,000,000
Record Count = 5,000,000
Record Count = 10,000,000
Record Count = 25,000,000
Recommendation:
Based on the micro-benchmark results across record counts and cardinalities, 100K distinct values is a good default threshold for switching away from the RoaringBitmap path. At this scale, RoaringBitmap remains efficient for low-cardinality cases, while higher cardinalities already show clear benefits from direct HLL updates. This threshold balances preserving deduplication benefits for low cardinality against excessive bitmap maintenance cost for high-cardinality workloads.