@sparknack (Contributor) commented Dec 10, 2025

issue: #45486

Introduce row group batching to reduce cache cell granularity and improve
memory and disk efficiency. Previously, each Parquet row group mapped 1:1 to a
cache cell. Now, up to kRowGroupsPerCell (4) row groups are merged into one cell.

This reduces the number of cache cells (and associated overhead) by ~4x while
maintaining the same data granularity for loading.
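
For readers following the batching arithmetic, here is a minimal sketch of the row-group-to-cell mapping described above. Only the constant kRowGroupsPerCell and its value 4 come from this description; the function names NumCells, CellOfRowGroup, and RowGroupRangeOfCell are hypothetical and are not taken from the changed files. The sketch assumes row groups are batched contiguously in index order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Value taken from the PR description. All function names below are
// hypothetical illustrations, not the identifiers used in the PR.
constexpr std::size_t kRowGroupsPerCell = 4;

// Number of cache cells needed to cover `num_row_groups` row groups
// (ceiling division, so the last cell may hold fewer than 4 row groups).
constexpr std::size_t NumCells(std::size_t num_row_groups) {
    return (num_row_groups + kRowGroupsPerCell - 1) / kRowGroupsPerCell;
}

// Index of the cell that holds row group `row_group_idx`, assuming
// contiguous batching in index order.
constexpr std::size_t CellOfRowGroup(std::size_t row_group_idx) {
    return row_group_idx / kRowGroupsPerCell;
}

// Half-open range [first, last) of row group indices stored in `cell_id`.
inline std::pair<std::size_t, std::size_t>
RowGroupRangeOfCell(std::size_t cell_id, std::size_t num_row_groups) {
    assert(cell_id < NumCells(num_row_groups));
    const std::size_t first = cell_id * kRowGroupsPerCell;
    const std::size_t last =
        std::min(first + kRowGroupsPerCell, num_row_groups);
    return {first, last};
}
```

Under these assumptions, a file with 10 row groups yields 3 cells covering row groups [0, 4), [4, 8), and [8, 10). The cell count drops by exactly 4x only when the row group count is a multiple of 4, which is consistent with the "~4x" figure above.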

@sre-ci-robot added the size/L label (Denotes a PR that changes 100-499 lines.) Dec 10, 2025
@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sparknack
To complete the pull request process, please assign jiaoew1991 after the PR has been reviewed.
You can assign the PR to them by writing /assign @jiaoew1991 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify bot added the dco-passed (DCO check passed.) and kind/enhancement (Issues or changes related to enhancement) labels Dec 10, 2025
@sre-ci-robot
Contributor

[ci-v2-notice]
Notice: We are gradually rolling out the new ci-v2 system.

  • Legacy CI jobs remain unaffected, you can just ignore ci-v2 if you don't want to run it.
  • Additional "ci-v2/*" checkers will run for this PR to ensure the new ci-v2 system is working as expected.
  • For tests that exist in both v1 and v2, passing in either system is considered PASS.

To rerun ci-v2 checks, comment with:

  • /ci-rerun-code-check // for ci-v2/code-check
  • /ci-rerun-build // for ci-v2/build
  • /ci-rerun-ut-integration // for ci-v2/ut-integration
  • /ci-rerun-ut-go // for ci-v2/ut-go
  • /ci-rerun-ut-cpp // for ci-v2/ut-cpp
  • /ci-rerun-ut // for all ci-v2/ut-integration, ci-v2/ut-go, ci-v2/ut-cpp
  • /ci-rerun-e2e-arm // for ci-v2/e2e-arm [master branch only]
  • /ci-rerun-e2e-default // for ci-v2/e2e-default [master branch only]

If you have any questions or requests, please contact @zhikunyao.


codecov bot commented Dec 10, 2025

Codecov Report

❌ Patch coverage is 61.90476% with 88 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.93%. Comparing base (224a794) to head (356b73e).
⚠️ Report is 1 commit behind head on master.

Files with missing lines                                 Patch %   Lines
...re/storagev2translator/ManifestGroupTranslator.cpp    0.00%     84 Missing ⚠️
...gcore/storagev2translator/GroupChunkTranslator.cpp    97.43%    2 Missing ⚠️
...core/storagev2translator/ManifestGroupTranslator.h    0.00%     2 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##           master   #46249      +/-   ##
==========================================
- Coverage   75.93%   75.93%   -0.01%     
==========================================
  Files        1898     1898              
  Lines      297301   297412     +111     
==========================================
+ Hits       225757   225834      +77     
- Misses      64047    64085      +38     
+ Partials     7497     7493       -4     

Components   Coverage Δ
Client       78.18% <ø> (ø)
Core         82.79% <61.90%> (-0.03%) ⬇️
Go           73.89% <ø> (+<0.01%) ⬆️

Files with missing lines                                 Coverage Δ
...core/src/segcore/storagev2translator/GroupCTMeta.h    100.00% <100.00%> (ø)
...segcore/storagev2translator/GroupChunkTranslator.h    100.00% <100.00%> (ø)
...e/storagev2translator/GroupChunkTranslatorTest.cpp    99.55% <100.00%> (+0.06%) ⬆️
...gcore/storagev2translator/GroupChunkTranslator.cpp    97.81% <97.43%> (-0.05%) ⬇️
...core/storagev2translator/ManifestGroupTranslator.h    0.00% <0.00%> (ø)
...re/storagev2translator/ManifestGroupTranslator.cpp    0.00% <0.00%> (ø)

... and 23 files with indirect coverage changes


@sparknack force-pushed the one-cell-multi-rg branch 3 times, most recently from 2607bda to 667c490, December 11, 2025 07:03
@sre-ci-robot added the size/XL label (Denotes a PR that changes 500-999 lines.) and removed the size/L label (Denotes a PR that changes 100-499 lines.) Dec 11, 2025
@mergify bot added the ci-passed label Dec 11, 2025
Labels

ci-passed
dco-passed: DCO check passed.
kind/enhancement: Issues or changes related to enhancement
size/XL: Denotes a PR that changes 500-999 lines.
