Parquet decode: Skip up to first_row for non-lists #20835

pmattione-nvidia · 2025-12-10T23:28:26Z

This changes the Parquet decode to skip decoding work up to first_row for non-lists. We were already skipping ahead for lists, which was easy to do because the list preprocessing tells us exactly how far to skip ahead.

We don't have that for non-lists with nulls, so we have to do some work. Specifically, if we skip N rows, we don't know how many valid (non-null) values we are skipping, so we have to loop through the definition levels and count them. This loop (with the call to skip_validity_and_row_indices_nonlist()) is basically the same as the normal decode processing with all of the superfluous code and work stripped out. Once we have the count, we can skip ahead in the dictionary and bool stream decode as before.
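To illustrate the counting step, here is a minimal host-side sketch in plain C++ (not the CUDA kernel code from this PR). It assumes a flat nullable column with one definition level per row, where a row holds a value only when its level equals the maximum definition level. The name `count_valid_before` and the `batch_size` parameter are hypothetical and only stand in for the rolling-buffer loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not the cudf implementation: count how many of the first
// `first_row` rows hold a value, so the value stream can be advanced by that amount.
inline int count_valid_before(std::vector<std::uint8_t> const& def_levels,
                              int max_def_level,
                              int first_row,
                              int batch_size = 4)
{
  assert(first_row >= 0 && static_cast<std::size_t>(first_row) <= def_levels.size());
  assert(batch_size > 0);

  int valid_count = 0;
  // Walk the skipped rows in batches, the way a decode loop consumes a rolling
  // buffer, but only count. Nothing is written to the output.
  for (int row = 0; row < first_row; row += batch_size) {
    int const batch_end = std::min(row + batch_size, first_row);
    for (int r = row; r < batch_end; ++r) {
      if (def_levels[r] == max_def_level) { ++valid_count; }
    }
  }
  return valid_count;
}
```

The returned count is the number of entries to skip in the value stream (dictionary indices or bools) before decoding the first requested row.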

This does not immediately improve performance, as the bottlenecks are elsewhere. However, skipping ahead simplifies the rest of the code by removing corner cases (see the removed if-checks) that were only necessary without the skipping. This will make further simplifications and optimizations easier later on.

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

pmattione-nvidia · 2025-12-10T23:30:12Z

cpp/src/io/parquet/decode_fixed.cu

    }();

-    // dst_pos may be negative (values before first_row) for non-lists.
-    if (dst_pos >= 0) {


dst_pos can no longer be negative. It could be negative before because we subtract first_row from the position, but decoding now starts at first_row. This and the following section look like a lot of changes, but it's just whitespace from removing the if.
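A small host-side C++ sketch of why the guard becomes unnecessary. This is a simplified model under the assumption that dst_pos is the row position minus first_row; both function names are hypothetical and neither appears in the PR.

```cpp
#include <cassert>
#include <vector>

// Before (modeled): iteration starts at row 0, so dst_pos is negative for the
// rows ahead of first_row and every write needs a guard.
inline std::vector<int> dst_positions_with_guard(int first_row, int num_rows)
{
  std::vector<int> out;
  for (int row = 0; row < first_row + num_rows; ++row) {
    int const dst_pos = row - first_row;
    if (dst_pos >= 0) { out.push_back(dst_pos); }
  }
  return out;
}

// After (modeled): iteration starts at first_row, so dst_pos is non-negative
// by construction and the guard can be dropped.
inline std::vector<int> dst_positions_after_skip(int first_row, int num_rows)
{
  std::vector<int> out;
  for (int row = first_row; row < first_row + num_rows; ++row) {
    int const dst_pos = row - first_row;
    assert(dst_pos >= 0);
    out.push_back(dst_pos);
  }
  return out;
}
```

Both versions visit the same destination slots; the second just never does the wasted iterations.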

Thanks for the hint. Hiding whitespace definitely helps here.

pmattione-nvidia · 2025-12-10T23:30:38Z

cpp/src/io/parquet/decode_fixed.cu

    }();

-    // dst_pos may be negative (values before first_row) for non-lists.
-    if (dst_pos >= 0) {


No changes in this whole section except removing the if and whitespace changes from shifting the code in.

I was going to say: what the heck are all these changes? Then I realized it was just indentation. :)

pmattione-nvidia · 2025-12-10T23:32:11Z

cpp/src/io/parquet/decode_fixed.cu

+ * @return Maximum depth valid count after processing
+ */
+template <int decode_block_size, typename level_t, bool is_nested, int rolling_buf_size>
+__device__ int skip_validity_and_row_indices_nonlist(


This is just a gutted version of update_validity_and_row_indices_flat(), where all of the unnecessary code has been removed (e.g. filling validity flags).

pmattione-nvidia · 2025-12-10T23:32:35Z

cpp/src/io/parquet/decode_fixed.cu


-      // sum null counts. we have to do it this way instead of just incrementing by (value_count -
-      // valid_count) because valid_count also includes rows that potentially start before our row
-      // bounds. if we could come up with a way to clean that up, we could remove this and just


this change "cleans that up"

pmattione-nvidia · 2025-12-10T23:32:45Z

cpp/src/io/parquet/decode_fixed.cu


-    // sum null counts. we have to do it this way instead of just incrementing by (value_count -
-    // valid_count) because valid_count also includes rows that potentially start before our row
-    // bounds. if we could come up with a way to clean that up, we could remove this and just


this change "cleans that up"
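A hedged host-side C++ illustration of the simplification the removed comment asked for. It assumes one definition level per row; `batch_counts`, `count_batch`, and `null_count_after_skip` are hypothetical names, not functions from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct batch_counts {
  int value_count;  // rows processed
  int valid_count;  // rows that hold a value
};

// Hypothetical stand-in for one decode pass over rows [begin, end).
inline batch_counts count_batch(std::vector<std::uint8_t> const& def_levels,
                                int max_def_level,
                                int begin,
                                int end)
{
  assert(0 <= begin && begin <= end && static_cast<std::size_t>(end) <= def_levels.size());
  batch_counts c{end - begin, 0};
  for (int r = begin; r < end; ++r) {
    if (def_levels[r] == max_def_level) { ++c.valid_count; }
  }
  return c;
}

// Once counting starts at first_row, both counts cover only rows inside the
// row bounds, so the null count is a plain subtraction.
inline int null_count_after_skip(std::vector<std::uint8_t> const& def_levels,
                                 int max_def_level,
                                 int first_row,
                                 int num_rows)
{
  batch_counts const c = count_batch(def_levels, max_def_level, first_row, first_row + num_rows);
  return c.value_count - c.valid_count;
}
```

If the counts instead started at row 0, the same subtraction would also include nulls from rows before first_row, which is the case the old code had to work around.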

pmattione-nvidia · 2025-12-10T23:33:12Z

cpp/src/io/parquet/page_string_utils.cuh

    }();

    // lookup input string pointer & length. store length.
-    bool const in_range                             = (thread_pos < target_pos) && (dst_pos >= 0);


dst_pos can no longer be negative, same as in the other decode functions

pmattione-nvidia · 2025-12-10T23:33:58Z

cpp/src/io/parquet/decode_fixed.cu

-    if (skipped_leaf_values > 0) {
-      if (should_process_nulls) {
-        skip_decode<rolling_buf_size>(def_decoder, skipped_leaf_values, t);
+  // Skip ahead in the decoding so that we don't repeat work


the actual skipping. the list skipping is refactored slightly to share common code with non-lists (skip_bools()).

pmattione-nvidia · 2025-12-10T23:34:41Z

cpp/src/io/parquet/decode_fixed.cu

+
+    // Non-lists
+    int const first_row = s->first_row;
+    if (first_row <= 0) { return; }


the non-list skipping. Note that if we don't process nulls we know exactly how far to skip and no looping is needed.
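The two cases can be sketched on the host in plain C++. This is a simplified model, not the kernel code: `values_to_skip` is a hypothetical helper, and it assumes one definition level per row with a value present only at the maximum definition level.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: how many entries of the value stream lie before first_row.
inline int values_to_skip(int first_row,
                          bool process_nulls,
                          std::vector<std::uint8_t> const& def_levels,
                          int max_def_level)
{
  if (first_row <= 0) { return 0; }  // nothing to skip

  // No nulls: every row has exactly one value, so the answer is known up front.
  if (!process_nulls) { return first_row; }

  // Nullable: the skipped rows have to be scanned to count the values among them.
  assert(static_cast<std::size_t>(first_row) <= def_levels.size());
  auto const n = std::count(def_levels.begin(), def_levels.begin() + first_row, max_def_level);
  return static_cast<int>(n);
}
```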

cpp/src/io/parquet/decode_fixed.cu

mhaseeb123

Initial pass and so far so good. One small question

cpp/src/io/parquet/decode_fixed.cu

mhaseeb123

Small comments; looks good otherwise. Waiting for all tests to pass and a final pass before approve

nvdbaranec · 2025-12-12T19:02:59Z

cpp/src/io/parquet/decode_fixed.cu

    }();

-    // dst_pos may be negative (values before first_row) for non-lists.
-    if (dst_pos >= 0) {


I was going to say: what the heck are all these changes? Then I realized it was just indentation. :)

nvdbaranec · 2025-12-12T19:16:01Z

cpp/src/io/parquet/decode_fixed.cu

-    if (skipped_leaf_values > 0) {
-      if (should_process_nulls) {
-        skip_decode<rolling_buf_size>(def_decoder, skipped_leaf_values, t);
+  // Skip ahead in the decoding so that we don't repeat work


Opinion: this kernel has a whole lot of code in one function. Might be worth refactoring this whole skip-stuff block into its own function if it's not too disruptive.

I tried that first, but it ends up needing like 12 template arguments and passes like 8 parameters so the function call is about 20 lines long with the formatting and it ends up being grotesque. To make it simpler I tried using auto for some of the type parameters and it wouldn't compile. Using a lambda with & was just way easier and cleaner.
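For readers unfamiliar with the trade-off, here is a minimal host-side C++ sketch of the pattern described in the reply. It is not the kernel code: `decode_from` and `skip_ahead` are hypothetical names, and the real kernel has far more template parameters and state than shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical templated routine standing in for the decode kernel body.
// Returns the number of values that lie before first_row.
template <int batch_size, typename level_t>
int decode_from(std::vector<level_t> const& def_levels, level_t max_def_level, int first_row)
{
  assert(first_row >= 0 && static_cast<std::size_t>(first_row) <= def_levels.size());
  int row         = 0;
  int valid_count = 0;

  // The skip step as a by-reference lambda. It sees the enclosing function's
  // template parameters and locals directly, so none of them have to be
  // repeated in a separate function signature.
  auto skip_ahead = [&]() {
    while (row < first_row) {
      int const batch_end = std::min(row + batch_size, first_row);
      for (; row < batch_end; ++row) {
        if (def_levels[row] == max_def_level) { ++valid_count; }
      }
    }
  };

  skip_ahead();
  return valid_count;
}
```

A free function doing the same work would need `batch_size`, `level_t`, and every captured local passed explicitly, which is the verbosity the reply describes.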

mhaseeb123

LGTM. Don't have any more suggestions but I would still like to see that lambda go away if possible 😄

pmattione-nvidia added 2 commits December 10, 2025 18:11

Parquet decode: Skip up to first_row for non-lists

a99bea1

remove further checks

92f59b9

pmattione-nvidia self-assigned this Dec 10, 2025

pmattione-nvidia requested a review from a team as a code owner December 10, 2025 23:28

pmattione-nvidia requested a review from shrshi December 10, 2025 23:28

pmattione-nvidia added the Performance Performance related issue label Dec 10, 2025

pmattione-nvidia requested a review from lamarrr December 10, 2025 23:28

pmattione-nvidia added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Dec 10, 2025

github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Dec 10, 2025

pmattione-nvidia requested review from mhaseeb123 and nvdbaranec December 10, 2025 23:28

pmattione-nvidia commented Dec 10, 2025

Merge branch 'main' into chunked_skip_flat

00bf343

mhaseeb123 reviewed Dec 11, 2025

mhaseeb123 reviewed Dec 11, 2025

mhaseeb123 reviewed Dec 12, 2025

mhaseeb123 reviewed Dec 12, 2025

vuule self-requested a review December 12, 2025 03:49

pmattione-nvidia added 2 commits December 12, 2025 12:23

address comments

89af20c

Merge branch 'chunked_skip_flat' of https://github.com/pmattione-nvid…

0a6debb

…ia/cudf into chunked_skip_flat

nvdbaranec reviewed Dec 12, 2025

mhaseeb123 approved these changes Dec 13, 2025
