-
Notifications
You must be signed in to change notification settings - Fork 991
Parquet decode: Skip up to first_row for non-lists #20835
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Parquet decode: Skip up to first_row for non-lists #20835
Conversation
| }(); | ||
|
|
||
| // dst_pos may be negative (values before first_row) for non-lists. | ||
| if (dst_pos >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dst_pos can no longer be negative (it was negative before because we subtract first_row but we start there now). This and the following section look like a bunch of changes, but it's just white-space from removing the if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the hint. Hiding whitespaces def helps here
| }(); | ||
|
|
||
| // dst_pos may be negative (values before first_row) for non-lists. | ||
| if (dst_pos >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No changes in this whole section except removing the if and whitespace changes from shifting the code in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to say: what the heck are all these changes? Then I realized it was just indentation. :)
| * @return Maximum depth valid count after processing | ||
| */ | ||
| template <int decode_block_size, typename level_t, bool is_nested, int rolling_buf_size> | ||
| __device__ int skip_validity_and_row_indices_nonlist( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a gutted version of update_validity_and_row_indices_flat(), where all of the unnecessary code has been removed (e.g. filling validity flags).
|
|
||
| // sum null counts. we have to do it this way instead of just incrementing by (value_count - | ||
| // valid_count) because valid_count also includes rows that potentially start before our row | ||
| // bounds. if we could come up with a way to clean that up, we could remove this and just |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change "cleans that up"
|
|
||
| // sum null counts. we have to do it this way instead of just incrementing by (value_count - | ||
| // valid_count) because valid_count also includes rows that potentially start before our row | ||
| // bounds. if we could come up with a way to clean that up, we could remove this and just |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this change "cleans that up"
| }(); | ||
|
|
||
| // lookup input string pointer & length. store length. | ||
| bool const in_range = (thread_pos < target_pos) && (dst_pos >= 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dst_pos can no longer be negative, same as in the other decode functions
| if (skipped_leaf_values > 0) { | ||
| if (should_process_nulls) { | ||
| skip_decode<rolling_buf_size>(def_decoder, skipped_leaf_values, t); | ||
| // Skip ahead in the decoding so that we don't repeat work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the actual skipping. the list skipping is refactored slightly to share common code with non-lists (skip_bools()).
|
|
||
| // Non-lists | ||
| int const first_row = s->first_row; | ||
| if (first_row <= 0) { return; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the non-list skipping. Note that if we don't process nulls we know exactly how far to skip and no looping is needed.
mhaseeb123
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initial pass and so far so good. One small question
mhaseeb123
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comments; looks good otherwise. Waiting for all tests to pass and a final pass before approve
…ia/cudf into chunked_skip_flat
| }(); | ||
|
|
||
| // dst_pos may be negative (values before first_row) for non-lists. | ||
| if (dst_pos >= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to say: what the heck are all these changes? Then I realized it was just indentation. :)
| if (skipped_leaf_values > 0) { | ||
| if (should_process_nulls) { | ||
| skip_decode<rolling_buf_size>(def_decoder, skipped_leaf_values, t); | ||
| // Skip ahead in the decoding so that we don't repeat work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opinion: this kernel has a whole lot of code in one function. Might be worth refactoring this whole skip-stuff block into it's own function if it's not too disruptive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that first, but it ends up needing like 12 template arguments and passes like 8 parameters so the function call is about 20 lines long with the formatting and it ends up being grotesque. To make it simpler I tried using auto for some of the type parameters and it wouldn't compile. Using a lambda with & was just way easier and cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Don't have any more suggestions but I would still like to see that lambda go away if possible 😄
This changes the parquet decode to skip decoding work up to first_row for non-lists. We were already skipping ahead for lists, which was easy to do because there is a lot of list preprocessing that tells us exactly how far to skip ahead.
We don't have that for non-lists with nulls, so we have to do some work. Specifically, if we skip N rows, we don't know how many valids we are skipping so we have to loop through and figure that out. This loop (with the call to skip_validity_and_row_indices_nonlist()) is basically the same as the normal decode processing with all of the extra superfluous code/work stripped out. We then can skip ahead with the dictionary and bool stream decode as before.
This does not immediately improve performance, as the bottlenecks are elsewhere. However, skipping ahead simplifies the rest of the code, removing weird corner cases (see removed if-checks) in the code that were necessary without the skipping. This will make further simplifications and optimizations to the code easier later on.
Checklist