Conversation
|
@Shekharrajak can you run |
andygrove
left a comment
There was a problem hiding this comment.
Review Feedback
Critical Issue: NULL handling bug when len ≤ 0
The implementation has a Spark compatibility issue in strings.scala:
if (lenInt <= 0) {
val literalBuilder = LiteralOuterClass.Literal.newBuilder()
literalBuilder.setIsNull(false)
literalBuilder.setStringVal("")
Some(ExprOuterClass.Expr.newBuilder().setLiteral(literalBuilder).build())
}This returns empty string for len ≤ 0 without checking if str is NULL.
Per Spark's implementation (Right.replacement):
If(IsNull(str), Literal.create(null, dataType), // Check NULL first!
If(LessThanOrEqual(len, Literal(0)),
Literal(UTF8String.EMPTY_UTF8),
Substring(str, UnaryMinus(len), len)))Spark checks IsNull(str) before checking len <= 0. So:
RIGHT(NULL, 0)should return NULL (Spark behavior)RIGHT(NULL, 0)returns empty string (this PR)
Suggested Fix
Either:
- Evaluate the str expression and return NULL when str is NULL (even when len ≤ 0), or
- Mark this edge case as
Incompatible
Missing Test Case
Please add a test for SELECT RIGHT(CAST(NULL AS STRING), 0) to verify NULL handling with non-positive length.
This review was generated with AI assistance.
|
Updated the PR, please trigger the workflow |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3207 +/- ##
============================================
+ Coverage 56.12% 59.87% +3.74%
- Complexity 976 1473 +497
============================================
Files 119 175 +56
Lines 11743 16193 +4450
Branches 2251 2687 +436
============================================
+ Hits 6591 9695 +3104
- Misses 4012 5150 +1138
- Partials 1140 1348 +208 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
checks looking good. Please have a look. |
Closes #3182
Rationale for this change
Enable native execution for Spark's
RIGHT(str, len)function to improve query performance by avoiding JVM fallback.What changes are included in this PR?
CometRightserializer that handles edge cases:Substring(str, -len, len)protobufRightexpression inQueryPlanSerdestring expressions mapHow are these changes tested?
Added 4 test suites in
CometExpressionSuite:SUBSTRING(str, -len, len)