Skip to content

feat: Support right expression#3207

Merged
andygrove merged 7 commits intoapache:mainfrom
Shekharrajak:feature/support-right-expression
Feb 9, 2026
Merged

feat: Support right expression#3207
andygrove merged 7 commits intoapache:mainfrom
Shekharrajak:feature/support-right-expression

Conversation

@Shekharrajak
Copy link
Contributor

Closes #3182

Rationale for this change

Enable native execution for Spark's RIGHT(str, len) function to improve query performance by avoiding JVM fallback.

What changes are included in this PR?

  • Added CometRight serializer that handles edge cases:
    • len <= 0: Returns empty string literal
    • len > 0: Transforms to Substring(str, -len, len) protobuf
  • Registered Right expression in QueryPlanSerde string expressions map
  • No protobuf or Rust changes required (reuses existing Substring implementation)

How are these changes tested?

Added 4 test suites in CometExpressionSuite:

  • Basic functionality (various lengths: 0, -1, positive, exceeds length)
  • Unicode character handling (emoji, multi-byte chars)
  • Equivalence verification with SUBSTRING(str, -len, len)
  • Dictionary encoding preservation

@andygrove
Copy link
Member

@Shekharrajak can you run make to update the generated configs.md

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Feedback

Critical Issue: NULL handling bug when len ≤ 0

The implementation has a Spark compatibility issue in strings.scala:

if (lenInt <= 0) {
  val literalBuilder = LiteralOuterClass.Literal.newBuilder()
  literalBuilder.setIsNull(false)
  literalBuilder.setStringVal("")
  Some(ExprOuterClass.Expr.newBuilder().setLiteral(literalBuilder).build())
}

This returns empty string for len ≤ 0 without checking if str is NULL.

Per Spark's implementation (Right.replacement):

If(IsNull(str), Literal.create(null, dataType),      // Check NULL first!
  If(LessThanOrEqual(len, Literal(0)),
    Literal(UTF8String.EMPTY_UTF8),
    Substring(str, UnaryMinus(len), len)))

Spark checks IsNull(str) before checking len <= 0. So:

  • RIGHT(NULL, 0) should return NULL (Spark behavior)
  • RIGHT(NULL, 0) returns empty string (this PR)

Suggested Fix

Either:

  1. Evaluate the str expression and return NULL when str is NULL (even when len ≤ 0), or
  2. Mark this edge case as Incompatible

Missing Test Case

Please add a test for SELECT RIGHT(CAST(NULL AS STRING), 0) to verify NULL handling with non-positive length.


This review was generated with AI assistance.

@Shekharrajak
Copy link
Contributor Author

Updated the PR, please trigger the workflow

@codecov-commenter
Copy link

codecov-commenter commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 7.69231% with 24 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.87%. Comparing base (f09f8af) to head (4d37921).
⚠️ Report is 932 commits behind head on main.

Files with missing lines Patch % Lines
...rc/main/scala/org/apache/comet/serde/strings.scala 4.00% 24 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3207      +/-   ##
============================================
+ Coverage     56.12%   59.87%   +3.74%     
- Complexity      976     1473     +497     
============================================
  Files           119      175      +56     
  Lines         11743    16193    +4450     
  Branches       2251     2687     +436     
============================================
+ Hits           6591     9695    +3104     
- Misses         4012     5150    +1138     
- Partials       1140     1348     +208     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Shekharrajak
Copy link
Contributor Author

checks looking good. Please have a look.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @Shekharrajak

@andygrove andygrove merged commit 7a07db2 into apache:main Feb 9, 2026
170 of 171 checks passed
@Shekharrajak Shekharrajak deleted the feature/support-right-expression branch February 10, 2026 04:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] Support Spark expression: right

3 participants