feat: enable_spark_tests_comet_native_writer #3351
base: main
Conversation
Force-pushed from b5cea4f to 951b394.
  cd apache-spark
  rm -rf /root/.m2/repository/org/apache/parquet # somehow parquet cache requires cleanups
- ENABLE_COMET=true ENABLE_COMET_ONHEAP=true ${{ matrix.config.scan-env }} ENABLE_COMET_LOG_FALLBACK_REASONS=${{ github.event.inputs.collect-fallback-logs || 'false' }} \
+ ENABLE_COMET=true ENABLE_COMET_ONHEAP=true ENABLE_COMET_WRITER=true ${{ matrix.config.scan-env }} ENABLE_COMET_LOG_FALLBACK_REASONS=${{ github.event.inputs.collect-fallback-logs || 'false' }} \
we also need to enable
--conf spark.comet.operator.DataWritingCommandExec.allowIncompatible=true
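For reference, a minimal sketch of how that config could be applied to a SparkSession in a local run; only the config key comes from this thread, everything else (master, app name) is illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative local session; the real change belongs in the CI environment / diff files.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("comet-writer-allow-incompatible-example")
  // Key quoted from the review comment above; it lets Comet take over
  // DataWritingCommandExec even where it is flagged as incompatible.
  .config("spark.comet.operator.DataWritingCommandExec.allowIncompatible", "true")
  .getOrCreate()
```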
Sure
We currently don't have the config available in the Spark tests through ENABLE_COMET_WRITER=true. Perhaps you meant enabling them in the benchmark suites TPCH / TPCDS as well?
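To make the discussion concrete, here is a hypothetical sketch of how an ENABLE_COMET_WRITER environment variable could be mapped onto Spark confs inside a test or benchmark harness; the object name and the wiring are assumptions, not code from this PR (in the PR the switch is applied through the Spark diff files and the CI workflow instead):

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical helper: translate the ENABLE_COMET_WRITER env var into Spark confs.
object CometWriterEnv {
  def configure(builder: SparkSession.Builder): SparkSession.Builder = {
    if (sys.env.get("ENABLE_COMET_WRITER").contains("true")) {
      // Conf key quoted earlier in this thread.
      builder.config("spark.comet.operator.DataWritingCommandExec.allowIncompatible", "true")
    } else {
      builder
    }
  }
}
```

A TPCH / TPCDS benchmark driver could, for example, pass its builder through CometWriterEnv.configure before getOrCreate if the same switch were wanted there.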
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

@@            Coverage Diff             @@
##              main    #3351      +/-   ##
============================================
+ Coverage     56.12%   59.99%    +3.86%
- Complexity      976     1464      +488
============================================
  Files           119      175       +56
  Lines         11743    16165     +4422
  Branches       2251     2681      +430
============================================
+ Hits           6591     9698     +3107
- Misses         4012     5114     +1102
- Partials       1140     1353      +213

View full report in Codecov by Sentry.
@comphead, we have a bunch of failures, largely due to hitting edge cases with the Comet writer (CTAS, and Insert Overwrite, which is likely a bug). I am thinking of two possible options here.
@comphead please kick off CI whenever you get a chance.
My diff files for Spark versions 3.4 and 3.5 missed disabling the Comet writer for CTAS statements (while it was set for Spark 4), causing the above test failures. However, there are only 2 failed tests, down from 13.
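For readers unfamiliar with the shorthand, the statements being discussed look roughly like the sketch below; the table and column names are made up, `spark` is assumed to be an active SparkSession, and these are not the actual failing tests:

```scala
// Illustrative CTAS (CREATE TABLE AS SELECT) and INSERT OVERWRITE statements,
// the kinds of writes the diff files disable the Comet writer for.
spark.sql("CREATE TABLE ctas_target USING parquet AS SELECT id FROM range(10)")
spark.sql("INSERT OVERWRITE TABLE ctas_target SELECT id FROM range(5)")
```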
Seems like there were issues applying the diff file for Spark 4.
@comphead all the tests passed. Please take a look whenever you get a chance.
All tests passed and the following tests are ignored:
Which issue does this PR close?
Closes: #3209
I added a new environment variable ENABLE_COMET_WRITER which enables the native Comet writer and runs the Spark tests. 3 kinds of tests have been ignored:

- char
- varchar
- .sql files

Rationale for this change
Now that we have a native Comet writer, this PR enables the Spark tests while ignoring the ones which fail.
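As a hedged illustration of what "ignoring" a failing test typically looks like in a ScalaTest-based Spark suite (the suite and test names below are placeholders, not the tests actually touched by this PR's diff files):

```scala
import org.scalatest.funsuite.AnyFunSuite

// Placeholder suite: switching `test` to `ignore` keeps the test compiled but skipped,
// which is one common way a diff-file change of this kind disables a test.
class ExampleNativeWriterSuite extends AnyFunSuite {
  ignore("char/varchar columns survive a round trip through the native writer") {
    // original test body left untouched; it simply no longer runs
  }
}
```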
What changes are included in this PR?
Diff file changes to enable the Comet native writer in the Spark tests.
How are these changes tested?