Member

@b41sh b41sh commented Dec 11, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR enhances the inverted index for the VARIANT data type. Previously, when a VARIANT column contained an array of objects (e.g., [{ "fieldA": "value1", "fieldB": "value2" }, { "fieldA": "value3", "fieldB": "value4" }]) and a query combined multiple conditions (e.g., fieldA:value1 AND fieldB:value4), the inverted index returned incorrect results.

With this change, the inverted index for VARIANT supports precise object matching within arrays. Users can write QUERY expressions that accurately determine whether multiple search criteria are satisfied by a single, specific object inside a JSON array.

For example:

CREATE TABLE t (id int, body variant, INVERTED INDEX idx (body));

INSERT INTO t VALUES
(1, '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "mp4" },{ "name": "codecB", "type": "jpg" }]}}'),
(2, '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "jpg" },{ "name": "codecA", "type": "mp4" }]}}'),
(3, '{"videoInfo":{"extraData":[{ "name": "codecA", "type": "jpg" },{ "name": "codecB", "type": "mp4" }]}}');

SELECT * FROM t WHERE QUERY('body.videoInfo.extraData.name:codecB AND body.videoInfo.extraData.type:jpg');
╭───────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│        id       │                                             body                                            │
│ Nullable(Int32) │                                      Nullable(Variant)                                      │
├─────────────────┼─────────────────────────────────────────────────────────────────────────────────────────────┤
│               1 │ {"videoInfo":{"extraData":[{"name":"codecA","type":"mp4"},{"name":"codecB","type":"jpg"}]}} │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
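
The difference between the two matching semantics can be illustrated outside the database. The following is a minimal Python sketch, not Databend's implementation: it assumes the earlier incorrect results came from evaluating each condition against the array as a whole, which the description above does not state explicitly. The helper names `flattened_match` and `per_object_match` are hypothetical.

```python
import json

# The three rows from the example above, keyed by id.
rows = {
    1: '{"videoInfo":{"extraData":[{"name":"codecA","type":"mp4"},{"name":"codecB","type":"jpg"}]}}',
    2: '{"videoInfo":{"extraData":[{"name":"codecA","type":"jpg"},{"name":"codecA","type":"mp4"}]}}',
    3: '{"videoInfo":{"extraData":[{"name":"codecA","type":"jpg"},{"name":"codecB","type":"mp4"}]}}',
}

# Conditions corresponding to: name:codecB AND type:jpg
conditions = {"name": "codecB", "type": "jpg"}


def flattened_match(doc, conds):
    """Assumed old behaviour: each condition may be met by a different array element."""
    items = doc["videoInfo"]["extraData"]
    return all(any(obj.get(k) == v for obj in items) for k, v in conds.items())


def per_object_match(doc, conds):
    """Object-level matching: one array element must meet every condition."""
    items = doc["videoInfo"]["extraData"]
    return any(all(obj.get(k) == v for k, v in conds.items()) for obj in items)


docs = {row_id: json.loads(body) for row_id, body in rows.items()}
flat_ids = [i for i, d in docs.items() if flattened_match(d, conditions)]
object_ids = [i for i, d in docs.items() if per_object_match(d, conditions)]

print(flat_ids)    # [1, 3]
print(object_ids)  # [1]
```

Under flattened matching, row 3 also qualifies because `codecB` and `jpg` both occur somewhere in the array, although in different objects. Object-level matching keeps only row 1, which agrees with the query result shown above.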
  • fixes: #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

@github-actions github-actions bot added the pr-feature label (this PR introduces a new feature to the codebase) Dec 11, 2025
@b41sh b41sh added the ci-cloud label (Build docker image for cloud test) Dec 11, 2025
@github-actions
Contributor

Docker Image for PR

  • tag: pr-19096-10be53f-1765452515

note: this image tag is only available for internal use.

@b41sh b41sh force-pushed the feat-json-array-path branch from dc9132b to be7a9cf December 15, 2025 02:41
@github-actions
Contributor

github-actions bot commented Dec 15, 2025

🤖 CI Job Analysis

Workflow: 20218751964

📊 Summary

  • Total Jobs: 83
  • Failed Jobs: 9
  • Retryable: 0
  • Code Issues: 9

NO RETRY NEEDED

All failures appear to be code/test issues requiring manual fixes.

🔍 Job Details

  • linux / test_compat_fuse: Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, http, parquet): Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, hybrid, parquet): Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, http, native): Not retryable (Code/Test)
  • linux / sqllogic / standalone_minio (query, hybrid, native): Not retryable (Code/Test)
  • linux / sqllogic / cluster (query, 4c16g, http): Not retryable (Code/Test)
  • linux / sqllogic / cluster (query, 4c16g, hybrid): Not retryable (Code/Test)
  • linux / sqllogic / standalone (query, 4c16g, http): Not retryable (Code/Test)
  • linux / sqllogic / standalone (query, 4c16g, hybrid): Not retryable (Code/Test)

🤖 About

Automated analysis using job annotations to distinguish infrastructure issues (auto-retried) from code/test issues (manual fixes needed).
