Skip to content

Allow SQL expressions in array condition keys#266

Open
opengeek wants to merge 1 commit intomodxcms:3.xfrom
opengeek:improve-condition-parsing
Open

Allow SQL expressions in array condition keys#266
opengeek wants to merge 1 commit intomodxcms:3.xfrom
opengeek:improve-condition-parsing

Conversation

@opengeek
Copy link
Member

By avoiding attempting to escape the keys in an associative array if they are not a valid column identifier allows more syntax to be expressed in array conditions, e.g.

[
    'column1' => 'value1',
    'OR:JSON_UNQUOTE(JSON_EXTRACT(properties, "$.column2")):LIKE' => '%value2%'
]

This allows more syntax to be expressed in array conditions, e.g.

```
[
    'column1' => 'value1',
    'OR:JSON_UNQUOTE(JSON_EXTRACT(properties, "$.column2")):LIKE' => '%value2%'
]
```
@opengeek opengeek marked this pull request as ready for review February 25, 2026 17:57
@opengeek opengeek added this to the 3.2.0 milestone Feb 25, 2026
@opengeek opengeek requested a review from Copilot February 25, 2026 18:06
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request enables SQL expressions to be used as keys in xPDO query condition arrays by introducing database-specific validation of column identifiers. Previously, all keys were assumed to be simple column names and were automatically escaped. Now, the code checks if a key is a valid column identifier for the specific database engine; if not, it treats the key as a SQL expression and uses it directly in the query.

Changes:

  • Added isColumnIdentifier() abstract method and database-specific implementations to distinguish column names from SQL expressions
  • Modified parseConditions() to conditionally escape keys only when they are valid column identifiers
  • Extended _quotable array to include PHP types for use with non-column-identifier conditions
  • Added comprehensive test coverage for SQL expressions in condition keys across all supported databases

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
src/xPDO/Om/xPDOQuery.php Core logic changes to support SQL expressions as condition keys, including new isColumnIdentifier check and isQuotable helper method
src/xPDO/Om/mysql/xPDOQuery.php MySQL-specific implementation of isColumnIdentifier with support for backticks and double quotes
src/xPDO/Om/pgsql/xPDOQuery.php PostgreSQL-specific implementation supporting double quotes and Unicode escape sequences
src/xPDO/Om/sqlite/xPDOQuery.php SQLite-specific implementation supporting multiple identifier formats
src/xPDO/Om/sqlsrv/xPDOQuery.php SQL Server-specific implementation with square bracket support
src/xPDO/Om/xPDOCriteria.php Added PHPDoc type hint for $xpdo property
test/xPDO/Test/Om/xPDOQueryConditionsTest.php Extensive test cases for SQL expressions with database-specific syntax
test/mysql.phpunit.xml Registered new test file for MySQL test suite
test/sqlite.phpunit.xml Registered new test file for SQLite test suite
test/pgsql.phpunit.xml Registered new test file for PostgreSQL test suite
test/complete.phpunit.xml Registered new test file for complete test suite
Comments suppressed due to low confidence (1)

test/xPDO/Test/Om/xPDOQueryConditionsTest.php:46

  • The newObject()->save() call returns a boolean indicating success, but the return value is not checked. If the save fails, all tests will fail with confusing errors. Consider checking the return value and throwing an exception or logging an error if the save fails to make test failures easier to debug.
            $this->xpdo->newObject('xPDO\Test\Sample\xPDOSample', [
                'parent' => 0,
                'unique_varchar' => uniqid('prefix_'),
                'varchar' => 'varchar',
                'text' => 'text',
                'timestamp' => 'CURRENT_TIMESTAMP',
                'unix_timestamp' => strtotime('2018-03-24 00:00:00'),
                'date_time' => '2018-03-24 00:00:00',
                'date' => '2018-03-24',
                'enum' => 'T',
                'password' => 'password',
                'integer' => 1999,
                'float' => 3.14159,
                'boolean' => true,
            ])->save();

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +125 to +138
'LOG(`integer`)' => 7.6009,
[
'OR:text:=' => 'text',
'date_time' => '2018-03-24 00:00:00'
]
]
],
[
[
'date_time' => null,
'OR:LOG(`integer`):=' => 7.6009,
'OR:float:=' => 3.14159,
]
],
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing floating-point values with exact equality (7.6009, 3.14159) can lead to test failures due to floating-point precision issues. Different database engines may return slightly different representations of these values. Consider using approximate comparisons or rounding in the actual query conditions to make the tests more robust.

Copilot uses AI. Check for mistakes.
Comment on lines +69 to +84
protected $_quotable = [
'string',
'password',
'date',
'datetime',
'timestamp',
'time',
'json',
'array',
'float',
'double',
'object',
'resource',
'unknown type',
'resource (closed)'
];
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _quotable array now includes PHP types returned by gettype() ('double', 'object', 'resource', 'unknown type', 'resource (closed)'), but it previously contained database field types ('string', 'password', 'date', 'datetime', 'timestamp', 'time', 'json', 'array', 'float'). This mixed usage creates confusion: the array is used with phptype from field metadata (which are database types) at line 781-783, and with gettype() results (which are PHP types) at line 969. The 'float' entry could match both phptype 'float' and gettype 'double', but 'array' is a database type that won't be returned by gettype. Consider separating these into two distinct arrays or documenting which values correspond to which use case.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, they are pseudo-PHP types already, not direct representations of database field types, meant to bridge actual database types to a type represented in PHP. This was added so that conditions that do not directly represent a field in the database model can handle values appropriately in the resulting PDO prepared statement.

Comment on lines +193 to +203
'LOG("integer")' => 7.6009,
[
'OR:text:=' => 'text',
'date_time' => '2018-03-24 00:00:00'
]
]
],
[
[
'date_time' => null,
'OR:LOG("integer"):=' => 7.6009,
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL's LOG() function returns the base-10 logarithm, not the natural logarithm. For PostgreSQL, LOG(1999) ≈ 3.3008, not 7.6009. The test expects 7.6009 which is ln(1999), the natural logarithm. To get the natural logarithm in PostgreSQL, use LN() instead of LOG(). This test will fail on PostgreSQL with the current expected value.

Copilot uses AI. Check for mistakes.
Comment on lines +159 to +169
'LOG(`integer`)' => 7.6009,
[
'OR:text:=' => 'text',
'date_time' => '2018-03-24 00:00:00'
]
]
],
[
[
'date_time' => null,
'OR:LOG(`integer`):=' => 7.6009,
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SQLite does not have a built-in LOG() function. This test will fail on SQLite unless a custom LOG() function has been registered. Consider using a function that exists in SQLite by default, or document that a custom LOG() function must be registered for these tests to pass.

Copilot uses AI. Check for mistakes.
],
[
[
'LOG("integer")' => 7.6009,
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test expects LOG(1999) to equal 7.6009, but this appears to be an incorrect calculation. In MySQL, LOG() is the natural logarithm (base e), and LN(1999) ≈ 7.6009. However, most database engines' LOG() function uses base 10, where LOG10(1999) ≈ 3.3008. This test may fail on databases where LOG() means base-10 logarithm. Either use LN() for consistency across databases, or verify that this is the correct expected value for each database's LOG() function.

Copilot uses AI. Check for mistakes.
Comment on lines 755 to 820
@@ -785,12 +812,12 @@ public function parseConditions($conditions, $conjunction = xPDOQuery::SQL_AND)
$this->xpdo->log(xPDO::LOG_LEVEL_ERROR, "Encountered empty {$operator} condition with key {$key}");
}
$val = "(" . implode(',', $vals) . ")";
$sql = "{$this->xpdo->escape($alias)}.{$this->xpdo->escape($key)} {$operator} {$val}";
$sql = "{$operand} {$operator} {$val}";
$result[]= new xPDOQueryCondition(array('sql' => $sql, 'binding' => null, 'conjunction' => $conj));
continue;
}
$field= array ();
$field['sql']= $this->xpdo->escape($alias) . '.' . $this->xpdo->escape($key) . ' ' . $operator . ' ?';
$field['sql']= $operand . ' ' . $operator . ' ?';
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In parseConditions, non-identifier condition keys are now used verbatim as the left-hand SQL operand via the operand variable (e.g. {$operand} {$operator} ...), with only the value bound as a PDO parameter. If any part of the conditions array key can be influenced by user input (for example, dynamic filter fields), an attacker can supply arbitrary SQL expressions or subqueries here (that do not contain UNION/;), leading to SQL injection despite parameter binding on the value. To avoid this, restrict/whitelist condition keys to safe column identifiers (those that pass isColumnIdentifier) and require any raw SQL expressions to go through a separate, clearly "unsafe" API that is never used with untrusted data.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants