Skip to content

Validate attribute manipulation PHP syntax on metadata push#1916

Merged
kayjoosten merged 2 commits intomainfrom
feature/issue-1778-php-parsing
Mar 2, 2026
Merged

Validate attribute manipulation PHP syntax on metadata push#1916
kayjoosten merged 2 commits intomainfrom
feature/issue-1778-php-parsing

Conversation

@kayjoosten
Copy link
Contributor

When entities are pushed to /api/connections, any manipulation_code is now syntax-checked using token_get_all() with TOKEN_PARSE before being stored. This catches PHP parse errors at push time instead of at login time.

If a syntax error is found, a RuntimeException is thrown with the entity ID, which the controller returns as a 400 Bad Request. The entire push is rejected if any entity has invalid manipulation code.

Resolves: #1778

@kayjoosten kayjoosten requested a review from johanib February 25, 2026 11:36
@kayjoosten
Copy link
Contributor Author

Validation approach: findings & performance

Approaches considered

Approach Executes code? External process? Verdict
token_get_all($code, TOKEN_PARSE) ❌ No ❌ No Chosen
eval() ✅ Yes ❌ No ❌ Rejected — executes valid code, causing side-effects
php -l via exec()/proc_open() ❌ No ✅ Yes ❌ Rejected — spawns a process per service, very slow
PhpParser (nikic/php-parser) ❌ No ❌ No ❌ Rejected — adds an external dependency
phpstan ❌ No ✅ Yes ❌ Rejected — not suited for runtime use, spawns process

Why token_get_all with TOKEN_PARSE:

  • Built into PHP, zero dependencies.
  • The TOKEN_PARSE flag makes PHP run a real parse pass (same parser used at compile time) and throw a ParseError on any syntax error — without executing the code.
  • eval() would also surface syntax errors, but it executes the code body in the current context, which causes unpredictable side-effects even for syntactically valid code.
  • A subprocess approach (php -l) is the most isolated but adds significant overhead per service and requires exec/proc_open permissions.

Performance impact

Benchmark run locally (PHP 8.2, 1000 iterations, warmed up):

Scenario No validation token_get_all + TOKEN_PARSE Overhead/call
Simple one-liner ~0.000 ms ~0.001 ms +0.001 ms
Realistic ~15-line manipulation ~0.000 ms ~0.006 ms +0.006 ms

Simulated metadata push of 500 services (all with manipulation code):

  • Simple code: ~0.28 ms total
  • Realistic multi-line code: ~3.25 ms total

In practice, not all services have manipulation code, so the real-world overhead will be even lower. The added validation cost during a metadata push is negligible.

When entities are pushed to /api/connections, any manipulation_code
is now syntax-checked using token_get_all() with TOKEN_PARSE before
being stored. This catches PHP parse errors at push time instead of
at login time.

If a syntax error is found, a RuntimeException is thrown with the
entity ID, which the controller returns as a 400 Bad Request. The
entire push is rejected if any entity has invalid manipulation code.

Resolves: #1778
@kayjoosten kayjoosten force-pushed the feature/issue-1778-php-parsing branch from b8b62ba to 1a8c115 Compare February 26, 2026 08:46
@kayjoosten kayjoosten requested a review from johanib February 26, 2026 08:47
Copy link
Contributor

@johanib johanib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation.

Maybe add an entry to the changelog? "Metadata push will now reject all metadata if any service contains invalid attribute manipulations."

Two small comments, but no blockers 👍

@kayjoosten kayjoosten merged commit e1329f3 into main Mar 2, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

When receiving new entities, EB should check attribute manipulations for parse/syntax errors

3 participants