GH-144651: Optimize the new uops added when recording values during tracing. #144948

Open
markshannon wants to merge 2 commits into python:main from markshannon:optimize-with-recorded-values-2

@markshannon commented Feb 18, 2026

  • Handle dependencies in the optimizer, not the tracer
  • Strengthen some checks to avoid relying on optimizer for correctness

This PR adds optimizations for the new uops added in #144179 and also removes dependency tracking from the front-end.
By splitting the optimizer stack into two parts, one for locals and one for the evaluation stack, we are able to use the stack before knowing the size of the frame. This is necessary because guards can occur after we set local or stack values, e.g. in _FOR_ITER_GEN_FRAME.

Also includes some small fixes to ensure that the code emitted by the front-end is stand-alone, meaning that it does not depend on watchers or the optimizer for correctness.

Also adds a couple of missing invalidations for instrumentation.

@Fidget-Spinner left a comment

This is a really good simplification. Thanks for doing it. Just two comments.

    int delta = (int)(new_stack_pointer - current_sp);
    assert(delta >= 0);
    if (delta) {
        /* Shift existing stack elements up */

Hmm this is suspicious. How safe is this?

@markshannon (Member, Author) replied

I'll add a guard above that new_stack_pointer is within the limits of the array.
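
To make the concern concrete, here is a minimal Python model of a guarded "shift up". It is not the PR's C code: `shift_stack_up`, `slots`, and `depth` are hypothetical names, and returning `False` when the new stack pointer would fall outside the array is only an assumption about how such a guard might behave.

```python
def shift_stack_up(slots, current_sp, new_sp, depth):
    """Move the top `depth` stack entries so that they end at `new_sp`.

    Hypothetical model only: `slots` stands in for a fixed-size array,
    and the entries occupy slots[current_sp - depth:current_sp].
    Returns False (leaving `slots` untouched) if the move would not fit.
    """
    delta = new_sp - current_sp
    # Mirrors the assert in the snippet under review: the pointer only moves up.
    assert delta >= 0
    assert 0 <= depth <= current_sp
    # The guard discussed above: never write past the end of the array.
    if new_sp > len(slots):
        return False
    if delta:
        # The right-hand slice is copied before assignment, so overlapping
        # source and destination ranges are handled safely (like memmove).
        slots[new_sp - depth:new_sp] = slots[current_sp - depth:current_sp]
    return True


slots = ["a", "b", "c", None, None, None]
ok = shift_stack_up(slots, current_sp=3, new_sp=5, depth=2)
print(ok, slots[3:5])
```

Without the bounds check, a large `new_sp` would silently grow a Python list, whereas the equivalent C code would write out of bounds, which is why the guard has to come before the shift.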

@Fidget-Spinner left a comment

Thanks

@Fidget-Spinner commented

CI is failing because unoptimized traces can now execute straight through invalidated code objects. That's intentional, right? The guard now handles those. I think you can just add `@unittest.skipIf(os.getenv("PYTHON_UOPS_OPTIMIZE") == "0", "Needs uop optimizer to run.")` to the failing test to make it pass.
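
As a sketch of how that decorator could be applied, the example below wraps the same condition in a reusable decorator. The helper `uops_optimizer_disabled`, the name `requires_uop_optimizer`, and the test class and method are hypothetical placeholders, not names from the CPython test suite; only the `unittest.skipIf` condition and message come from the comment above.

```python
import os
import unittest


def uops_optimizer_disabled(environ=os.environ):
    """Return True when PYTHON_UOPS_OPTIMIZE is set to "0".

    Hypothetical helper; it performs the same check as the suggested
    skipIf condition, with the environment passed in for testability.
    """
    return environ.get("PYTHON_UOPS_OPTIMIZE") == "0"


# Reusable decorator built from the condition suggested in the review.
requires_uop_optimizer = unittest.skipIf(
    uops_optimizer_disabled(),
    "Needs uop optimizer to run.",
)


class ExampleInvalidationTest(unittest.TestCase):
    # Placeholder test: the real test body lives in the CPython test suite.
    @requires_uop_optimizer
    def test_depends_on_optimizer(self):
        self.assertTrue(True)
```

With `PYTHON_UOPS_OPTIMIZE=0` in the environment the test is reported as skipped rather than failed; otherwise it runs normally.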
