@lixin-wei (Contributor) commented Jan 14, 2026

This PR fixes a subtle data race in async_scope that I introduced in a61207c.

The race occurs between __nest_rcvr::__complete() and __when_empty_op::start() (the on_empty() operation).

The Problematic Code Flow

Thread T2 (__complete)           Thread T1 (on_empty)
─────────────────────           ───────────────────
fetch_sub: 1→0
                                lock mutex
                                load __active (sees 0!)
                                unlock
                                start continuation → destroy scope
lock mutex → USE-AFTER-FREE

Solution

Added another counter, __pending_notifiers_, to guard against this case.

I've tested this in my heavy workflow with TSAN.
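
For readers following along, here is a minimal sketch of how a second counter can close the window shown in the diagram. It is a simplified model under stated assumptions, not the code in this PR: `toy_scope`, `complete`, `is_empty`, and `run_demo` are hypothetical names, the polling waiter stands in for the real on_empty() operation, and all atomics use the default sequentially consistent ordering for simplicity.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified model of the scope's shared state (not the stdexec types).
struct toy_scope {
  std::mutex mtx;
  std::atomic<int> active;                // operations still running
  std::atomic<int> pending_notifiers{0};  // completers still touching *this

  explicit toy_scope(int n) : active(n) {}

  // Models the completion path: runs once per finished operation.
  void complete(std::atomic<int>& last_notifications) {
    // Announce ourselves BEFORE the decrement that can make the scope look empty.
    pending_notifiers.fetch_add(1);
    if (active.fetch_sub(1) == 1) {
      // Last operation: in the racy version this lock could hit a destroyed scope.
      std::lock_guard<std::mutex> lock(mtx);
      last_notifications.fetch_add(1);
    }
    // Final access to *this; the scope must not be touched after this line.
    pending_notifiers.fetch_sub(1);
  }

  // Models the emptiness check: empty only if no operation is active AND
  // no completer is still inside complete().
  bool is_empty() {
    std::lock_guard<std::mutex> lock(mtx);
    return active.load() == 0 && pending_notifiers.load() == 0;
  }
};

// Spawns n completers plus one waiter that destroys the scope once it is empty.
// Returns how many times the "last completer" branch ran.
int run_demo(int n) {
  auto scope = std::make_unique<toy_scope>(n);
  std::atomic<int> last_notifications{0};

  std::vector<std::thread> workers;
  for (int i = 0; i < n; ++i)
    workers.emplace_back([&last_notifications, raw = scope.get()] {
      raw->complete(last_notifications);
    });

  // Waiter: polls for emptiness, then destroys the scope (the "continuation").
  std::thread waiter([&scope] {
    while (!scope->is_empty()) std::this_thread::yield();
    scope.reset();
  });

  for (auto& t : workers) t.join();
  waiter.join();
  return last_notifications.load();
}
```

Calling `run_demo(8)` from a `main` built with `-fsanitize=thread` is one way to exercise the pattern. Without the `pending_notifiers` check in `is_empty()`, the waiter could destroy the scope between a completer's `fetch_sub` and its mutex lock, which is the interleaving in the diagram above.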

copy-pr-bot bot commented Jan 14, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@lixin-wei lixin-wei changed the title Fix data race in async_scope [dep issue] Fix data race in async_scope Jan 14, 2026
@lixin-wei lixin-wei changed the title [dep issue] Fix data race in async_scope Fix data race in async_scope Jan 14, 2026
@ericniebler (Collaborator) commented

/ok to test e8d9f20
