Add standalone activity metrics #8759

fretz12 · 2025-12-04T22:50:06Z

What changed?

Add standalone activity metrics

Why?

Need standalone activity metrics for observability purposes

How did you test it?

Note

Adds comprehensive metrics and payload-size recording for standalone activities, introducing contextual request wrappers and wiring metrics through state transitions, timeout executors, and history APIs.

Activity component/state machine:
- Introduces RequestWithContext carrying Token, MetricsHandler, NamespaceName, and BreakdownMetricsByTaskQueue and replaces prior request wrappers.
- Records metrics on schedule, attempts, and terminal transitions (success, fail, cancel, timeouts), including ActivityStartToCloseLatency, ActivityScheduleToCloseLatency, success/fail/cancel/timeout counters, and per-timeout tags.
- Adds recordPayloadSize(...) to emit payload sizes for input, heartbeat details, results, and failures.
Timeout and dispatch executors (chasm/lib/activity/activity_tasks.go):
- Add timeoutTaskExecutorOptions (dynamic config, metrics, namespace registry); resolve namespace and emit timeout metrics during schedule/start/close/heartbeat timeouts and retries.
Activity handler:
- Injects metrics.Handler and namespace.Registry; StartActivityExecution emits input payload size on scheduling.
History APIs (record/respond activity ops):
- Pass RequestWithContext (token, metrics handler, namespace, breakdown setting) into chasm component calls.
Tests:
- Update/add unit tests to validate metric emissions and payload-size recording across transitions and timeouts.

Written by Cursor Bugbot for commit 02c6aba.

fretz12 · 2025-12-04T22:51:31Z

chasm/lib/activity/activity.go

-	Token   *tokenspb.Task
-	Request R
+// RequestWithContext wraps a request context specific metadata.
+type RequestWithContext[R any] struct {


Using @dandavison naming. Open to better ideas.

fretz12 · 2025-12-04T23:19:52Z

cursor review

service/history/api/recordactivitytaskheartbeat/api.go

chasm/lib/activity/activity.go

dandavison · 2025-12-04T23:22:03Z

chasm/lib/activity/activity.go

+// RequestWithContext wraps a request context specific metadata.
+type RequestWithContext[R any] struct {
+	Request                     R
+	Token                       *tokenspb.Task


This is specific to certain worker APIs, isn't it? So I think it's starting to feel a bit hacky to share the same wrapper across multiple APIs that each need only some of its fields.

dandavison · 2025-12-04T23:22:33Z

chasm/lib/activity/activity.go

+	Token                       *tokenspb.Task
+	MetricsHandler              metrics.Handler
+	NamespaceName               namespace.Name
+	BreakdownMetricsByTaskQueue dynamicconfig.BoolPropertyFnWithTaskQueueFilter


Is it definitely appropriate to pass this; DC can be queried where needed, right?

You need to pass this in.

dandavison · 2025-12-04T23:22:48Z

chasm/lib/activity/activity.go

-type WithToken[R any] struct {
-	Token   *tokenspb.Task
-	Request R
+// RequestWithContext wraps a request context specific metadata.


Suggested change

-// RequestWithContext wraps a request context specific metadata.
+// RequestWithContext wraps a request with context-specific metadata.

dandavison · 2025-12-04T23:31:33Z

chasm/lib/activity/activity.go

+// recordOnAttemptedMetrics records metrics for attempted activities, including retries and originating from any
+// terminal state transitions.


I don't quite follow the "and originating from any terminal state transitions" bit.

For the first bit, would "records metrics for an activity attempt" be clearer?

dandavison · 2025-12-04T23:35:48Z

chasm/lib/activity/statemachine.go

+			event.handler,
+			event.breakdownMetricsByTaskQueue,
+			event.operationTag,
+			event.timeoutType)


I'm finding it odd that we're referencing GetStartedTime in a transition to Scheduled.

I don't think attempt.GetStartedTime().AsTime() is guaranteed to be zero here, right? So then we'll be emitting ActivityStartToCloseLatency which would be wrong?
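To illustrate the concern: if emission is gated only on the started time being non-zero (an assumption, since the guard itself is not shown in the hunk), a retry that returns to Scheduled while still carrying the previous attempt's start time would emit a start-to-close latency. The sketch below uses hypothetical stand-ins (`attempt`, `latencyByZeroCheck`, `latencyByStatus`) and plain `time.Time` values rather than the PR's protobuf types.

```go
package main

import (
	"fmt"
	"time"
)

type status int

const (
	scheduled status = iota
	started
)

// attempt is a hypothetical stand-in for the activity attempt state.
type attempt struct {
	status    status
	startedAt time.Time // may still hold the previous attempt's start time
}

// Illustrative timestamps, not values from the PR.
var (
	closedAt  = time.Date(2025, 12, 4, 12, 0, 3, 0, time.UTC)
	prevStart = closedAt.Add(-3 * time.Second)
)

// latencyByZeroCheck reports a latency whenever startedAt is set, which
// includes a stale value left over from an earlier attempt.
func latencyByZeroCheck(a attempt, closed time.Time) (time.Duration, bool) {
	if a.startedAt.IsZero() {
		return 0, false
	}
	return closed.Sub(a.startedAt), true
}

// latencyByStatus reports a latency only when the attempt is actually running.
func latencyByStatus(a attempt, closed time.Time) (time.Duration, bool) {
	if a.status != started {
		return 0, false
	}
	return closed.Sub(a.startedAt), true
}

func main() {
	// A retry that is back in the scheduled state but still carries the
	// previous attempt's start time.
	retry := attempt{status: scheduled, startedAt: prevStart}

	_, byZero := latencyByZeroCheck(retry, closedAt)
	_, byStatus := latencyByStatus(retry, closedAt)
	fmt.Println("zero check emits latency:", byZero)
	fmt.Println("status check emits latency:", byStatus)
}
```

A related detail for protobuf timestamps: calling `AsTime()` on a nil `*timestamppb.Timestamp` yields the Unix epoch rather than Go's zero `time.Time`, so `IsZero()` on the result does not detect an unset field.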

dandavison · 2025-12-04T23:39:05Z

chasm/lib/activity/activity.go

+		metricsHandler,
+		breakdownMetricsByTaskQueue,
+		operationTag,
+		timeoutType)


Does this not mean that recordOnAttemptedMetrics() gets called twice, since it is called when a retry transitions to SCHEDULED?

dandavison · 2025-12-04T23:49:15Z

chasm/lib/activity/activity.go

+
+// recordOnAttemptedMetrics records metrics for attempted activities, including retries and originating from any
+// terminal state transitions.
+func (a *Activity) recordOnAttemptedMetrics(


It's not urgent but when we get time I'd like us to think about a consistent and sensible mapping of events to method names. I am already finding methods named "handleX" and "recordX" confusingly similar, and now we're adding "recordX" with a second meaning.

Suggesting the verb emit here.

bergundy · 2025-12-05T01:06:02Z

chasm/lib/activity/activity.go

-	Token   *tokenspb.Task
-	Request R
+// RequestWithContext wraps a request context specific metadata.
+type RequestWithContext[R any] struct {


I would create a struct per request and avoid this generic wrapper.
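A rough sketch of what a struct per request could look like, as opposed to one generic wrapper whose fields only some callers use. All type and function names here (`RecordHeartbeatInput`, `DescribeInput`, the request stand-ins) are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// Hypothetical stand-ins for the protobuf request and task token types.
type HeartbeatRequest struct{ Details []byte }
type DescribeRequest struct{ ActivityID string }
type TaskToken struct{ Attempt int32 }

// RecordHeartbeatInput carries exactly what the heartbeat path needs.
type RecordHeartbeatInput struct {
	Request *HeartbeatRequest
	Token   *TaskToken
}

// DescribeInput has no token because this call does not come from a worker.
type DescribeInput struct {
	Request *DescribeRequest
}

func recordHeartbeat(in RecordHeartbeatInput) string {
	return fmt.Sprintf("heartbeat for attempt %d with %d detail bytes",
		in.Token.Attempt, len(in.Request.Details))
}

func describe(in DescribeInput) string {
	return "describe " + in.Request.ActivityID
}

func main() {
	fmt.Println(recordHeartbeat(RecordHeartbeatInput{
		Request: &HeartbeatRequest{Details: []byte("50%")},
		Token:   &TaskToken{Attempt: 2},
	}))
	fmt.Println(describe(DescribeInput{Request: &DescribeRequest{ActivityID: "a1"}}))
}
```

The trade-off is a few more type declarations in exchange for signatures that state which metadata each call actually depends on.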

bergundy

Overall this looks really good. I would consider restructuring the code as I suggested to make it easier to follow.

bergundy · 2025-12-05T01:11:50Z

chasm/lib/activity/activity.go

+
+// recordOnAttemptedMetrics records metrics for attempted activities, including retries and originating from any
+// terminal state transitions.
+func (a *Activity) recordOnAttemptedMetrics(


Suggesting the verb emit here.

bergundy · 2025-12-05T01:21:27Z

chasm/lib/activity/activity.go

+) {
+	taskQueueFamily := a.GetTaskQueue().GetName()
+
+	handler := metrics.GetPerTaskQueueFamilyScope(


I would suggest passing in a fully baked metrics handler into the application logic.

You can build the metrics handler in the API handlers or task executors if needed, it will save you from carrying all of these parameters around.
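A minimal sketch of this suggestion, assuming a hypothetical `Handler` type in place of the real metrics handler: the caller resolves every tag once, and the application logic receives a single prepared handler instead of a namespace name, a dynamic config function, and an operation tag.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Handler is a hypothetical stand-in for a metrics handler; it records
// emitted counters as strings so the example stays self-contained.
type Handler struct {
	tags map[string]string
	sink *[]string
}

func NewHandler(sink *[]string) Handler {
	return Handler{tags: map[string]string{}, sink: sink}
}

// WithTags returns a copy of the handler carrying the extra key/value pairs.
func (h Handler) WithTags(kv ...string) Handler {
	tags := make(map[string]string, len(h.tags)+len(kv)/2)
	for k, v := range h.tags {
		tags[k] = v
	}
	for i := 0; i+1 < len(kv); i += 2 {
		tags[kv[i]] = kv[i+1]
	}
	return Handler{tags: tags, sink: h.sink}
}

// Count records one increment of the named counter with the handler's tags.
func (h Handler) Count(name string) {
	keys := make([]string, 0, len(h.tags))
	for k := range h.tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+"="+h.tags[k])
	}
	*h.sink = append(*h.sink, name+"{"+strings.Join(pairs, ",")+"}")
}

// completeActivity stands in for application logic: it only needs the
// prepared handler, not the inputs used to build it.
func completeActivity(h Handler) {
	h.Count("activity_success")
}

func main() {
	var emitted []string
	// The API handler or task executor resolves the tags once, up front.
	baked := NewHandler(&emitted).WithTags(
		"namespace", "default",
		"operation", "RespondActivityTaskCompleted",
		"task_queue", "orders",
	)
	completeActivity(baked)
	fmt.Println(emitted[0])
}
```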

bergundy · 2025-12-05T01:22:45Z

chasm/lib/activity/activity.go

+		Details:      details,
 	})
+
+	recordPayloadSize(details.Size(), req.MetricsHandler, req.NamespaceName.String(), metrics.HistoryRecordActivityTaskHeartbeatScope)


We don't need these payload size metrics IMHO, we added them to understand the implications of putting payloads in mutable state for standalone activities.

bergundy · 2025-12-05T01:24:09Z

chasm/lib/activity/activity.go

+		breakdownMetricsByTaskQueue(namespaceName, taskQueueFamily, enumspb.TASK_QUEUE_TYPE_ACTIVITY),
+		metrics.OperationTag(operationTag),
+		metrics.ActivityTypeTag(a.GetActivityType().GetName()),
+		// metrics.VersioningBehaviorTag(versioningBehavior), TODO add when we have versioning


You can't omit this tag here; all metrics must have the same tags no matter where they're emitted from.
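One way to keep the tag set stable before versioning is available is to always emit the tag with a placeholder value. The sketch below assumes a hypothetical `activityTags` helper, tag keys, and an `"unspecified"` placeholder; none of these names come from the PR.

```go
package main

import "fmt"

// Tag is a hypothetical stand-in for a metrics tag.
type Tag struct{ Key, Value string }

// unspecified is an assumed placeholder for a tag with no value yet.
const unspecified = "unspecified"

// activityTags always returns the same tag keys, so every emission site
// produces the same tag set even when versioning is not wired in.
func activityTags(operation, activityType, versioningBehavior string) []Tag {
	if versioningBehavior == "" {
		versioningBehavior = unspecified
	}
	return []Tag{
		{Key: "operation", Value: operation},
		{Key: "activity_type", Value: activityType},
		{Key: "versioning_behavior", Value: versioningBehavior},
	}
}

func main() {
	for _, t := range activityTags("RespondActivityTaskCompleted", "ChargeCard", "") {
		fmt.Printf("%s=%s\n", t.Key, t.Value)
	}
}
```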

bergundy · 2025-12-05T01:26:17Z

chasm/lib/activity/activity.go

+	scheduleToCloseLatency := time.Since(a.GetScheduledTime().AsTime())
+	metrics.ActivityScheduleToCloseLatency.With(handler).Record(scheduleToCloseLatency)
+
+	switch operationTag {


I think it would be much easier to follow this code if you emitted the appropriate metric where it is relevant.

bergundy · 2025-12-05T01:28:00Z

chasm/lib/activity/activity_tasks.go

+type timeoutTaskExecutorOptions struct {
+	fx.In
+
+	Dc                *dynamicconfig.Collection


Use the config struct that Dan added in a separate PR and initialize it in fx instead of using the collection directly.

bergundy · 2025-12-05T01:30:29Z

chasm/lib/activity/statemachine.go

 				Attempt: attempt.GetCount(),
 			})

+		recordPayloadSize(event.inputSize, event.handler, event.namespace.String(), metrics.HistoryRecordActivityTaskStartedScope)


Same here, not required to record this IMHO.

Added standalone activity metrics.

1a5ceb4

Fix lint.

02c6aba

fretz12 marked this pull request as ready for review December 4, 2025 23:19

fretz12 requested review from a team as code owners December 4, 2025 23:19

fretz12 requested review from bergundy and dandavison December 4, 2025 23:19

cursor bot reviewed Dec 4, 2025

Fix wrong handler.

92bb2da

dandavison reviewed Dec 4, 2025

bergundy reviewed Dec 5, 2025

4 participants
