feat(backend): add database stress-test seed script | ENG-203 #300

Merged
LuD1161 merged 1 commit into main from feat/stress-test-seed-script
Feb 18, 2026

LuD1161 commented Feb 18, 2026

Summary

  • Add backend/scripts/seed-stress-test.ts with tiered data generation (small/medium/large) for stress testing all major database tables
  • Seeds workflows, versions, runs, traces, node I/O, schedules, webhooks, human input requests, artifacts, agent traces, MCP groups/servers/tools, encrypted secrets, and API keys
  • Ensures temporal consistency across all entity relationships (timestamps, parent-child ordering, status/event alignment)
  • Includes load testing plan, first audit report, and load-audit Claude skill

Key design decisions

  • Tiered configs: small (100 runs), medium (2K runs), large (20K runs) for different testing needs
  • Realistic distributions: weighted random status/trigger distributions matching production patterns
  • Temporal integrity: all child entities derive timestamps from parents (no independent randomDate() for related entities)
  • Status consistency: AWAITING_INPUT runs always have pending human input, terminal statuses have appropriate failure traces
  • Auto-cleanup: --clean flag or automatic cleanup before re-seeding prevents data accumulation
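The weighted distributions and parent-derived timestamps described above can be sketched roughly as follows. This is a minimal illustration, not the code in seed-stress-test.ts: the helper names (weightedPick, childTimestamp), the status list, and the weights are all assumptions made for the example.

```typescript
// Hypothetical sketch; none of these names or weights come from seed-stress-test.ts.
type Weighted<T> = { value: T; weight: number };

// Pick a value with probability proportional to its weight.
// `rand` is injectable so the behaviour can be tested deterministically.
function weightedPick<T>(items: Weighted<T>[], rand: () => number = Math.random): T {
  const total = items.reduce((sum, item) => sum + item.weight, 0);
  let threshold = rand() * total;
  for (const item of items) {
    threshold -= item.weight;
    if (threshold < 0) return item.value;
  }
  // Guard against floating-point drift when rand() is very close to 1.
  return items[items.length - 1].value;
}

// Derive a child timestamp from its parent by adding a bounded, non-negative
// offset, so a child can never predate the entity it belongs to.
function childTimestamp(
  parent: Date,
  maxOffsetMs: number,
  rand: () => number = Math.random,
): Date {
  return new Date(parent.getTime() + Math.floor(rand() * maxOffsetMs));
}

// Illustrative weights only; the real production-like distribution is not shown in the PR text.
const statusWeights: Weighted<string>[] = [
  { value: "COMPLETED", weight: 70 },
  { value: "FAILED", weight: 20 },
  { value: "RUNNING", weight: 10 },
];

const runStartedAt = new Date("2026-02-01T00:00:00Z");
const status = weightedPick(statusWeights);
const firstTraceAt = childTimestamp(runStartedAt, 60_000); // within one minute of the run start
console.log(status, firstTraceAt.toISOString());
```

Passing the random source as a parameter keeps the generator reproducible under test while defaulting to Math.random for normal seeding.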

Test plan

  • Run bun backend/scripts/seed-stress-test.ts --tier small and verify data in DB
  • Run bun backend/scripts/seed-stress-test.ts --tier small --clean to verify cleanup
  • Verify workflow list page renders seeded data correctly
  • Verify workflow detail/run pages show consistent status and durations
  • Verify RUNNING runs show realistic durations (minutes, not days)

… docs

- Add seed-stress-test.ts with tiered data generation (small/medium/large)
- Seed workflows, versions, runs, traces, node I/O, schedules, webhooks,
  human input requests, artifacts, agent traces, MCP data, secrets, API keys
- Ensure temporal consistency across all entity relationships
- Include load testing plan and first audit report
- Add load-audit skill for automated frontend performance audits

Signed-off-by: Aseem Shrey <LuD1161@users.noreply.github.com>
LuD1161 changed the title from "feat(backend): add database stress-test seed script" to "feat(backend): add database stress-test seed script | ENG-203" on Feb 18, 2026
LuD1161 merged commit 5cbb14a into main on Feb 18, 2026
3 checks passed
LuD1161 deleted the feat/stress-test-seed-script branch on February 18, 2026 02:21
chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d1ef31e6c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


// Delete in reverse FK order
const deletions = [
`DELETE FROM agent_trace_events WHERE workflow_run_id IN (SELECT run_id FROM workflow_runs WHERE organization_id = '${ORG_ID}')`,

P1: Parameterize organization id in cleanup/update SQL

SEED_ORG_ID is read from the environment and then interpolated directly into raw SQL (for example, in the cleanup DELETE statements and the later UPDATE workflows ... organization_id = '${ORG_ID}'). An org id containing a quote will break the script, and a crafted value can broaden the affected rows beyond the intended tenant. This turns a seeding utility into a potentially destructive query path. Bind parameters (or at least sqlVal) should be used consistently for ORG_ID in all raw SQL strings.
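A minimal sketch of the parameterized form, assuming a PostgreSQL-style driver that accepts a statement text with $1 placeholders plus a separate values array. The ParamQuery shape and both function names are hypothetical; how the script actually executes raw SQL is not shown in this review, so the result would need adapting to whatever client it uses.

```typescript
// Hypothetical shape: statement text with positional placeholders plus bound values.
interface ParamQuery {
  text: string;
  values: unknown[];
}

// Build the first cleanup DELETE with the org id bound as a parameter instead
// of being spliced into the SQL string.
function deleteAgentTraceEvents(orgId: string): ParamQuery {
  return {
    text:
      "DELETE FROM agent_trace_events WHERE workflow_run_id IN " +
      "(SELECT run_id FROM workflow_runs WHERE organization_id = $1)",
    values: [orgId],
  };
}

// Optional defence in depth: fail fast if SEED_ORG_ID falls outside an assumed
// id format. The allowed character set here is an assumption for the example.
function assertSafeOrgId(orgId: string): string {
  if (!/^[A-Za-z0-9_-]{1,64}$/.test(orgId)) {
    throw new Error(`Unexpected SEED_ORG_ID: ${JSON.stringify(orgId)}`);
  }
  return orgId;
}

const query = deleteAgentTraceEvents(assertSafeOrgId("org_example"));
console.log(query.text, query.values);
```

With the value bound separately, a quote inside the org id can no longer change the shape of the statement; the up-front format check simply makes a bad environment value fail before any DELETE runs.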


for (const run of agentRuns) {
const agentNodeRefs = run.nodeRefs.slice(0, Math.max(1, Math.floor(run.nodeRefs.length / 3)));
const agentRunId = `agent_${shortUUID()}`;

P2: Generate collision-resistant agent run ids

agentRunId is currently built from shortUUID() (agent_${8-hex}), which provides only ~32 bits of entropy. In medium/large seeds this can collide, and because the agent trace API reads rows by agent_run_id alone, a collision merges events from different workflow runs and returns incorrect metadata/transcripts. Using a full UUID (or including the workflow run id) avoids silent cross-run data corruption in load-test datasets.
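To put a rough number on the risk, the birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)) can be applied with N = 2^32. As an upper bound, if every one of the 20K runs in the large tier received an agent run id, the chance of at least one collision would be about 4.5%. The sketch below computes that estimate and shows both suggested fixes; the function names are hypothetical, and randomUUID comes from Node's built-in node:crypto module, which is assumed to be available in the script's runtime.

```typescript
import { randomUUID } from "node:crypto";

// Birthday-bound approximation: probability that at least two of `n` ids drawn
// uniformly from a space of 2^bits values collide.
function collisionProbability(n: number, bits: number): number {
  const space = Math.pow(2, bits);
  // -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
  return -Math.expm1(-(n * (n - 1)) / (2 * space));
}

// Option A: use a full version-4 UUID (122 random bits) instead of 8 hex characters.
function newAgentRunId(): string {
  return `agent_${randomUUID()}`;
}

// Option B: keep the short suffix but scope it by the workflow run id, so two
// ids can only clash within the same workflow run.
function scopedAgentRunId(workflowRunId: string, shortId: string): string {
  return `agent_${workflowRunId}_${shortId}`;
}

console.log(collisionProbability(20_000, 32));  // 8-hex suffix
console.log(collisionProbability(20_000, 122)); // full UUID v4
console.log(newAgentRunId(), scopedAgentRunId("run_example", "1a2b3c4d"));
```

Option A needs no knowledge of the parent run, while Option B keeps ids short and makes the owning workflow run visible in the id itself.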
