Conversation

@jumski jumski commented Jan 15, 2026

Add detailed analysis of how pgflow client can integrate with Vercel AI
SDK's useChat hook and related features. Document includes:

  • Architecture analysis of both systems
  • Three integration patterns (backend API, custom transport, hybrid)
  • Event mapping between pgflow and AI SDK stream protocol
  • Type safety considerations and implementation examples
  • Advanced use cases (multi-step progress, tool calling, multimodal)
  • Production recommendations and testing strategies
  • Comparison with alternative approaches

This enables pgflow to be used as a backend for chat applications built
with Vercel AI SDK while leveraging pgflow's workflow orchestration and
database-backed state management.

changeset-bot bot commented Jan 15, 2026

⚠️ No Changeset found

Latest commit: 835af8c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

nx-cloud bot commented Jan 15, 2026

View your CI Pipeline Execution ↗ for commit 835af8c

Command                                              Status       Duration  Result
nx affected -t verify-exports --base=origin/mai...   ✅ Succeeded  <1s       View ↗
nx affected -t build --configuration=production...   ✅ Succeeded  <1s       View ↗
nx affected -t lint typecheck test --parallel -...   ✅ Succeeded  <1s       View ↗

☁️ Nx Cloud last updated this comment at 2026-01-15 20:53:41 UTC

Provides critical analysis of when the pgflow + AI SDK integration makes
sense versus when to use AI SDK alone. Key findings:

- Strong fit (20% of users): Production enterprise apps with complex
  multi-step workflows, state persistence needs, observability requirements
- Poor fit (80% of users): Simple chatbots, prototypes, speed-critical apps
- Trade-offs documented: complexity overhead and latency vs. reliability gains

Includes decision framework, competitive comparison (LangChain, Temporal),
and gradual adoption path recommendation. Honest assessment prevents
over-engineering while highlighting genuine value for complex use cases.

Implements a frontend-first approach where the pgflow client runs in the
browser as a custom ChatTransport, eliminating the need for backend
API routes.

Key components:
- PgflowChatTransport: Custom transport implementing the ChatTransport
  interface; connects useChat to pgflow via Supabase Realtime (sketched
  below, after the architecture benefits)
- StreamingContext: API for pgflow steps to emit incremental data
  (text chunks, reasoning, custom data) broadcast via Realtime (see the
  sketch after this list)
- Helper utilities: streamOpenAIResponse, streamAISDKResponse for
  common streaming patterns
- Example flow: Multi-step chat with intent classification, context
  retrieval, and streaming LLM responses
- Example React component: Full chat UI with progress indicators
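
A minimal sketch of the StreamingContext described above, assuming one
Supabase Realtime broadcast channel per run. The interface shape, the
emitText/emitData names, and the channel naming are illustrative
assumptions, not a confirmed pgflow API:

```ts
// Hypothetical StreamingContext for pgflow steps (all names are assumptions).
import type { RealtimeChannel } from '@supabase/supabase-js';

interface StreamingContext {
  emitText(delta: string): Promise<void>;                 // incremental text chunk
  emitData(event: string, data: unknown): Promise<void>;  // reasoning, custom data
}

// One broadcast channel per run. The channel must already be joined;
// subscription handling is elided for brevity.
function createStreamingContext(channel: RealtimeChannel): StreamingContext {
  return {
    async emitText(delta) {
      await channel.send({ type: 'broadcast', event: 'text-delta', payload: { delta } });
    },
    async emitData(event, data) {
      await channel.send({ type: 'broadcast', event, payload: { data } });
    },
  };
}

// Usage inside a hypothetical step handler:
//   const ctx = createStreamingContext(channel);
//   for await (const token of llmTokenStream) await ctx.emitText(token);
```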

Architecture benefits:
- Direct browser → Supabase connection (no API routes)
- Real-time streaming via WebSocket (Supabase Realtime)
- Type-safe end-to-end (Flow types → UI types)
- RLS policies enforce security
- Automatic reconnection on network failures

This approach is cleaner than backend API routes and leverages pgflow's
event-driven architecture naturally. All streaming happens via Supabase
Realtime channels, making implementation straightforward.
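
A sketch of the PgflowChatTransport shape, assuming the AI SDK v5 transport
contract of returning a ReadableStream of UI message chunks from
sendMessages. The chunk type, the start_flow RPC name, and the channel
naming are assumptions, not confirmed APIs:

```ts
// Sketch only: bridges Supabase Realtime broadcasts into the stream that
// useChat consumes. Exact ChatTransport option and chunk types may differ.
import type { SupabaseClient } from '@supabase/supabase-js';

type UIChunk = { type: string; [key: string]: unknown }; // stand-in for the SDK chunk type

class PgflowChatTransport {
  constructor(private supabase: SupabaseClient, private flowSlug: string) {}

  async sendMessages({ chatId, messages }: { chatId: string; messages: unknown[] }) {
    // Start the pgflow run (RPC name is an assumption).
    await this.supabase.rpc('start_flow', {
      flow_slug: this.flowSlug,
      input: { chatId, messages },
    });

    // Relay Realtime broadcasts as stream chunks until the run completes.
    const channel = this.supabase.channel(`run:${chatId}`);
    return new ReadableStream<UIChunk>({
      start(controller) {
        channel
          .on('broadcast', { event: 'text-delta' }, ({ payload }) =>
            controller.enqueue({ type: 'text-delta', delta: payload.delta }),
          )
          .on('broadcast', { event: 'run:completed' }, () => {
            controller.close();
            void channel.unsubscribe();
          })
          .subscribe();
      },
      cancel() {
        void channel.unsubscribe();
      },
    });
  }
}
```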

Includes comprehensive documentation in FRONTEND_TRANSPORT_DESIGN.md
with implementation roadmap, comparison tables, and production
considerations.

Provides an executive summary of the frontend-first integration approach:
- Why frontend transport is better than backend API routes
- Key components (StreamingContext, PgflowChatTransport)
- Complete flow examples (backend + frontend)
- Implementation requirements for pgflow core
- Architecture diagrams and security model
- Performance characteristics and migration guide
- Comparison with alternatives

This document serves as the quick-start guide for developers who want
to understand the integration without reading the full technical design.

Comprehensive analysis of frontend transport viability with focus on
edge runtime timeout recovery - the critical failure scenario identified.

Key additions:

1. EDGE_CASES_AND_RECOVERY.md (12k words):
   - Detailed analysis of edge runtime shutdown mid-stream
   - Ephemeral vs durable streaming event storage
   - Four recovery solutions with trade-offs
   - Latency analysis: frontend transport 3-5x slower (91-231ms vs 28-48ms)
   - Viability assessment per use case
   - Decision tree for architecture choice

2. Streaming context with persistence:
   - Dual-write pattern: Realtime (fast) + Database (durable); sketched
     after this list
   - Batch writes to reduce overhead (10 chunks or 1s interval)
   - Checkpoint support for long-running operations
   - Automatic finalization with cleanup
   - Migration SQL for streaming_chunks table

3. PgflowChatTransport with recovery:
   - Timeout detection (default 30s)
   - Chunk replay from database on reconnection
   - Graceful partial response handling
   - Automatic recovery attempts
   - Comprehensive error handling

4. Complete production example:
   - End-to-end implementation with all scenarios
   - Testing guide for 4 recovery scenarios
   - Performance monitoring code
   - Cost analysis ($9/month for 1000 daily chats)
   - Production checklist
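
A sketch of the dual-write pattern from item 2: broadcast each chunk
immediately over Realtime, and batch-persist rows every 10 chunks or 1s,
whichever comes first. The row shape and callbacks are assumptions:

```ts
// Illustrative dual-write emitter: fast path (Realtime) plus durable path
// (batched inserts into a streaming_chunks table) for later replay.
type ChunkRow = { run_id: string; seq: number; content: string };

class DurableChunkWriter {
  private buffer: ChunkRow[] = [];
  private seq = 0;
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private runId: string,
    private broadcast: (delta: string) => Promise<void>,  // e.g. Realtime send
    private persist: (rows: ChunkRow[]) => Promise<void>, // e.g. batch insert
  ) {}

  async write(delta: string): Promise<void> {
    await this.broadcast(delta); // subscribers see the chunk immediately
    this.buffer.push({ run_id: this.runId, seq: this.seq++, content: delta });
    if (this.buffer.length >= 10) {
      await this.flush();                                     // size threshold
    } else if (!this.timer) {
      this.timer = setTimeout(() => void this.flush(), 1000); // time threshold
    }
  }

  async flush(): Promise<void> {
    if (this.timer) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.buffer.length === 0) return;
    await this.persist(this.buffer.splice(0)); // drain buffer, batch-write
  }
}
```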

Critical findings:
- ✅ Viable for multi-step pipelines (Perplexity-style)
- ✅ Recoverable with chunk storage
- ⚠️  3-5x slower per token than direct SSE
- ⚠️  Requires careful timeout management
- ❌ Not suitable for low-latency chat

Recommendation: Use frontend transport for complex multi-step workflows
where intermediate progress visibility matters more than token-level
latency. Implement chunk storage for production reliability.
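
A sketch of the chunk replay described in item 3, assuming the
streaming_chunks table has run_id, seq, and content columns (the table
name comes from the migration above; the columns are assumptions):

```ts
import type { SupabaseClient } from '@supabase/supabase-js';

// On reconnect, replay persisted chunks newer than the last sequence number
// the client saw, then resume listening on Realtime.
async function replayMissedChunks(
  supabase: SupabaseClient,
  runId: string,
  lastSeq: number,
  onDelta: (delta: string) => void,
): Promise<number> {
  const { data, error } = await supabase
    .from('streaming_chunks')
    .select('seq, content')
    .eq('run_id', runId)
    .gt('seq', lastSeq)
    .order('seq', { ascending: true });
  if (error) throw error;
  for (const row of data ?? []) onDelta(row.content);
  return data?.at(-1)?.seq ?? lastSeq; // new high-water mark
}
```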

Based on critical user feedback identifying two key issues:

1. Supabase Realtime is NOT streaming - it's pub/sub messaging with
   discrete events, not continuous data flow. Each token = separate
   network round-trip with JSON overhead. This explains the 3-5x latency.

2. Edge runtime timeout affects ALL edge functions, not just pgflow.
   Any proxy that streams LLM responses has this issue (25s limit on
   Vercel Edge). This is a universal problem, not architecture-specific.

The right solution:
- Use Node.js runtime API routes (300s timeout, not 25s)
- Stream LLM tokens via traditional SSE (28-48ms latency)
- Use Realtime ONLY for coarse-grained events (step completions)
- Let pgflow handle multi-step orchestration in database
- Map pgflow events to SSE data chunks for progress updates

Architecture:
Frontend useChat → Node.js API route → SSE streaming (tokens)
                                    ↕
                              Realtime (step events)
                                    ↕
                              Pgflow (orchestration)
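
A sketch of this hybrid from the frontend's perspective: tokens arrive
through useChat's normal SSE request to the Node.js route, while coarse
step events arrive on a Realtime channel. The channel and event names are
assumptions, and the useChat surface is abbreviated:

```tsx
'use client';
import { useChat } from '@ai-sdk/react'; // streams tokens from the API route via SSE
import { createClient } from '@supabase/supabase-js';
import { useEffect, useState } from 'react';

const supabase = createClient(
  process.env.NEXT_PUBLIC_SUPABASE_URL!,
  process.env.NEXT_PUBLIC_SUPABASE_ANON_KEY!,
);

export function ChatWithProgress({ runId }: { runId: string }) {
  const { messages } = useChat(); // token streaming handled by the SDK
  const [steps, setSteps] = useState<string[]>([]);

  // Coarse-grained progress: subscribe to pgflow step events over Realtime.
  useEffect(() => {
    const channel = supabase
      .channel(`run:${runId}`)
      .on('broadcast', { event: 'step:completed' }, ({ payload }) =>
        setSteps((prev) => [...prev, payload.step_slug]),
      )
      .subscribe();
    return () => void supabase.removeChannel(channel);
  }, [runId]);

  return (
    <div>
      <ul>{steps.map((s) => <li key={s}>{s} ✓</li>)}</ul>
      {messages.map((m) => <div key={m.id}>{/* render message parts */}</div>)}
    </div>
  );
}
```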

Benefits:
- Fast token streaming (no Realtime overhead)
- Long timeouts (300s+, no edge issues)
- Simple architecture (standard patterns)
- Low cost (…/month vs …/month)
- Pgflow benefits preserved (orchestration, observability)

This is the sensible approach. Frontend transport via Realtime should
only be used if you truly need browser→database direct connection
(offline-first, browser extensions, no backend allowed).

For 99% of use cases, this hybrid approach is better.

Based on the user's insight: separate preparation from streaming.

Two-phase approach:
1. Pgflow flow: Multi-step preparation (search, rank, extract)
   - Durable, stored in database
   - Can take 30s+, no timeout issues
   - Frontend shows progress via Realtime events

2. Separate streaming endpoint: Simple LLM proxy
   - Reads preparation output from DB
   - Streams LLM response via standard SSE
   - Fast (28-48ms per token)
   - Can use Edge runtime (25s is fine for proxying)

Flow:
User message → POST /api/prepare (starts pgflow)
              ↓ (wait for completion, show progress)
Context ready → POST /api/stream (proxy with context)
              ↓ (standard SSE streaming)
Response displayed
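
A sketch of this two-phase flow from the client's point of view. The
endpoint paths come from the flow above; the response shapes and the
assumption that /api/prepare resolves once the pgflow run completes are
illustrative:

```ts
// Phase 1: run the pgflow preparation flow. Phase 2: stream from the proxy.
async function askWithPreparation(
  message: string,
  onDelta: (text: string) => void,
): Promise<void> {
  // Kick off preparation and wait for it (progress could instead be shown
  // via Realtime events or polling while this request is in flight).
  const prep = await fetch('/api/prepare', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ message }),
  });
  const { contextId } = await prep.json(); // preparation output stored in DB

  // Stream the LLM answer via standard SSE from the simple proxy endpoint.
  const res = await fetch('/api/stream', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ contextId, message }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    onDelta(decoder.decode(value, { stream: true }));
  }
}
```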

Advantages:
- Clean separation: pgflow = orchestration, stream = proxy
- Fast streaming (no pgflow overhead)
- Reusable context (can regenerate without re-preparing)
- Simple streaming endpoint (easy to test)
- Can cache preparation results
- Flexible frontend (polling or Realtime)

This is the best architecture for Perplexity-style multi-step apps:
- Complex preparation (search/rank/extract)
- Simple streaming (just proxy with context)
- Show step-by-step progress
- May regenerate response multiple times

Much cleaner than mixing pgflow events with SSE in same endpoint.

The best solution, based on the user's vision: wire pgflow step events
directly to AI SDK chunks in a single unified stream.

Key concept: ONE continuous SSE stream that includes:
1. Pgflow step events → AI SDK data chunks (preparation progress)
2. LLM tokens → AI SDK text-delta chunks (final response)

User experience:
User: 'What is quantum computing?'
AI:  [Searching knowledge base...]
     [Found 15 results]
     [Ranking by relevance...]
     [Extracting key information...]
     [Generating response...]
     Quantum computing is a type of...

All in one useChat conversation!

Implementation:
- Single /api/chat endpoint (Node.js, 300s timeout)
- Starts pgflow flow
- Subscribes to step events via Realtime
- Converts pgflow events → SSE data chunks
  - step:started → data-progress (status started)
  - step:completed → data-{stepName} (result) + data-progress (completed)
- When preparation completes, starts LLM streaming
- Streams LLM tokens → text-delta chunks
- Frontend uses standard useChat with onData handler

Event mapping:
Pgflow: { event_type: 'step:completed', step_slug: 'search', output: {...} }
  → SSE: data: {"type":"data-search","data":{...}}
  → useChat onData receives chunk
  → Display progress in UI
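
The mapping above, transcribed into a small helper that turns one pgflow
event into AI SDK SSE data lines (the event shape follows the example; the
helper itself is a sketch):

```ts
interface PgflowEvent {
  event_type: 'step:started' | 'step:completed';
  step_slug: string;
  output?: unknown;
}

// Serialize one AI SDK chunk as an SSE data line.
const sse = (chunk: unknown) => `data: ${JSON.stringify(chunk)}\n\n`;

function pgflowEventToSSE(event: PgflowEvent): string[] {
  if (event.event_type === 'step:started') {
    return [sse({ type: 'data-progress', data: { step: event.step_slug, status: 'started' } })];
  }
  // step:completed emits the step result plus a progress update.
  return [
    sse({ type: `data-${event.step_slug}`, data: event.output }),
    sse({ type: 'data-progress', data: { step: event.step_slug, status: 'completed' } }),
  ];
}

// pgflowEventToSSE({ event_type: 'step:completed', step_slug: 'search', output: {...} })
//   → data: {"type":"data-search","data":{...}}
//   → data: {"type":"data-progress","data":{"step":"search","status":"completed"}}
```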

Advantages:
- Everything in one conversation (unified UX)
- Real-time progress updates (pgflow events)
- Fast LLM streaming (standard SSE, 28-48ms)
- Standard patterns (useChat, no custom transport)
- Pgflow benefits preserved (orchestration, durability)

Perfect for Perplexity-style apps where users want to see:
- What the AI is doing (searching, ranking, analyzing)
- Intermediate results (found X results, selected top Y)
- Final response with sources

This is the architecture to implement for multi-step AI chat with
real-time progress visibility.