@victorm-lc victorm-lc commented Nov 19, 2025

Add with_raw_response support to OpenAI and Anthropic wrappers

Problem

When the with_raw_response API was used to access HTTP headers in the OpenAI and Anthropic clients, the LangSmith wrappers logged the raw response wrapper object instead of the parsed response data. This prevented users from accessing HTTP headers (such as rate limits and request IDs) while keeping proper observability.

Example of the issue:

# Before fix - trace logs wrapper object, not parsed data
raw_response = client.chat.completions.with_raw_response.create(...)
headers = raw_response.headers  # ✅ Works
data = raw_response.parse()     # ✅ Works

# ❌ But LangSmith trace shows: 
# {"output": "<APIResponse [200 OK] type=<class 'openai.types.chat.chat_completion.ChatCompletion'>>"}

Root Cause

The wrappers only wrapped the standard .create() methods, not the .with_raw_response.create() methods. When with_raw_response is used, the SDKs return a wrapper object (APIResponse for OpenAI, LegacyAPIResponse for Anthropic) that needs to be parsed to extract the actual response data.

Solution

  1. Added parsing logic to detect and unwrap response wrapper objects.

  2. Wrapped with_raw_response.create methods:

    • client.chat.completions.with_raw_response.create (OpenAI)
    • client.completions.with_raw_response.create (OpenAI)
    • client.messages.with_raw_response.create (Anthropic)
  3. Updated docstrings with with_raw_response examples

Changes

OpenAI Wrapper (langsmith/wrappers/_openai.py)

def _process_chat_completion(outputs: Any):
    try:
        # Check if outputs is an APIResponse wrapper (from with_raw_response).
        # Call .parse() to extract the ChatCompletion/Completion for tracing.
        # See: https://github.com/openai/openai-python/blob/main/src/openai/_response.py#L285
        if hasattr(outputs, "parse") and callable(outputs.parse):
            try:
                outputs = outputs.parse()
            except Exception:
                pass
        # ... rest of processing

Wrapped methods:

  • client.chat.completions.with_raw_response.create
  • client.completions.with_raw_response.create

Anthropic Wrapper (langsmith/wrappers/_anthropic.py)

def _process_chat_completion(outputs: Any):
    try:
        # Check if outputs is a LegacyAPIResponse wrapper (from with_raw_response).
        # Call .parse() to extract the Message for tracing.
        # See: https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/_legacy_response.py#L102
        if hasattr(outputs, "parse") and callable(outputs.parse):
            try:
                outputs = outputs.parse()
            except Exception:
                pass
        # ... rest of processing

Wrapped methods:

  • client.messages.with_raw_response.create

Testing

New Integration Tests

  • tests/integration_tests/wrappers/test_with_raw_response.py - OpenAI tests
  • tests/integration_tests/wrappers/test_anthropic_with_raw_response.py - Anthropic tests

Both test suites verify:

  1. ✅ HTTP headers are accessible via raw_response.headers
  2. ✅ Response can be parsed via raw_response.parse()
  3. ✅ LangSmith traces capture the parsed response, not the wrapper object
  4. ✅ Both sync and async clients work correctly
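
The shape of those checks can be illustrated with a stand-in object (the real tests call the live APIs; `FakeRawResponse` here is purely illustrative):

```python
class FakeRawResponse:
    """Stand-in for OpenAI's APIResponse / Anthropic's LegacyAPIResponse."""

    def __init__(self, parsed: dict, headers: dict):
        self._parsed = parsed
        self.headers = headers

    def parse(self) -> dict:
        return self._parsed

raw = FakeRawResponse(
    parsed={"content": "Hello"},
    headers={"x-ratelimit-remaining": "99"},
)

# 1. Headers are accessible on the wrapper
assert raw.headers["x-ratelimit-remaining"] == "99"
# 2. The wrapper parses into the response data
assert raw.parse() == {"content": "Hello"}
# 3. The traced value should be the parsed data, not the wrapper
traced = raw.parse() if callable(getattr(raw, "parse", None)) else raw
assert not isinstance(traced, FakeRawResponse)
```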

Test Results

# OpenAI tests
pytest tests/integration_tests/wrappers/test_with_raw_response.py -v
# ✅ 2 passed

pytest tests/integration_tests/wrappers/test_openai.py -v
# ✅ 26 passed (no regressions)

# Anthropic tests
pytest tests/integration_tests/wrappers/test_anthropic_with_raw_response.py -v
# ✅ 2 passed

pytest tests/integration_tests/wrappers/test_anthropic.py -v
# ✅ 10 passed (no regressions)

Usage Example

OpenAI

from langsmith import wrappers
import openai

client = wrappers.wrap_openai(openai.Client())

# Access headers AND get proper tracing
raw_response = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}]
)

# ✅ Access HTTP headers
print(raw_response.headers.get("x-ratelimit-remaining"))

# ✅ Get parsed response
completion = raw_response.parse()
print(completion.choices[0].message.content)

# ✅ LangSmith trace shows parsed completion, not wrapper object

Anthropic

from langsmith import wrappers
import anthropic

client = wrappers.wrap_anthropic(anthropic.Anthropic())

# Access headers AND get proper tracing
raw_response = client.messages.with_raw_response.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1000,
    messages=[{"role": "user", "content": "Hello"}]
)

# ✅ Access HTTP headers
print(raw_response.headers.get("request-id"))

# ✅ Get parsed response
message = raw_response.parse()
print(message.content[0].text)

# ✅ LangSmith trace shows parsed message, not wrapper object

Design Decisions

Minimalism: Only wrapped the primary .create() methods with with_raw_response. Did not wrap:

  • .parse() methods (Structured Outputs - different feature)
  • Rarely-used legacy APIs
  • Other unrelated APIs

This keeps the changes focused on the reported issue without over-engineering.

Breaking Changes

None. This is a backward-compatible enhancement that adds support for a previously unsupported API pattern.

Checklist

  • Added new functionality with tests
  • All existing tests pass
  • Updated docstrings with examples
  • Followed contribution guidelines (make format, make lint, make tests)
  • Validated with real API calls and LangSmith traces

elif hasattr(outputs, "parse") and callable(outputs.parse):
    # Some versions use .parse() method
    try:
        outputs = outputs.parse()
Collaborator:

This is a bit unclear, what versions use .parse()? Older OpenAI SDK versions?

Author:

Added better comments, but it comes from here: github.com/openai/openai-python/blob/main/src/openai/_response.py#L285


# Wrap with_raw_response.create for chat completions
if hasattr(client.chat.completions, "with_raw_response"):
    client.chat.completions.with_raw_response.create = _get_wrapper(  # type: ignore[method-assign]
Collaborator:


Do we need to wrap .parse here too? And what about the responses API?

Author:

I don't think so, will remove now!

@victorm-lc victorm-lc changed the title Fix: Add with_raw_response support to OpenAI wrapper Fix: Add with_raw_response support to OpenAI & Anthropic wrapper Nov 19, 2025
# See: anthropics/anthropic-sdk-python _legacy_response.py#L102
if hasattr(outputs, "parse") and callable(outputs.parse):
    try:
        outputs = outputs.parse()
Collaborator:


One thing I thought of here - I think most often when people use .with_raw_response it's because they want to defer this later for some reason

Won't this consume the body?
