diff --git a/README.md b/README.md
index d954a3c00..d8156215e 100644
--- a/README.md
+++ b/README.md
@@ -2,48 +2,34 @@
Get reliable JSON from any LLM. Built on Pydantic for validation, type safety, and IDE support.
-```python
-import instructor
-from pydantic import BaseModel
-
-
-# Define what you want
-class User(BaseModel):
- name: str
- age: int
-
-
-# Extract it from natural language
-client = instructor.from_provider("openai/gpt-4o-mini")
-user = client.chat.completions.create(
- response_model=User,
- messages=[{"role": "user", "content": "John is 25 years old"}],
-)
-
-print(user) # User(name='John', age=25)
-```
-
-**That's it.** No JSON parsing, no error handling, no retries. Just define a model and get structured data.
-
[PyPI](https://pypi.org/project/instructor/)
[Downloads](https://pypi.org/project/instructor/)
[GitHub](https://github.com/instructor-ai/instructor)
[Discord](https://discord.gg/bD9YE9JArw)
[Twitter](https://twitter.com/jxnlco)
-> **Use Instructor for fast extraction, reach for PydanticAI when you need agents.** Instructor keeps schema-first flows simple and cheap. If your app needs richer agent runs, built-in observability, or shareable traces, try [PydanticAI](https://ai.pydantic.dev/). PydanticAI is the official agent runtime from the Pydantic team, adding typed tools, replayable datasets, evals, and production dashboards while using the same Pydantic models. Dive into the [PydanticAI docs](https://ai.pydantic.dev/) to see how it extends Instructor-style workflows.
+> **Use Instructor for fast extraction, reach for PydanticAI when you need agents.** Instructor keeps schema-first flows simple and cheap. If your app needs richer agent runs, built-in observability, or shareable traces, try [PydanticAI](https://ai.pydantic.dev/). It is the official agent runtime from the Pydantic team and extends your existing Instructor models with typed tools, replayable datasets, evals, and production dashboards.
+
+## Table of Contents
-## Why Instructor?
+- [Overview](#overview)
+- [Installation](#installation)
+- [Quick start](#quick-start)
+- [Feature highlights](#feature-highlights)
+- [Provider coverage](#provider-coverage)
+- [Production-ready patterns](#production-ready-patterns)
+- [Used in production by](#used-in-production-by)
+- [Documentation & resources](#documentation--resources)
+- [Why Instructor over alternatives?](#why-instructor-over-alternatives)
+- [Contributing](#contributing)
+- [License](#license)
+- [Community](#community)
-Getting structured data from LLMs is hard. You need to:
+## Overview
-1. Write complex JSON schemas
-2. Handle validation errors
-3. Retry failed extractions
-4. Parse unstructured responses
-5. Deal with different provider APIs
+Instructor is the leading open-source toolkit for structured LLM outputs. Define a Pydantic model once and reuse it across 15+ providers (OpenAI, Anthropic, Google, Groq, Mistral, Cohere, Vertex AI, Bedrock, Perplexity, Ollama, DeepSeek, and more). The library offers sync and async clients, retries, streaming (`Partial`, `IterableModel`, `Maybe`), validation helpers, moderation hooks, and batching utilities so that every request returns typed data you can trust. With 3M+ monthly downloads, 10k+ GitHub stars, and 1000+ contributors, Instructor is battle-tested in production workloads.
-**Instructor handles all of this with one simple interface:**
+### Without Instructor vs With Instructor
@@ -74,49 +60,134 @@ response = openai.chat.completions.create(
],
)
-# Parse response
tool_call = response.choices[0].message.tool_calls[0]
user_data = json.loads(tool_call.function.arguments)
-# Validate manually
if "name" not in user_data:
- # Handle error...
- pass
+ raise ValueError("Missing name")
```
|
```python
-client = instructor.from_provider("openai/gpt-4")
+import instructor
+from pydantic import BaseModel
+
+class User(BaseModel):
+ name: str
+ age: int
+
+
+client = instructor.from_provider("openai/gpt-4")
user = client.chat.completions.create(
response_model=User,
messages=[{"role": "user", "content": "..."}],
)
-
-# That's it! user is validated and typed
```
|
-## Install in seconds
+## Installation
+
+Instructor targets Python 3.9+.
+
+### Runtime install
+
+```bash
+uv pip install instructor
+```
+
+Install provider extras as needed:
```bash
-pip install instructor
+uv pip install "instructor[anthropic,google,groq]"
```
-Or with your package manager:
+### Project-managed install
+
```bash
uv add instructor
-poetry add instructor
```
-## Works with every major provider
+### Local development
+
+```bash
+uv venv
+source .venv/bin/activate
+uv pip install -e ".[dev,anthropic]"
+```
+
+## Quick start
+
+### Extract validated objects
+
+```python
+import instructor
+from pydantic import BaseModel
+
+
+class Product(BaseModel):
+ name: str
+ price: float
+ in_stock: bool
+
+
+client = instructor.from_provider("openai/gpt-4o-mini")
+product = client.chat.completions.create(
+ response_model=Product,
+ messages=[{"role": "user", "content": "iPhone 15 Pro, $999, available now"}],
+)
+print(product) # Product(name='iPhone 15 Pro', price=999.0, in_stock=True)
+```
+
+Replace `"openai/gpt-4o-mini"` with any supported provider string; the rest of your code stays the same.
+
+### Stream partial objects
+
+```python
+from instructor import Partial
+
+for partial_product in client.chat.completions.create(
+ response_model=Partial[Product],
+ messages=[{"role": "user", "content": "Describe a laptop"}],
+ stream=True,
+):
+ print(partial_product)
+```
+
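+You can also stream a list item by item. A minimal sketch, assuming `create_iterable` is available on the patched client (it is documented for recent instructor versions; adjust if yours differs):
+
+```python
+products = client.chat.completions.create_iterable(
+    response_model=Product,
+    messages=[{"role": "user", "content": "List three consoles with prices"}],
+)
+for product in products:
+    print(product)  # each Product is yielded as soon as it parses
+```
+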
+### Async usage
+
+```python
+import asyncio
+
+async def main() -> None:
+ aclient = instructor.from_provider("openai/gpt-4o-mini", async_client=True)
+ product = await aclient.chat.completions.create(
+ response_model=Product,
+ messages=[{"role": "user", "content": "Steam Deck, $399"}],
+ )
+ print(product)
+
+asyncio.run(main())
+```
+
+## Feature highlights
+
+- **Schema-first development**: Pydantic-powered `BaseModel`s and helpers such as `OpenAISchema`, `generate_openai_schema`, and `generate_anthropic_schema` keep prompts and outputs in sync with your types.
+- **Automatic validation and retries**: Raise errors from `field_validator`s, plug in `llm_validator` or `openai_moderation`, and let Instructor re-ask with the error message until the schema passes.
+- **Streaming and partial results**: Use `Partial`, `IterableModel`, and `Maybe` to stream nested objects or chunked lists as soon as tokens arrive.
+- **Hooks and observability**: Register handlers with `client.on(...)` for events such as `completion:kwargs` and `completion:response` to log, trace, or enforce policy before a response escapes (see the sketch after this list).
+- **Batching and distillation**: Fan out jobs with `BatchProcessor`, `BatchRequest`, and `BatchJob`, or build fine-tuning corpora with `FinetuneFormat` and `Instructions`.
+- **Multimodal inputs**: Attach `Image` and `Audio` payloads to prompts while keeping the same response model API.
+- **Multi-language ecosystem**: Official ports exist for [TypeScript](https://js.useinstructor.com), [Go](https://go.useinstructor.com), [Ruby](https://ruby.useinstructor.com), [Elixir](https://hex.pm/packages/instructor), and [Rust](https://rust.useinstructor.com).
+
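+A minimal hooks sketch, assuming the event names (`completion:kwargs`, `completion:error`) and handler shapes documented for recent instructor versions:
+
+```python
+import instructor
+
+client = instructor.from_provider("openai/gpt-4o-mini")
+
+
+def log_kwargs(*args, **kwargs):
+    # Fires with the exact arguments about to be sent to the provider.
+    print("sending request:", kwargs)
+
+
+def log_error(error):
+    # Fires when a request or a validation attempt raises.
+    print("request failed:", error)
+
+
+client.on("completion:kwargs", log_kwargs)
+client.on("completion:error", log_error)
+```
+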
+## Provider coverage
-Use the same code with any LLM provider:
+Use one call site for every provider:
```python
# OpenAI
@@ -125,29 +196,24 @@ client = instructor.from_provider("openai/gpt-4o")
# Anthropic
client = instructor.from_provider("anthropic/claude-3-5-sonnet")
-# Google
+# Google Gemini
client = instructor.from_provider("google/gemini-pro")
-# Ollama (local)
+# Groq
+client = instructor.from_provider("groq/llama-3.1-8b-instant")
+
+# Ollama or other local runtimes
client = instructor.from_provider("ollama/llama3.2")
-# With API keys directly (no environment variables needed)
+# Provide keys inline when needed
client = instructor.from_provider("openai/gpt-4o", api_key="sk-...")
-client = instructor.from_provider("anthropic/claude-3-5-sonnet", api_key="sk-ant-...")
-client = instructor.from_provider("groq/llama-3.1-8b-instant", api_key="gsk_...")
-
-# All use the same API!
-user = client.chat.completions.create(
- response_model=User,
- messages=[{"role": "user", "content": "..."}],
-)
```
-## Production-ready features
+Instructor also patches native clients (`from_openai`, `from_anthropic`, `from_vertexai`, `from_bedrock`, `from_mistral`, `from_groq`, `from_litellm`, etc.) so you can keep your existing SDK configuration.
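+
+For example, wrapping an already-configured OpenAI SDK client with `from_openai` (a minimal sketch; the key shown is a placeholder):
+
+```python
+import instructor
+from openai import OpenAI
+
+# Keep your existing SDK setup; Instructor adds response_model support.
+client = instructor.from_openai(OpenAI(api_key="sk-..."))
+```
+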
-### Automatic retries
+## Production-ready patterns
-Failed validations are automatically retried with the error message:
+### Automatic retries
+
+When validation fails, Instructor re-asks the model with the validator's error message until the schema passes; tune the attempt count with `max_retries`.
+
```python
from pydantic import BaseModel, field_validator
@@ -157,14 +223,14 @@ class User(BaseModel):
name: str
age: int
- @field_validator('age')
- def validate_age(cls, v):
- if v < 0:
- raise ValueError('Age must be positive')
- return v
+ @field_validator("age")
+ @classmethod
+ def validate_age(cls, value: int) -> int:
+ if value < 0:
+ raise ValueError("Age must be positive")
+ return value
-# Instructor automatically retries when validation fails
user = client.chat.completions.create(
response_model=User,
messages=[{"role": "user", "content": "..."}],
@@ -174,8 +240,6 @@ user = client.chat.completions.create(
### Streaming support
-Stream partial objects as they're generated:
-
```python
from instructor import Partial
@@ -185,17 +249,13 @@ for partial_user in client.chat.completions.create(
stream=True,
):
print(partial_user)
- # User(name=None, age=None)
- # User(name="John", age=None)
- # User(name="John", age=25)
```
### Nested objects
-Extract complex, nested data structures:
-
```python
from typing import List
+from pydantic import BaseModel
class Address(BaseModel):
@@ -204,15 +264,14 @@ class Address(BaseModel):
country: str
-class User(BaseModel):
+class Person(BaseModel):
name: str
age: int
addresses: List[Address]
-# Instructor handles nested objects automatically
-user = client.chat.completions.create(
- response_model=User,
+person = client.chat.completions.create(
+ response_model=Person,
messages=[{"role": "user", "content": "..."}],
)
```
@@ -222,75 +281,62 @@ user = client.chat.completions.create(
Trusted by over 100,000 developers and companies building AI applications:
- **3M+ monthly downloads**
-- **10K+ GitHub stars**
+- **10K+ GitHub stars**
- **1000+ community contributors**
-Companies using Instructor include teams at OpenAI, Google, Microsoft, AWS, and many YC startups.
-
-## Get started
+Teams at OpenAI, Google, Microsoft, AWS, and many YC startups rely on Instructor to keep structured data flows stable.
-### Basic extraction
+## Documentation & resources
-Extract structured data from any text:
-
-```python
-from pydantic import BaseModel
-import instructor
+- [Python docs](https://python.useinstructor.com) – concepts, guides, and API details
+- [Examples gallery](https://python.useinstructor.com/examples/) – copy-paste recipes for common tasks
+- [Provider integrations](https://python.useinstructor.com/integrations/) – setup notes for each LLM vendor
+- [Blog](https://python.useinstructor.com/blog/) – deep dives, tutorials, and release notes
+- [Prompting and validation tips](https://python.useinstructor.com/prompting/) – guidance for schema-first prompts
+- `instructor docs [topic]` – open the docs search from your terminal
-client = instructor.from_provider("openai/gpt-4o-mini")
-
-
-class Product(BaseModel):
- name: str
- price: float
- in_stock: bool
+## Why Instructor over alternatives?
+- **vs Raw JSON mode**: Automatic validation, retries, streaming, and nested schema support with zero manual parsing.
+- **vs LangChain or LlamaIndex**: Instructor focuses on structured extraction, so the API stays light, fast, and easy to debug.
+- **vs Custom glue code**: Thousands of teams have already worked through the edge cases around retries, moderation, schema drift, and provider quirks; Instructor ships those fixes for free.
-product = client.chat.completions.create(
- response_model=Product,
- messages=[{"role": "user", "content": "iPhone 15 Pro, $999, available now"}],
-)
-
-print(product)
-# Product(name='iPhone 15 Pro', price=999.0, in_stock=True)
-```
-
-### Multiple languages
-
-Instructor's simple API is available in many languages:
-
-- [Python](https://python.useinstructor.com) - The original
-- [TypeScript](https://js.useinstructor.com) - Full TypeScript support
-- [Ruby](https://ruby.useinstructor.com) - Ruby implementation
-- [Go](https://go.useinstructor.com) - Go implementation
-- [Elixir](https://hex.pm/packages/instructor) - Elixir implementation
-- [Rust](https://rust.useinstructor.com) - Rust implementation
-
-### Learn more
+## Contributing
-- [Documentation](https://python.useinstructor.com) - Comprehensive guides
-- [Examples](https://python.useinstructor.com/examples/) - Copy-paste recipes
-- [Blog](https://python.useinstructor.com/blog/) - Tutorials and best practices
-- [Discord](https://discord.gg/bD9YE9JArw) - Get help from the community
+We welcome contributions of all sizes.
-## Why use Instructor over alternatives?
+1. Read `AGENT.md` (and `CLAUDE.md` if you use Claude) for repository conventions.
+2. Fork the repo, then create a feature branch from `main` (e.g. `git checkout -b feat/your-feature`).
+3. Set up your environment with `uv`:
-**vs Raw JSON mode**: Instructor provides automatic validation, retries, streaming, and nested object support. No manual schema writing.
+ ```bash
+ uv venv
+ source .venv/bin/activate
+ uv pip install -e ".[dev,anthropic]"
+ uv run ruff check instructor examples tests
+ uv run ruff format instructor examples tests
+ uv run ty check
+ uv run pytest tests/ -k "not llm and not openai"
+ ```
-**vs LangChain/LlamaIndex**: Instructor is focused on one thing - structured extraction. It's lighter, faster, and easier to debug.
+4. Keep commits focused, follow the `type(scope): subject` conventional commit format, and open a PR using the templates in `.github/`.
+5. For large features, discuss your plan in an issue or on [Discord](https://discord.gg/bD9YE9JArw) before writing a lot of code.
-**vs Custom solutions**: Battle-tested by thousands of developers. Handles edge cases you haven't thought of yet.
+Check the [good first issues](https://github.com/instructor-ai/instructor/labels/good%20first%20issue) label if you want a scoped task.
-## Contributing
+## License
-We welcome contributions! Check out our [good first issues](https://github.com/instructor-ai/instructor/labels/good%20first%20issue) to get started.
+Instructor is released under the [MIT License](https://github.com/instructor-ai/instructor/blob/main/LICENSE).
-## License
+## Community
-MIT License - see [LICENSE](https://github.com/instructor-ai/instructor/blob/main/LICENSE) for details.
+- Join the [Discord server](https://discord.gg/bD9YE9JArw) for help, office hours, and release notes.
+- Follow [@jxnlco on Twitter](https://twitter.com/jxnlco) for project updates.
+- Watch or star the [GitHub repo](https://github.com/instructor-ai/instructor) to get notified about new releases.
+- Share what you build: tag your tutorials, demos, or talks so we can highlight them on the blog.
---
-Built by the Instructor community. Special thanks to Jason Liu and all contributors.
+Built by the Instructor community. Special thanks to Jason Liu and all contributors.
\ No newline at end of file