Description
Tip
tldr²: Before passing a tool's entire unfiltered, unprocessed output to the LLM, I (or the LLM itself) might want to preprocess it.
Tip
TL;DR: Consider the tool multiply(a: int, b: int).
- Problem: A tool that returns only text (e.g., "Product of 5 and 3 is 15") makes it hard, and therefore fragile or expensive, to write tool wrappers that process the tool's output further before passing it to the LLM.
- Proposal: Mandate structured results (e.g., {"product": 15}) for reliable and efficient processing.
- Ideal: Provide both the required structured data and an optional text_summary (e.g., "Product of 5 and 3 is 15") for flexibility.
This proposal is part of a series. Check out the rest here.
Introduction
First off, thanks for the excellent work on the 12-Factor Agents guide! It provides a really valuable set of principles for building robust and maintainable agentic systems.
I wanted to raise a point for discussion regarding the output format of the tools themselves (i.e., the deterministic code that gets executed based on the LLM's structured output).
Context
- Factor 4 ("Tools are just structured outputs") clearly and correctly emphasizes that the LLM's output (the decision to call a tool with specific parameters) should be structured data (e.g., JSON).
- Factor 3 ("Own your context window") demonstrates building context from events, and the examples implicitly show tool results being added back as structured data (e.g., <list_git_tags_result> containing structured YAML/JSON).
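To make the Factor 3 point concrete, here is a minimal sketch of how a structured tool result might be rendered into a context-window entry. The function name render_tool_result, the closing-tag convention, and the choice of JSON over YAML are my assumptions for illustration; the guide's own examples may format this differently.

```python
import json
from typing import Any


def render_tool_result(tool_name: str, structured_result: dict[str, Any]) -> str:
    """Wrap a structured tool result in a <tool_name_result> tag for the context window.

    Hypothetical helper: the tag shape is modeled on <list_git_tags_result>,
    and JSON is used here only because it ships with the standard library.
    """
    body = json.dumps(structured_result, indent=2, sort_keys=True)
    return f"<{tool_name}_result>\n{body}\n</{tool_name}_result>"


if __name__ == "__main__":
    print(render_tool_result("list_git_tags", {"tags": ["v1.0.0", "v1.1.0"]}))
```

Because the input is already structured, the consumer chooses the serialization; nothing has to be parsed out of prose first.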
The Gap / Suggestion
While the use of structured results is implied in Factor 3, the guide doesn't seem to explicitly state that tools should fundamentally return structured data as their primary, canonical result. It could also clarify the role of optional, pre-formatted human-readable strings.
I propose that the guide explicitly recommend:
- Tool results must include a structured data component (e.g., a JSON object/dict).
- Tools may optionally include a pre-formatted, human-readable string summary alongside the structured data.
This ensures robustness while still allowing for convenience where appropriate.
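As a sketch of what this two-part recommendation could look like in code, assuming a Python agent: the class name ToolResult and the to_dict helper are hypothetical, and the field names are borrowed from the examples later in this issue.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class ToolResult:
    """Hypothetical result envelope: structured data is required, text is optional."""

    structured_result: dict[str, Any]   # mandatory, canonical data
    text_summary: Optional[str] = None  # optional, human-readable convenience

    def to_dict(self) -> dict[str, Any]:
        """Serialize, omitting text_summary when the tool did not provide one."""
        out: dict[str, Any] = {"structured_result": self.structured_result}
        if self.text_summary is not None:
            out["text_summary"] = self.text_summary
        return out


if __name__ == "__main__":
    print(ToolResult({"product": 15}).to_dict())
    print(ToolResult({"product": 15}, "Product of 5 and 3 is 15").to_dict())
```

Making structured_result a required positional field means a tool cannot construct a result with only a string, which is the "must" half of the proposal; the default of None covers the "may" half.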
Reasoning:
- Entropy Principle: It's always easier to increase entropy (format structured data into a human-readable string) than to decrease it (reliably parse a string back into structured data). Mandating the structured base ensures the low-entropy, reliable data is always available. If a tool only returns "Product of 5 and 3 is 15", parsing is brittle. If it must return at least {"structured_result": {"product": 15}}, any consumer can easily extract the value. The optional string is then just a bonus.
- Robustness & Flexibility: Structured results are easier and more reliable for the agent's control flow logic (Factor 8) or subsequent LLM reasoning steps to consume. The consumer decides how to use/present the data. An optional pre-formatted string can be provided for convenience (e.g., direct LLM consumption in simple cases or basic logging), but the structured data remains the canonical result.
- Alignment with Factor 3 (Own Your Context Window): Manipulating and reasoning over the context history is much cleaner and more reliable when the canonical results of tool calls within that history are structured and predictable. Parsing strings within the context adds unnecessary complexity and fragility.
- Alignment with Factor 9 (Compact Errors): Standardizing on structured results naturally extends to standardizing structured error results from tools, making error handling and recovery more robust.
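The asymmetry behind the entropy argument can be sketched in a few lines. Both function names and the regular expression are hypothetical, written only to illustrate the point: going from structured data to text is a single format call, while going back requires a pattern that silently stops matching when the wording changes.

```python
import re
from typing import Optional


def format_summary(a: int, b: int, structured_result: dict) -> str:
    """Structured -> text: one format call, cannot fail on wording."""
    return f"Product of {a} and {b} is {structured_result['product']}"


# Text -> structured: tied to one exact phrasing of the summary.
_PRODUCT_RE = re.compile(r"^Product of (-?\d+) and (-?\d+) is (-?\d+)$")


def parse_product(text: str) -> Optional[int]:
    """Recover the product from a summary string, or None if the wording differs."""
    match = _PRODUCT_RE.match(text)
    return int(match.group(3)) if match else None


if __name__ == "__main__":
    print(format_summary(5, 3, {"product": 15}))
    print(parse_product("Product of 5 and 3 is 15"))
    print(parse_product("The product of 5 and 3 is 15"))  # reworded: no match
```

Note that this issue itself uses both phrasings ("Product of ..." and "The product of ..."), which is exactly the kind of drift a string parser has to survive.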
Examples:
Consider a simple multiply tool:
- Tool Call Triggered (Factor 4): LLM outputs {"intent": "multiply", "a": 5, "b": 3}
- Less Ideal Result (String Only): Tool returns "The product of 5 and 3 is 15"
  - Problem: Framework/LLM must parse this string to get the numerical result reliably. Fragile. Violates the proposed principle.
- Good Result (Structured Only): Tool returns {"structured_result": {"product": 15}}
  - Benefit: Trivial to extract .structured_result.product. Reliable. Complies with the mandatory part.
- Ideal Result (Structured + Optional Text): Tool returns {"structured_result": {"product": 15}, "text_summary": "The product of 5 and 3 is 15"}
  - Benefit: Provides the reliable structured data for programmatic use (Factor 3, Factor 8) and a convenient pre-formatted string for simple LLM injection or logging, without sacrificing robustness. Fully compliant and flexible.
(This mirrors the distinction seen sometimes in MCP tools, where structured output is preferable to pre-formatted strings.)
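Putting the multiply example end to end, a minimal sketch might look like the following. The dispatch and to_context_entry helpers and the shape of the "error" object are assumptions of mine, not something the guide prescribes; the error branch is included only to show how structured errors (Factor 9) could fall out of the same convention.

```python
import json
from typing import Any


def multiply(a: Any, b: Any) -> dict[str, Any]:
    """Tool: returns structured data plus an optional text summary."""
    if not (isinstance(a, int) and isinstance(b, int)):
        # Hypothetical structured error shape.
        return {"error": {"type": "invalid_arguments",
                          "message": "a and b must be integers"}}
    product = a * b
    return {"structured_result": {"product": product},
            "text_summary": f"The product of {a} and {b} is {product}"}


def dispatch(call: dict[str, Any]) -> dict[str, Any]:
    """Route the LLM's structured output (Factor 4) to deterministic code."""
    if call.get("intent") == "multiply":
        return multiply(call.get("a"), call.get("b"))
    return {"error": {"type": "unknown_intent",
                      "message": f"no tool for {call.get('intent')!r}"}}


def to_context_entry(result: dict[str, Any]) -> str:
    """Wrapper: decide what the LLM sees, without parsing any prose."""
    if "error" in result:
        return json.dumps({"error": result["error"]})
    return result.get("text_summary") or json.dumps(result["structured_result"])


if __name__ == "__main__":
    result = dispatch({"intent": "multiply", "a": 5, "b": 3})
    print(result["structured_result"]["product"])
    print(to_context_entry(result))
```

The wrapper reads result["structured_result"]["product"] directly when it needs the number, and falls back to the text summary only as a presentation choice.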
Proposed Action:
Perhaps this could be addressed by:
- Adding a clarifying sentence or paragraph to Factor 3 or Factor 4 emphasizing that the fundamental result should be structured.
- Adding a brief new factor (maybe Factor 4b or an appendix note) specifically addressing the requirement for structured tool results and the role of optional human-readable summaries (similar to the purpose of llms.txt).
Explicitly advocating for mandatory structured tool results (with optional text summaries) seems like a natural extension of the principles already laid out and would help developers avoid the pitfalls of string-only results when building their agent's tools.
Would love to hear your thoughts on this!
Thanks again for the great resource.