Stateful Tools
How tools accept configuration that the AI agent never sees.
The two kinds of arguments
Every tool function has two kinds of arguments:
Runtime fields
What the AI agent fills in when calling the tool. Visible to the LLM and shown in the MCP schema. Examples: star_name, relative_path, planet_params.
State fields
Configuration injected by the toolkit. Invisible to the LLM and to MCP. Examples: base_directory, opacity_path, api_key.
Why bother distinguishing? Because the agent shouldn't have to know about base_directory. That's a property of how the toolkit is installed, not something the agent should reason about. State fields keep configuration out of the agent's context window.
Declaring state fields
Pass state=[...] to @define_tool. Names in the list must match real parameters of your function:
```python
from orchestral import define_tool
from pathlib import Path
import json

@define_tool(state=["base_directory"])
def write_file(
    relative_path: str,
    content: str,
    base_directory: Path,
) -> str:
    """Write a file to the workspace."""
    full_path = base_directory / relative_path
    full_path.parent.mkdir(parents=True, exist_ok=True)
    full_path.write_text(content)
    return json.dumps({"written": str(full_path)})
```

From the agent's perspective, the schema is just:
```json
{
  "name": "write_file",
  "parameters": {
    "relative_path": { "type": "string" },
    "content": { "type": "string" }
  }
}
```

`base_directory` is filled in by SciToolkit at serve time, from the toolkit's stored configuration.
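The hiding can be pictured with plain Python: pre-binding a parameter removes it from the set a schema generator would expose. This is a sketch of the idea, not SciToolkit's implementation:

```python
import inspect
from functools import partial
from pathlib import Path

def write_file(relative_path: str, content: str, base_directory: Path) -> str:
    # Toy body standing in for the real tool.
    return str(base_directory / relative_path)

# "Serve time": inject the state field once.
bound = partial(write_file, base_directory=Path("/workspace"))

# The runtime fields are whatever remains unbound; a schema generator
# would expose only these to the agent.
runtime_fields = [
    name for name in inspect.signature(write_file).parameters
    if name not in bound.keywords
]
print(runtime_fields)  # ['relative_path', 'content']
```

The agent supplies only the two remaining fields; the bound `base_directory` never appears in the schema.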
How the value gets there
- User installs the toolkit. SciToolkit reads `toolkit.yaml`, collects values declared under `config`, and runs `setup.py` if present.
- All collected values are stored in `~/.scitoolkit/toolkits/<name>/.env`.
- User runs `scitoolkit serve`. SciToolkit loads each toolkit's `.env`.
- For each tool with state fields, SciToolkit injects matching values from the config dict. The injection happens once, at startup, on a copy of the tool, so different toolkits with overlapping field names don't collide.
- The MCP server registers the tool. The agent sees only the runtime fields.
- When the agent calls the tool, the state fields are already populated. Your tool function is called with both runtime args (from the agent) and state args (from config) merged into its signature.
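The pipeline above can be sketched in a few lines. The `.env` parsing and the `load_env` helper are illustrative, not SciToolkit's actual API:

```python
from functools import partial

def load_env(text: str) -> dict:
    """Parse KEY=value lines (a minimal .env reader) into a config dict."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
    return config

# Stand-in for ~/.scitoolkit/toolkits/<name>/.env
config = load_env("base_directory=/workspace\nmax_workers=4\n")

def write_file(relative_path, content, base_directory):
    # Toy tool body.
    return f"{base_directory}/{relative_path}"

# Injection: pre-bind the declared state fields from config, once.
state_fields = ["base_directory"]
bound = partial(write_file, **{k: config[k] for k in state_fields})

# The agent's call supplies only the runtime args.
print(bound("out.txt", "hello"))  # /workspace/out.txt
```

The key property is that binding happens once at startup; every subsequent call merges the agent's runtime args with the pre-bound state args.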
Connecting state fields to config
State field names must match keys in your toolkit's config. So if you declare:
```yaml
# toolkit.yaml
config:
  base_directory:
    type: path
    description: "Workspace for outputs"
    default: "~/my-toolkit-workspace"
```

…then your tool can use it:
```python
@define_tool(state=["base_directory"])
def write_file(relative_path: str, content: str, base_directory: Path) -> str:
    ...
```

The names must match exactly. SciToolkit looks up `base_directory` in the toolkit's config dict and injects it.
Multiple state fields
List as many as you need. They'll all be injected:
```python
@define_tool(state=["opacity_path", "base_directory", "max_workers"])
def run_forward_model(
    planet_params: dict,
    opacity_path: Path,
    base_directory: Path,
    max_workers: int,
) -> str:
    """
    Run an exoplanet atmosphere forward model.

    Args:
        planet_params: Planet parameters from the agent.
    """
    # opacity_path, base_directory, max_workers are filled in by SciToolkit
    result = forward_model(
        planet_params,
        opacity_path=opacity_path,
        output_dir=base_directory,
        n_workers=max_workers,
    )
    return json.dumps(result)
```

What if a state field is missing?
If you declare a state field that has no default value (no = in the function signature) and the toolkit's config doesn't provide a value, SciToolkit refuses to register the tool — it fails fast at serve startup with a clear error pointing to the missing config.
To make a state field optional, give it a Python default:
```python
@define_tool(state=["max_workers"])
def my_tool(query: str, max_workers: int = 4) -> str:
    """Run a query, optionally in parallel."""
    ...
```

Now if `max_workers` isn't in config, the tool runs with the default value of 4.
When you need more than this
A state field is a Python value injected once at serve startup. The decorator approach has one significant constraint: the values are deep-copied when each tool is bound to an agent. That works for paths, integers, dicts, and simple objects, but not for things like:
- Open file handles
- Database connections
- Locks, queues, threads
- Anything not pickleable
For those cases, use Orchestral's class-based BaseTool pattern, which lets you instantiate the tool with whatever resources you need and bind them as instance attributes. See the Orchestral documentation for details.
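For intuition, the class-based shape looks roughly like this. `QueryTool` is illustrative only; Orchestral's actual `BaseTool` interface is documented separately:

```python
import sqlite3

class QueryTool:
    """Holds a live database connection as an instance attribute —
    exactly the kind of resource a deep-copied state field can't carry."""

    def __init__(self, db_path: str):
        self.conn = sqlite3.connect(db_path)  # not pickleable

    def __call__(self, sql: str) -> list:
        return self.conn.execute(sql).fetchall()

tool = QueryTool(":memory:")
tool.conn.execute("CREATE TABLE t (x INTEGER)")
tool.conn.execute("INSERT INTO t VALUES (1), (2)")
print(tool("SELECT x FROM t ORDER BY x"))  # [(1,), (2,)]
```

The connection is opened once when the tool is instantiated and shared across calls, rather than being copied per binding.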
A real example: ASTER
Here's a typical ASTER tool. The agent calls it with planet parameters; opacity path, workspace, and worker count are all managed by the toolkit:
```python
from orchestral import define_tool
from pathlib import Path
import json

@define_tool(state=["opacity_path", "base_directory", "max_workers"])
def transit_spectrum(
    planet_params: dict,
    opacity_path: Path,
    base_directory: Path,
    max_workers: int = 4,
) -> str:
    """
    Compute a transit transmission spectrum.

    Args:
        planet_params: Dict with keys: radius, mass, period, T_eq.
            Provided by the agent at call time.
    """
    # The agent only knows about planet_params.
    # opacity_path, base_directory, max_workers come from toolkit config,
    # set once at install time, not per-call.
    spectrum = compute_spectrum(
        planet_params,
        opacities=load_opacities(opacity_path),
        n_workers=max_workers,
    )
    output = base_directory / "spectra" / f"{planet_params['name']}.npz"
    spectrum.save(output)
    return json.dumps({"spectrum_file": str(output), "wavelengths_nm": len(spectrum)})
```

This tool is just as easy for the agent to use as a stateless tool. The complexity of opacity files, parallelism, and output paths is invisible.
Next
See Recipes for copy-paste examples, including the full ASTER setup as a reference.