Stateful Tools

How tools accept configuration that the AI agent never sees.

The two kinds of arguments

Every tool function has two kinds of arguments:

Runtime fields

What the AI agent fills in when calling the tool. Visible to the LLM and shown in the MCP schema. Examples: star_name, relative_path, planet_params.

State fields

Configuration injected by the toolkit. Invisible to the LLM and to MCP. Examples: base_directory, opacity_path, api_key.

Why bother distinguishing? Because the agent shouldn't have to know about base_directory. That's a property of how the toolkit is installed, not something the agent should reason about. State fields keep configuration out of the agent's context window.

Declaring state fields

Pass state=[...] to @define_tool. Names in the list must match real parameters of your function:

from orchestral import define_tool
from pathlib import Path
import json

@define_tool(state=["base_directory"])
def write_file(
    relative_path: str,
    content: str,
    base_directory: Path,
) -> str:
    """Write a file to the workspace."""
    full_path = base_directory / relative_path
    full_path.parent.mkdir(parents=True, exist_ok=True)
    full_path.write_text(content)
    return json.dumps({"written": str(full_path)})

From the agent's perspective, the schema is just:

{
  "name": "write_file",
  "parameters": {
    "relative_path": { "type": "string" },
    "content":       { "type": "string" }
  }
}

base_directory is filled in by SciToolkit when scitoolkit serve starts, from the toolkit's stored configuration.
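One way to picture how the agent-facing schema is derived: walk the function signature and skip every name listed in state. The sketch below uses only the standard library and is an illustration, not Orchestral's or SciToolkit's actual implementation. agent_schema and JSON_TYPES are hypothetical names, and the annotation-to-type mapping is an assumption.

```python
import inspect
import json
from pathlib import Path

# Hypothetical mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {
    str: "string",
    int: "integer",
    float: "number",
    bool: "boolean",
    dict: "object",
    list: "array",
}


def agent_schema(func, state):
    """Build an agent-facing schema from func's signature, skipping state fields."""
    parameters = {}
    for name, param in inspect.signature(func).parameters.items():
        if name in state:
            continue  # state fields are never shown to the agent
        # Unknown or missing annotations fall back to "string" (an assumption).
        parameters[name] = {"type": JSON_TYPES.get(param.annotation, "string")}
    return {"name": func.__name__, "parameters": parameters}


def write_file(relative_path: str, content: str, base_directory: Path) -> str:
    """Stand-in with the same signature as the tool above; the body is irrelevant here."""
    return str(base_directory / relative_path)


schema = agent_schema(write_file, state=["base_directory"])
print(json.dumps(schema, indent=2))
```

Only relative_path and content appear in the printed schema; base_directory is dropped because it is listed in state.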

How the value gets there

  1. User installs the toolkit. SciToolkit reads toolkit.yaml, collects values declared under config, and runs setup.py if present.
  2. All collected values are stored in ~/.scitoolkit/toolkits/<name>/.env.
  3. User runs scitoolkit serve. SciToolkit loads each toolkit's .env into a config dict.
  4. For each tool with state fields, SciToolkit injects the matching values from that toolkit's config dict. Injection happens once, at startup, on a copy of the tool, so toolkits with overlapping field names don't collide.
  5. The MCP server registers the tool. The agent sees only the runtime fields.
  6. When the agent calls the tool, the state fields are already populated. Your function is called with the runtime args (from the agent) and the state args (from config) combined into a single call.
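The steps above can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not SciToolkit's code: parse_env and bind_state are hypothetical helpers, the .env content is assumed to be simple KEY=value lines whose keys match the state field names, and functools.partial stands in for "inject once on a copy of the tool".

```python
import functools


def parse_env(text):
    """Parse KEY=value lines into a dict, skipping blank lines and # comments."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config


def bind_state(func, state, config):
    """Return a new callable with the state fields pre-filled from config.

    The original function is left untouched, so two toolkits can bind the
    same field name to different values.
    """
    bound = {name: config[name] for name in state if name in config}
    return functools.partial(func, **bound)


def greet(name: str, greeting: str) -> str:
    """Toy tool: 'name' is a runtime field, 'greeting' is a state field."""
    return f"{greeting}, {name}!"


# Steps 2-3: stored config is loaded once at startup.
config = parse_env("# stored at install time\ngreeting=Hello\n")

# Step 4: state is injected once, producing a bound copy.
tool = bind_state(greet, state=["greeting"], config=config)

# Step 6: the agent supplies only the runtime field.
print(tool(name="Ada"))  # Hello, Ada!
```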

Connecting state fields to config

State field names must match keys in your toolkit's config. So if you declare:

# toolkit.yaml
config:
  base_directory:
    type: path
    description: "Workspace for outputs"
    default: "~/my-toolkit-workspace"

…then your tool can use it:

@define_tool(state=["base_directory"])
def write_file(relative_path: str, content: str, base_directory: Path) -> str:
    ...

The names must match exactly. SciToolkit looks up base_directory in the toolkit's config dict and injects it.
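Because the match is exact, a small typo leaves a state field without a value. A lint-style check during development can catch near misses. The helper below is a hypothetical sketch (check_state_names is not part of SciToolkit) and uses difflib from the standard library to suggest the closest config key.

```python
import difflib


def check_state_names(state, config_keys):
    """Return a list of problems; an empty list means every name has a config key."""
    problems = []
    for name in state:
        if name in config_keys:
            continue
        # Suggest the closest config key, if any is similar enough.
        close = difflib.get_close_matches(name, config_keys, n=1)
        hint = f" (did you mean '{close[0]}'?)" if close else ""
        problems.append(f"state field '{name}' has no config key{hint}")
    return problems


problems = check_state_names(
    state=["base_dir", "opacity_path"],
    config_keys=["base_directory", "opacity_path"],
)
for problem in problems:
    print(problem)
```

Here base_dir is flagged with a suggestion of base_directory, while opacity_path passes. A state field that has a Python default may legitimately be absent from config, so treat the output as a warning rather than an error.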

Multiple state fields

List as many as you need. They'll all be injected:

@define_tool(state=["opacity_path", "base_directory", "max_workers"])
def run_forward_model(
    planet_params: dict,
    opacity_path: Path,
    base_directory: Path,
    max_workers: int,
) -> str:
    """
    Run an exoplanet atmosphere forward model.

    Args:
        planet_params: Planet parameters from the agent.
    """
    # opacity_path, base_directory, max_workers are filled in by SciToolkit.
    # forward_model is assumed to be defined elsewhere in the toolkit.
    result = forward_model(
        planet_params,
        opacity_path=opacity_path,
        output_dir=base_directory,
        n_workers=max_workers,
    )
    return json.dumps(result)

What if a state field is missing?

If a declared state field has no default in the function signature and the toolkit's config provides no value for it, SciToolkit refuses to register the tool: scitoolkit serve fails fast at startup with a clear error pointing to the missing config.

To make a state field optional, give it a Python default:

@define_tool(state=["max_workers"])
def my_tool(query: str, max_workers: int = 4) -> str:
    """Run a query, optionally in parallel."""
    ...

Now if max_workers isn't in config, the tool runs with the default value of 4.
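The resolution rule described here (config value first, then the Python default, otherwise fail) can be sketched with inspect. This illustrates the rule and is not SciToolkit's implementation; resolve_state, MissingStateError, and strict_tool are hypothetical names.

```python
import inspect


class MissingStateError(Exception):
    """Raised when a required state field has no config value (hypothetical)."""


def resolve_state(func, state, config):
    """Return {name: value} for each state field, or raise if one is required but missing."""
    params = inspect.signature(func).parameters
    resolved, missing = {}, []
    for name in state:
        if name in config:
            resolved[name] = config[name]           # config wins
        elif params[name].default is not inspect.Parameter.empty:
            resolved[name] = params[name].default   # fall back to the Python default
        else:
            missing.append(name)                    # required, but nothing provides it
    if missing:
        raise MissingStateError(f"{func.__name__}: missing config for {missing}")
    return resolved


def my_tool(query: str, max_workers: int = 4) -> str:
    return f"{query} with {max_workers} workers"


def strict_tool(query: str, api_key: str) -> str:
    return query


print(resolve_state(my_tool, ["max_workers"], config={}))  # {'max_workers': 4}

try:
    resolve_state(strict_tool, ["api_key"], config={})
except MissingStateError as exc:
    print(exc)  # strict_tool: missing config for ['api_key']
```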

When you need more than this

A state field is a plain Python value injected once at serve startup. The decorator approach has one significant constraint: values are deep-copied when each tool is bound to an agent. That works for paths, integers, dicts, and other simple objects, but not for things like:

  • Open file handles
  • Database connections
  • Locks, queues, threads
  • Anything else that can't be deep-copied or pickled

For those cases, use Orchestral's class-based BaseTool pattern, which lets you instantiate the tool with whatever resources you need and bind them as instance attributes. See the Orchestral documentation for details.
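To check whether a value is safe to use as a state field, try deep-copying it yourself. The snippet below is a minimal sketch using only the standard library; can_deep_copy is a hypothetical helper, not part of Orchestral or SciToolkit.

```python
import copy
import threading
from pathlib import Path


def can_deep_copy(value):
    """Return True if copy.deepcopy(value) succeeds, False if it raises."""
    try:
        copy.deepcopy(value)
    except Exception:
        return False
    return True


print(can_deep_copy(Path("workspace")))    # True
print(can_deep_copy({"max_workers": 4}))   # True
print(can_deep_copy(threading.Lock()))     # False: locks cannot be copied
```

Values that return False here are the ones that belong in the class-based pattern instead.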

A real example: ASTER

Here's a typical ASTER tool. The agent calls it with planet parameters; opacity path, workspace, and worker count are all managed by the toolkit:

from orchestral import define_tool
from pathlib import Path
import json

@define_tool(state=["opacity_path", "base_directory", "max_workers"])
def transit_spectrum(
    planet_params: dict,
    opacity_path: Path,
    base_directory: Path,
    max_workers: int = 4,
) -> str:
    """
    Compute a transit transmission spectrum.

    Args:
        planet_params: Dict with keys: name, radius, mass, period, T_eq.
                       Provided by the agent at call time.
    """
    # The agent only knows about planet_params.
    # opacity_path, base_directory, max_workers come from toolkit config:
    # set once at install time, not per-call.
    # compute_spectrum and load_opacities are assumed to be defined elsewhere in ASTER.
    spectrum = compute_spectrum(
        planet_params,
        opacities=load_opacities(opacity_path),
        n_workers=max_workers,
    )
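    output = base_directory / "spectra" / f"{planet_params['name']}.npz"
    # Make sure the output directory exists before saving, as write_file does.
    output.parent.mkdir(parents=True, exist_ok=True)
    spectrum.save(output)
    return json.dumps({"spectrum_file": str(output), "n_wavelengths": len(spectrum)})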

This tool is just as easy for the agent to use as a stateless tool. The complexity of opacity files, parallelism, and output paths is invisible.

Next

See Recipes for copy-paste examples, including the full ASTER setup as a reference.