Agent Experiments

Agent experiments are project-local workspaces for outer-loop experimentation. They let an agent create parameter variants, isolated file variants, trial runs, comparisons, and workflow condition checks without changing BioWorld simulation semantics.

Experiment state is stored under the active lab root:

<lab-root>/.biosimulant/
  experiments/
    <experiment_id>/
      experiment.json
      events.jsonl
      variants/
      trials/
      artifacts/

The .biosimulant/ folder is local metadata. It is excluded from lab packaging, package hashing, and remote staging, so portable lab packages continue to contain only the actual lab.
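
As a rough illustration of the exclusion rule, a packaging step could walk the lab and skip anything under the metadata folder. This is only a sketch: the function name iter_packaged_files is hypothetical, and the real packager may apply further ignore rules.

```python
from pathlib import Path
from typing import Iterator

# Folder name taken from the layout above.
METADATA_DIR = ".biosimulant"


def iter_packaged_files(lab_root: Path) -> Iterator[Path]:
    """Yield lab files relative to lab_root, skipping local experiment metadata.

    Hypothetical helper: the actual packaging code is not shown in this document.
    """
    for path in sorted(lab_root.rglob("*")):
        rel = path.relative_to(lab_root)
        # Anything whose first path component is .biosimulant stays local.
        if rel.parts[0] == METADATA_DIR:
            continue
        if path.is_file():
            yield rel
```

Because hashing and remote staging would iterate the same filtered list, experiment state never influences the package hash in this sketch.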

Files

experiment.json stores the current experiment record: objective, hypothesis, description, rationale, budget, allowed actions, workflow, status, base lab revision, and timestamps. The objective is the measurable goal; the hypothesis, description, and rationale explain why this experiment exists and how it differs from nearby attempts.

events.jsonl is an append-only audit log. It records experiment creation, variant creation, trial creation, trial refreshes, and stop events. If experiment.json is missing, the desktop can rebuild it from the event log.
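
The append-and-replay idea can be sketched as follows. The event type names and payload fields (experiment_created, variant_id, and so on) are assumptions made for illustration; the actual event schema is not documented here.

```python
import json
from pathlib import Path
from typing import Any


def append_event(log_path: Path, event_type: str, payload: dict[str, Any]) -> None:
    """Append one JSON object per line; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"type": event_type, "payload": payload}) + "\n")


def rebuild_experiment(log_path: Path) -> dict[str, Any]:
    """Replay the log in order to reconstruct a minimal experiment record.

    Event names and fields below are hypothetical placeholders.
    """
    record: dict[str, Any] = {"status": "unknown", "variants": [], "trials": []}
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue  # tolerate blank lines
            event = json.loads(line)
            kind, payload = event["type"], event["payload"]
            if kind == "experiment_created":
                record.update(payload)
                record["status"] = "active"
            elif kind == "variant_created":
                record["variants"].append(payload["variant_id"])
            elif kind == "trial_created":
                record["trials"].append(payload["trial_id"])
            elif kind == "experiment_stopped":
                record["status"] = "stopped"
    return record
```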

variants/<variant_id>/variant.json stores a baseline, parameter, or workspace variant.

trials/<trial_id>.json stores the run spec, local run id, remote run id, execution target, status, derived metrics, and errors.

artifacts/ is reserved for derived summaries, reports, plots, or comparison outputs.

Every manual or agent-created experiment should carry a clear hypothesis, description, or rationale. The experiment id is only an address; it should not be the only thing that distinguishes two experiment folders.

Variant Types

baseline variants capture the current lab revision and run the lab unchanged.

parameter variants store overlays for runtime settings, input values, or module parameters. They do not copy files and are the default for tuning.
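
One plausible way to apply such an overlay is a recursive merge over the base configuration, leaving the base untouched. The function name apply_overlay and the nested-dict shape are assumptions for this sketch, not the documented storage format.

```python
import copy
from typing import Any


def apply_overlay(base: dict[str, Any], overlay: dict[str, Any]) -> dict[str, Any]:
    """Return a new config with overlay values merged over base.

    Hypothetical helper: nested dicts are merged key by key, and any other
    value in the overlay replaces the base value. The base is never mutated.
    """
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

Since only the overlay is stored, many parameter variants can share one lab revision without duplicating files.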

workspace variants copy the lab into the experiment folder. The agent edits only that isolated copy. A winning workspace variant must be promoted through the normal reviewable proposal flow before it touches the real lab.

Trial Runs

Trials use the existing desktop run infrastructure. Local trials create normal local run rows. Remote trials stage the selected lab or isolated workspace and create normal Hub remote runs with deterministic client request IDs.
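
This document does not say how the client request IDs are derived. One common approach, shown here purely as an assumption, is to hash a canonical JSON form of the trial identity so that retrying the same trial yields the same ID.

```python
import hashlib
import json
from typing import Any


def client_request_id(experiment_id: str, trial_id: str, run_spec: dict[str, Any]) -> str:
    """Derive a stable ID from the trial identity and run spec.

    Hypothetical scheme: the real derivation is not specified in this document.
    """
    canonical = json.dumps(
        {"experiment_id": experiment_id, "trial_id": trial_id, "run_spec": run_spec},
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace variation
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]
```

With a deterministic ID, a retried submission can be recognized as a duplicate on the receiving side rather than creating a second remote run.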

The experiment service refreshes trial files from the run database. Completed or failed runs update the matching trial status, error field, and derived metrics. The experiment folder remains the inspectable source of truth for the agent workflow, while the run database remains the source of truth for normal run execution.
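
The refresh step can be pictured as a one-way copy from a run row into the trial record. The field names (status, error, metrics, derived_metrics) and the status strings are assumptions for this sketch.

```python
from typing import Any, Optional

# Assumed status strings for runs that have finished.
TERMINAL_STATUSES = {"completed", "failed"}


def refresh_trial(trial: dict[str, Any], run_row: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Return an updated trial record if the matching run has finished.

    Hypothetical helper: data flows only from the run row to the trial,
    so the run database stays authoritative for execution state.
    """
    if run_row is None or run_row.get("status") not in TERMINAL_STATUSES:
        return trial  # nothing to update yet
    updated = dict(trial)
    updated["status"] = run_row["status"]
    updated["error"] = run_row.get("error")
    updated["derived_metrics"] = dict(run_row.get("metrics") or {})
    return updated
```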

Promotion

Workspace promotion previews compare the isolated workspace against the real lab root while ignoring .biosimulant/, generated caches, and package/build artifacts. The preview records changed paths, change kind, base hash, variant hash, and text diffs where possible. Applying a promotion re-checks base hashes first; stale files block the apply.
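
A minimal sketch of the preview and the stale check, assuming SHA-256 content hashes and an illustrative ignore set. The function names, the change-kind labels, and the ignore list are hypothetical, and text diffs are omitted for brevity.

```python
import hashlib
from pathlib import Path
from typing import Any, Optional

# Assumed ignore set; the real list of caches and build artifacts is not given here.
IGNORED_NAMES = {".biosimulant", "__pycache__"}


def file_hash(path: Path) -> Optional[str]:
    """Hash file contents, or return None when the file does not exist."""
    return hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None


def list_files(root: Path) -> set[str]:
    """Collect relative file paths, skipping any path with an ignored component."""
    return {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and not (set(p.relative_to(root).parts) & IGNORED_NAMES)
    }


def build_preview(lab_root: Path, workspace: Path) -> list[dict[str, Any]]:
    """Record every path whose content differs between the lab and the workspace."""
    changes = []
    for rel in sorted(list_files(lab_root) | list_files(workspace)):
        base_hash = file_hash(lab_root / rel)
        variant_hash = file_hash(workspace / rel)
        if base_hash == variant_hash:
            continue
        kind = "added" if base_hash is None else ("deleted" if variant_hash is None else "modified")
        changes.append({"path": rel, "kind": kind, "base_hash": base_hash, "variant_hash": variant_hash})
    return changes


def stale_paths(lab_root: Path, preview: list[dict[str, Any]]) -> list[str]:
    """Paths whose lab content changed since the preview; any entry blocks the apply."""
    return [c["path"] for c in preview if file_hash(lab_root / c["path"]) != c["base_hash"]]
```

Re-hashing at apply time guards against the lab being edited between preview and confirmation.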

Parameter promotion is intentionally narrow in v1. It supports simulation_config.duration, communication_step, settle_steps, run input overlays, and parameters.<node_alias> overlays. Unsupported overlay keys block promotion with an explicit “not promotable yet” reason.
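
The allow-list behavior might look like the sketch below. The key spellings are assumptions: communication_step and settle_steps are used exactly as written above (whether they nest under simulation_config is not stated), and the inputs. prefix for run input overlays is hypothetical.

```python
from typing import Iterable

# Keys as written in the text above; exact spelling in the real overlay format is assumed.
PROMOTABLE_EXACT = {"simulation_config.duration", "communication_step", "settle_steps"}
# "inputs." is a hypothetical prefix for run input overlays; "parameters." is followed by a node alias.
PROMOTABLE_PREFIXES = ("inputs.", "parameters.")


def check_promotable(overlay_keys: Iterable[str]) -> dict[str, str]:
    """Return a reason for every overlay key that blocks promotion."""
    blocked = {}
    for key in overlay_keys:
        allowed = key in PROMOTABLE_EXACT or any(
            key.startswith(prefix) and len(key) > len(prefix)
            for prefix in PROMOTABLE_PREFIXES
        )
        if not allowed:
            blocked[key] = "not promotable yet"
    return blocked
```

An empty result means every key is promotable; any entry blocks the promotion and carries the reason shown to the user.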

Conditions

Experiment workflow conditionals use a small safe expression language. They are for deciding what the agent should do between runs, not for changing BioWorld scheduling inside a run.

Supported expression features:

Dotted JSON paths such as metrics.best_affinity
Comparisons: <, <=, >, >=, ==, !=
Boolean logic: and, or, not
Built-ins: exists(...), count(...), all(...), any(...)

Examples:

metrics.best_affinity < -12.0
completed_trials >= 5 or budget.remaining_credits < 10
exists(metrics.score)
count(trials) >= 3

Workflow conditions are evaluated against experiment status, budget, variants, trials, run summaries, and derived metrics.
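
To illustrate how such a restricted language can be evaluated safely, the sketch below parses an expression with Python's ast module and walks only an allow-list of node types, never calling eval. It is not the actual evaluator: the context shape and the exact semantics of count, all, and any are assumptions.

```python
import ast
import operator
from typing import Any

_MISSING = object()
_COMPARE = {
    ast.Lt: operator.lt, ast.LtE: operator.le, ast.Gt: operator.gt,
    ast.GtE: operator.ge, ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def _lookup(node: ast.AST, ctx: dict[str, Any]) -> Any:
    """Resolve a name or dotted path against nested dicts; _MISSING if absent."""
    if isinstance(node, ast.Name):
        return ctx.get(node.id, _MISSING)
    if isinstance(node, ast.Attribute):
        parent = _lookup(node.value, ctx)
        return parent.get(node.attr, _MISSING) if isinstance(parent, dict) else _MISSING
    raise ValueError("expected a dotted path")


def _eval(node: ast.AST, ctx: dict[str, Any]) -> Any:
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, (ast.Name, ast.Attribute)):
        value = _lookup(node, ctx)
        if value is _MISSING:
            raise KeyError("path not found in context")
        return value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand, ctx)  # negative literals such as -12.0
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return not _eval(node.operand, ctx)
    if isinstance(node, ast.BoolOp):
        values = (_eval(v, ctx) for v in node.values)  # lazy, so it short-circuits
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare):
        left = _eval(node.left, ctx)
        for op, right_node in zip(node.ops, node.comparators):
            fn = _COMPARE.get(type(op))
            if fn is None:
                raise ValueError("unsupported comparison")
            right = _eval(right_node, ctx)
            if not fn(left, right):
                return False
            left = right
        return True
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and len(node.args) == 1 and not node.keywords):
        name, arg = node.func.id, node.args[0]
        if name == "exists":
            return _lookup(arg, ctx) is not _MISSING
        if name == "count":
            return len(_eval(arg, ctx))
        if name == "all":
            return all(_eval(arg, ctx))
        if name == "any":
            return any(_eval(arg, ctx))
    raise ValueError(f"unsupported expression: {type(node).__name__}")


def evaluate(expression: str, context: dict[str, Any]) -> bool:
    """Evaluate a condition against a context dict without executing arbitrary code."""
    return bool(_eval(ast.parse(expression, mode="eval").body, context))
```

Anything outside the allow-list, including arbitrary function calls, arithmetic, and subscripts, raises ValueError instead of being executed.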

Simulation Boundaries

BioWorld remains a logical-time simulator. Inside a run, models may branch internally, emit control signals, and react to committed state across communication boundaries. Between runs, the experiment workflow layer decides whether to run another trial, generate more variants, stop, compare, or ask for approval.

The graph scheduler does not dynamically skip arbitrary modules for agent workflow purposes. Keeping workflow control outside the simulator preserves deterministic co-simulation behavior and keeps remote execution unchanged.

Agent Tools

The desktop agent uses these tools for experiment workflows:

desktop_start_experiment
desktop_get_experiment_status
desktop_propose_experiment_variant
desktop_run_experiment_trials
desktop_compare_experiment_trials
desktop_promote_experiment_variant
desktop_stop_experiment

Promotion is review-gated. The desktop manager builds a preview first, then applies the promotion only after explicit user confirmation and safety checks.
