StagePilot @ai-sdk-tool/parser

Reliable tool calling for non-native models.

StagePilot is the canonical public surface for parser hardening, bounded retry orchestration, and benchmark-backed evidence that reviewers can check for themselves. This Pages site is intentionally static and free to host; the live API and orchestration runtime are deployed separately as a service.

Baseline
29.17%

Raw (unchecked) parse/plan success rate from the checked-in benchmark snapshot.

Middleware
87.50%

Schema-safe parser middleware recovers malformed tool outputs.

Ralph Loop
100.00%

One bounded retry closes the remaining gap in the current benchmark set.
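
The three stages above (raw parse, schema-safe recovery, one bounded retry) can be illustrated with a minimal TypeScript sketch. This is not the @ai-sdk-tool/parser API: every name here (`ToolCall`, `parseStrict`, `parseLenient`, `callWithBoundedRetry`, the `Generate` callback) is hypothetical, and the recovery heuristics are deliberately simple assumptions chosen only to show the shape of the idea.

```typescript
// Hypothetical sketch; none of these names come from @ai-sdk-tool/parser.
type ToolCall = { name: string; arguments: Record<string, unknown> };
type ParseResult = { ok: true; call: ToolCall } | { ok: false; error: string };

// Minimal schema check: a tool call needs a string name and an arguments object.
function validate(value: unknown): ParseResult {
  if (typeof value !== "object" || value === null) {
    return { ok: false, error: "not an object" };
  }
  const v = value as Record<string, unknown>;
  if (typeof v.name !== "string") {
    return { ok: false, error: "missing name" };
  }
  const args = v.arguments;
  if (typeof args !== "object" || args === null || Array.isArray(args)) {
    return { ok: false, error: "missing arguments object" };
  }
  return { ok: true, call: { name: v.name, arguments: args as Record<string, unknown> } };
}

// Stage 1 (baseline): accept only text that is already valid JSON.
function parseStrict(text: string): ParseResult {
  try {
    return validate(JSON.parse(text));
  } catch {
    return { ok: false, error: "not valid JSON" };
  }
}

// Stage 2 (middleware): try strict first, then cut out the outermost {...}
// span and drop trailing commas. The comma regex is naive and would also
// touch commas inside string values; a real parser would tokenize instead.
function parseLenient(text: string): ParseResult {
  const strict = parseStrict(text);
  if (strict.ok) return strict;
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) {
    return { ok: false, error: "no JSON object found" };
  }
  const candidate = text.slice(start, end + 1).replace(/,\s*([}\]])/g, "$1");
  return parseStrict(candidate);
}

// Stand-in for a model call; assumed to return raw text.
type Generate = (prompt: string) => Promise<string>;

// Stage 3 (bounded retry): re-ask at most maxRetries times, feeding the
// parse error back to the model, then return whatever the last attempt gave.
async function callWithBoundedRetry(
  generate: Generate,
  prompt: string,
  maxRetries = 1,
): Promise<ParseResult> {
  let result = parseLenient(await generate(prompt));
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    if (result.ok) break;
    const repairPrompt =
      prompt +
      "\n\nYour previous reply could not be parsed (" + result.error + "). " +
      'Reply with one JSON object: {"name": string, "arguments": object}.';
    result = parseLenient(await generate(repairPrompt));
  }
  return result;
}
```

The retry is bounded on purpose: a failed call costs at most `maxRetries` extra model requests, and a result that is still not `ok` is returned to the caller rather than looped on.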

What this repo proves

  • Parser middleware can make loose tool-call text safe enough for real workflows.
  • Reliability claims are tied to checked-in benchmark artifacts, not anecdotes.
  • Operator review surfaces and developer-ops lanes can be documented separately from the core parser package.
  • A free static site can still explain trust boundaries, benchmark lift, and adoption posture.

Why this Pages version exists

  • Free and static: easy to keep online without paying for backend uptime.
  • Reviewer-first: benchmark, docs, and proof assets stay readable without running the API.
  • Service-ready later: the full runtime still maps naturally to Cloud Run or another API host when needed.

Current deployment posture

  • Frontend: this static Pages microsite
  • Backend: not hosted on Pages by design
  • Recommended live runtime: Cloud Run or equivalent API host
  • Canonical repo: KIM3310/stage-pilot