29.17%
Unchecked parse/plan success from the checked-in benchmark snapshot.
StagePilot is the canonical public surface for parser hardening, bounded retry orchestration, and benchmark-backed evidence for reviewers. This Pages site is intentionally static and free to host; the live API and orchestration runtime run as a separate service.
Schema-safe parser middleware recovers malformed tool outputs.
One bounded retry closes the remaining gap in the current benchmark set.
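The two steps above, schema-safe parsing followed by a single bounded retry, can be sketched roughly as follows. This is a minimal illustration and not StagePilot's actual implementation: the names `parse_tool_call`, `call_with_bounded_retry`, and `REQUIRED_KEYS`, as well as the `tool`/`arguments` schema, are assumptions made for the example.

```python
import json
import re
from typing import Callable, Optional

# Hypothetical schema: a tool call must name a tool and carry an arguments object.
REQUIRED_KEYS = {"tool", "arguments"}


def parse_tool_call(raw: str) -> Optional[dict]:
    """Return a schema-valid tool call, or None if the output cannot be recovered."""
    candidates = [raw]
    # Recovery pass: extract the outermost {...} span from surrounding prose or fences.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if match:
        candidates.append(match.group(0))

    for text in candidates:
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue
        # Schema check: only accept objects with the expected keys and value types.
        if (
            isinstance(data, dict)
            and REQUIRED_KEYS.issubset(data)
            and isinstance(data["tool"], str)
            and isinstance(data["arguments"], dict)
        ):
            return data
    return None


def call_with_bounded_retry(
    generate: Callable[[int], str], max_retries: int = 1
) -> Optional[dict]:
    """Call generate(attempt) at most max_retries + 1 times; stop at the first valid parse."""
    for attempt in range(max_retries + 1):
        parsed = parse_tool_call(generate(attempt))
        if parsed is not None:
            return parsed
    return None
```

Here `generate` stands in for whatever produces the raw model output on each attempt. Capping `max_retries` at 1 keeps the worst-case cost at two generations per call, which is the "bounded" part of the design.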
KIM3310/stage-pilot