From 3ae75773684cfed51c27063fb7894b170474ef4a Mon Sep 17 00:00:00 2001
From: bft-codebot <bft-codebot@users.noreply.github.com>
Date: Fri, 6 Feb 2026 18:56:55 +0000
Subject: [PATCH] sync(bfmono): feat(gambit): add tool-call-aware grader
 schemas and root-deck guards (+19 more) (bfmono@70ec3b942)

This PR is an automated gambitmono sync of bfmono Gambit packages.

- Source: `packages/gambit/`
- Core: `packages/gambit-core/`
- bfmono rev: 70ec3b942

Changes:
- 70ec3b942 feat(gambit): add tool-call-aware grader schemas and root-deck guards
- cae381f00 feat(gambit): align scaffolds with product command and hourglass policies
- 5faa48b35 feat(gambit): move bot policy to folder and enforce policy summarizer flow
- 9a36c4a7e fix(gambit): align env loading with init and block .gambit env writes
- dbe7c54ca feat(gambit-bot): add file actions and scenario deck structure
- 855784d6b docs(gambit): add public permissions guide and API jsdoc
- 8f0ca0a85 feat(gambit): trace effective permission layers at runtime
- 90b4b5071 feat(gambit-core): add phase-1 permission contract primitives
- df9280f6a fix(gambit): restore build-bot deck path compatibility
- daca46555 feat(simulator-ui): wire build, test, and grade to workspace sessions
- e404a17d7 feat(gambit): add workspace-backed serve and bot sandbox flow
- 5f4fa86b9 feat(gambit): scaffold workspace defaults in init
- cf9b23778 feat(gambit-core): add schema guards and model param passthrough
- d0e5a9617 [gambit] move chat message into transcript so it scrolls
- 5c6125d99 feat(simulator-ui): open workbench drawer by default
- 7c9cd05f8 feat(simulator): gate chat accordion by env flag
- a2599068e feat(simulator-ui): add build chat history loading
- 9911dae22 feat(simulator-ui): add workbench chat drawer accordion
- 8cab8ec1f feat(simulator-ui): dock calibrate drawer and sync updates
- d41ba101d Add AAR for phase 3.1.5 deck format build tab

Do not edit this repo directly; make changes in bfmono and re-run the sync.
---
 docs/external/guides/authoring.md             |  5 +
 .../graders/contexts/conversation_tools.ts    | 27 ++++++
 .../contexts/conversation_tools.zod.ts        |  1 +
 .../schemas/graders/contexts/tools.ts         |  5 +
 .../schemas/graders/contexts/tools.zod.ts     |  1 +
 .../schemas/graders/contexts/turn_tools.ts    | 28 ++++++
 .../graders/contexts/turn_tools.zod.ts        |  1 +
 packages/gambit-core/src/markdown.test.ts     | 95 ++++++++++++++++++
 src/decks/gambit-bot/PROMPT.md                | 35 +++----
 .../first_deck_root_prompt_guard/PROMPT.md    |  9 ++
 .../first_deck_root_prompt_guard.deck.ts      | 94 ++++++++++++++++++
 .../PROMPT.md                                 | 10 ++
 ...first_deck_root_prompt_guard_tools.deck.ts | 97 +++++++++++++++++++
 .../PROMPT.md                                 | 10 ++
 ...ot_prompt_guard_tools_conversation.deck.ts | 97 +++++++++++++++++++
 .../gambit-bot/policy/deck-format-1.0.md      | 20 ++--
 .../scenarios/build_tab_demo/PROMPT.md        | 40 --------
 .../scenarios/faq_bot_build_flow/PROMPT.md    | 52 ++++++++++
 .../investor_faq_regression/PROMPT.md         | 54 -----------
 .../scenarios/nux_from_scratch_demo/PROMPT.md | 27 ------
 .../scenarios/recipe_selection/PROMPT.md      | 33 -------
 .../recipe_selection_no_skip/PROMPT.md        | 27 ------
 .../nux_from_scratch_demo_input.zod.ts        |  7 --
 src/decks/tests/build_tab_demo.test.deck.md   | 40 --------
 .../tests/nux_from_scratch_demo.test.deck.md  | 27 ------
 src/decks/tests/recipe_selection.test.deck.md | 33 -------
 .../recipe_selection_no_skip.test.deck.md     | 27 ------
 27 files changed, 559 insertions(+), 343 deletions(-)
 create mode 100644 packages/gambit-core/schemas/graders/contexts/conversation_tools.ts
 create mode 100644 packages/gambit-core/schemas/graders/contexts/conversation_tools.zod.ts
 create mode 100644 packages/gambit-core/schemas/graders/contexts/tools.ts
 create mode 100644 packages/gambit-core/schemas/graders/contexts/tools.zod.ts
 create mode 100644 packages/gambit-core/schemas/graders/contexts/turn_tools.ts
 create mode 100644 packages/gambit-core/schemas/graders/contexts/turn_tools.zod.ts
 create mode 100644 src/decks/gambit-bot/graders/first_deck_root_prompt_guard/PROMPT.md
 create mode 100644 src/decks/gambit-bot/graders/first_deck_root_prompt_guard/first_deck_root_prompt_guard.deck.ts
 create mode 100644 src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/PROMPT.md
 create mode 100644 src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/first_deck_root_prompt_guard_tools.deck.ts
 create mode 100644 src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/PROMPT.md
 create mode 100644 src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/first_deck_root_prompt_guard_tools_conversation.deck.ts
 delete mode 100644 src/decks/gambit-bot/scenarios/build_tab_demo/PROMPT.md
 create mode 100644 src/decks/gambit-bot/scenarios/faq_bot_build_flow/PROMPT.md
 delete mode 100644 src/decks/gambit-bot/scenarios/investor_faq_regression/PROMPT.md
 delete mode 100644 src/decks/gambit-bot/scenarios/nux_from_scratch_demo/PROMPT.md
 delete mode 100644 src/decks/gambit-bot/scenarios/recipe_selection/PROMPT.md
 delete mode 100644 src/decks/gambit-bot/scenarios/recipe_selection_no_skip/PROMPT.md
 delete mode 100644 src/decks/gambit-bot/scenarios/schemas/nux_from_scratch_demo_input.zod.ts
 delete mode 100644 src/decks/tests/build_tab_demo.test.deck.md
 delete mode 100644 src/decks/tests/nux_from_scratch_demo.test.deck.md
 delete mode 100644 src/decks/tests/recipe_selection.test.deck.md
 delete mode 100644 src/decks/tests/recipe_selection_no_skip.test.deck.md
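
Note on the three `first_deck_root_prompt_guard*` graders added below: they share the same scoring rule. The following is a minimal, dependency-free sketch of that rule, not the code shipped in this patch. The hand-written `ToolCall`, `Message`, and `Verdict` types and the names `deckPromptWrites` and `gradeFirstDeckLocation` are assumptions made for illustration; the real decks derive their types from the zod context schemas and run inside `defineDeck`.

```typescript
// Hypothetical stand-ins for the zod-inferred types used by the real decks.
type ToolCall = { function: { name: string; arguments?: string } };
type Message = { role: string; tool_calls?: ToolCall[] };
type Verdict = { score: number; reason: string };

// Use forward slashes and drop a leading "./" so paths compare consistently.
function normalizePath(path: string): string {
  return path.split("\\").join("/").replace(/^\.\//, "");
}

// Collect, in transcript order, the normalized paths of assistant
// `bot_write` tool calls whose target is a PROMPT.md file.
function deckPromptWrites(messages: Message[]): string[] {
  const paths: string[] = [];
  for (const msg of messages) {
    if (msg.role !== "assistant") continue;
    for (const call of msg.tool_calls ?? []) {
      if (call.function.name !== "bot_write") continue;
      if (!call.function.arguments) continue;
      let parsed: unknown;
      try {
        parsed = JSON.parse(call.function.arguments);
      } catch {
        continue; // Malformed tool arguments are skipped, not treated as failures.
      }
      if (typeof parsed !== "object" || parsed === null) continue;
      const path = (parsed as { path?: unknown }).path;
      if (typeof path !== "string") continue;
      const normalized = normalizePath(path);
      if (normalized === "PROMPT.md" || normalized.endsWith("/PROMPT.md")) {
        paths.push(normalized);
      }
    }
  }
  return paths;
}

// Score 3 when the first deck write is root PROMPT.md, -3 when it lands in a
// subfolder, and 0 when no deck write appears in the transcript.
function gradeFirstDeckLocation(messages: Message[]): Verdict {
  const writes = deckPromptWrites(messages);
  if (writes.length === 0) {
    return { score: 0, reason: "No deck creation write found." };
  }
  const first = writes[0];
  if (first === "PROMPT.md") {
    return { score: 3, reason: "First created deck is root PROMPT.md." };
  }
  return { score: -3, reason: `First created deck was written to ${first}.` };
}
```

Only the first matching write is scored, so a later move from `faq-bot/PROMPT.md` to root `PROMPT.md` does not change the result in this sketch.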

diff --git a/docs/external/guides/authoring.md b/docs/external/guides/authoring.md
index 8abb0264..9f1790c1 100644
--- a/docs/external/guides/authoring.md
+++ b/docs/external/guides/authoring.md
@@ -124,6 +124,11 @@ deno run -A packages/gambit/scripts/migrate-schema-terms.ts <repo-root>
   invalid JSON or schema-violating output blocks the run with a clear error.
 - `graderDecks` describe calibration decks that score transcripts/artifacts. The
   simulator Calibrate page will run these decks against stored runs.
+- For graders that inspect assistant tool usage, set
+  `contextSchema = "gambit://schemas/graders/contexts/turn_tools.zod.ts"` so
+  `session.messages[*].tool_calls` is available in the grader input.
+- For conversation-level tool-call grading (single score for the whole run), use
+  `contextSchema = "gambit://schemas/graders/contexts/conversation_tools.zod.ts"`.
 - Configure `acceptsUserTurns` alongside these references:
   - Markdown roots default to `true`; TypeScript decks default to `false`
     everywhere. Set it to `false` for any workflow deck that should never accept
diff --git a/packages/gambit-core/schemas/graders/contexts/conversation_tools.ts b/packages/gambit-core/schemas/graders/contexts/conversation_tools.ts
new file mode 100644
index 00000000..a525e424
--- /dev/null
+++ b/packages/gambit-core/schemas/graders/contexts/conversation_tools.ts
@@ -0,0 +1,27 @@
+import { z } from "zod";
+
+const graderToolCallSchema = z.object({
+  id: z.string().optional(),
+  type: z.string().optional(),
+  function: z.object({
+    name: z.string(),
+    arguments: z.string().optional(),
+  }),
+});
+
+export const graderConversationMessageWithToolsSchema = z.object({
+  role: z.string(),
+  content: z.any().optional(),
+  name: z.string().optional(),
+  tool_calls: z.array(graderToolCallSchema).optional(),
+});
+
+export const graderConversationWithToolsSchema = z.object({
+  messages: z.array(graderConversationMessageWithToolsSchema).optional(),
+  meta: z.record(z.any()).optional(),
+  notes: z.object({ text: z.string().optional() }).optional(),
+});
+
+export default z.object({
+  session: graderConversationWithToolsSchema,
+});
diff --git a/packages/gambit-core/schemas/graders/contexts/conversation_tools.zod.ts b/packages/gambit-core/schemas/graders/contexts/conversation_tools.zod.ts
new file mode 100644
index 00000000..4de338ec
--- /dev/null
+++ b/packages/gambit-core/schemas/graders/contexts/conversation_tools.zod.ts
@@ -0,0 +1 @@
+export { default } from "./conversation_tools.ts";
diff --git a/packages/gambit-core/schemas/graders/contexts/tools.ts b/packages/gambit-core/schemas/graders/contexts/tools.ts
new file mode 100644
index 00000000..2741efcb
--- /dev/null
+++ b/packages/gambit-core/schemas/graders/contexts/tools.ts
@@ -0,0 +1,5 @@
+export { default } from "./turn_tools.ts";
+export {
+  graderConversationWithToolsSchema,
+  graderMessageWithToolsSchema,
+} from "./turn_tools.ts";
diff --git a/packages/gambit-core/schemas/graders/contexts/tools.zod.ts b/packages/gambit-core/schemas/graders/contexts/tools.zod.ts
new file mode 100644
index 00000000..72849e70
--- /dev/null
+++ b/packages/gambit-core/schemas/graders/contexts/tools.zod.ts
@@ -0,0 +1 @@
+export { default } from "./turn_tools.ts";
diff --git a/packages/gambit-core/schemas/graders/contexts/turn_tools.ts b/packages/gambit-core/schemas/graders/contexts/turn_tools.ts
new file mode 100644
index 00000000..50b0e8f3
--- /dev/null
+++ b/packages/gambit-core/schemas/graders/contexts/turn_tools.ts
@@ -0,0 +1,28 @@
+import { z } from "zod";
+
+const graderToolCallSchema = z.object({
+  id: z.string().optional(),
+  type: z.string().optional(),
+  function: z.object({
+    name: z.string(),
+    arguments: z.string().optional(),
+  }),
+});
+
+export const graderMessageWithToolsSchema = z.object({
+  role: z.string(),
+  content: z.any().optional(),
+  name: z.string().optional(),
+  tool_calls: z.array(graderToolCallSchema).optional(),
+});
+
+export const graderConversationWithToolsSchema = z.object({
+  messages: z.array(graderMessageWithToolsSchema).optional(),
+  meta: z.record(z.any()).optional(),
+  notes: z.object({ text: z.string().optional() }).optional(),
+});
+
+export default z.object({
+  session: graderConversationWithToolsSchema,
+  messageToGrade: graderMessageWithToolsSchema,
+});
diff --git a/packages/gambit-core/schemas/graders/contexts/turn_tools.zod.ts b/packages/gambit-core/schemas/graders/contexts/turn_tools.zod.ts
new file mode 100644
index 00000000..72849e70
--- /dev/null
+++ b/packages/gambit-core/schemas/graders/contexts/turn_tools.zod.ts
@@ -0,0 +1 @@
+export { default } from "./turn_tools.ts";
diff --git a/packages/gambit-core/src/markdown.test.ts b/packages/gambit-core/src/markdown.test.ts
index 41242596..d813430b 100644
--- a/packages/gambit-core/src/markdown.test.ts
+++ b/packages/gambit-core/src/markdown.test.ts
@@ -101,6 +101,101 @@ Schema deck.
   assertEquals(parsed, { status: 200 });
 });
 
+Deno.test("markdown deck resolves tool-call-aware grader context schema", async () => {
+  const dir = await Deno.makeTempDir();
+
+  const deckPath = await writeTempDeck(
+    dir,
+    "turn-tools-schema.deck.md",
+    `+++
+label = "turn-tools-schema"
+contextSchema = "gambit://schemas/graders/contexts/turn_tools.zod.ts"
++++
+
+Schema deck.
+`,
+  );
+
+  const deck = await loadMarkdownDeck(deckPath);
+
+  assert(deck.contextSchema, "expected context schema to resolve");
+  const parsed = deck.contextSchema.parse({
+    session: {
+      messages: [
+        {
+          role: "assistant",
+          tool_calls: [
+            {
+              function: {
+                name: "bot_write",
+                arguments: '{"path":"PROMPT.md"}',
+              },
+            },
+          ],
+        },
+      ],
+    },
+    messageToGrade: {
+      role: "assistant",
+      tool_calls: [
+        {
+          function: {
+            name: "bot_write",
+          },
+        },
+      ],
+    },
+  });
+
+  assertEquals(parsed.messageToGrade.role, "assistant");
+  assertEquals(
+    parsed.session.messages?.[0].tool_calls?.[0].function.name,
+    "bot_write",
+  );
+});
+
+Deno.test("markdown deck resolves conversation-level tool-call grader context schema", async () => {
+  const dir = await Deno.makeTempDir();
+
+  const deckPath = await writeTempDeck(
+    dir,
+    "conversation-tools-schema.deck.md",
+    `+++
+label = "conversation-tools-schema"
+contextSchema = "gambit://schemas/graders/contexts/conversation_tools.zod.ts"
++++
+
+Schema deck.
+`,
+  );
+
+  const deck = await loadMarkdownDeck(deckPath);
+
+  assert(deck.contextSchema, "expected context schema to resolve");
+  const parsed = deck.contextSchema.parse({
+    session: {
+      messages: [
+        {
+          role: "assistant",
+          tool_calls: [
+            {
+              function: {
+                name: "bot_write",
+                arguments: '{"path":"faq-bot/PROMPT.md"}',
+              },
+            },
+          ],
+        },
+      ],
+    },
+  });
+
+  assertEquals(
+    parsed.session.messages?.[0].tool_calls?.[0].function.name,
+    "bot_write",
+  );
+});
+
 Deno.test("markdown deck warns on legacy schema URIs", async () => {
   const dir = await Deno.makeTempDir();
   const deckPath = await writeTempDeck(
diff --git a/src/decks/gambit-bot/PROMPT.md b/src/decks/gambit-bot/PROMPT.md
index 21d458bb..33b9179c 100644
--- a/src/decks/gambit-bot/PROMPT.md
+++ b/src/decks/gambit-bot/PROMPT.md
@@ -55,30 +55,25 @@ label = "Deck format policy guard (turn) LLM"
 path = "./graders/deck_format_policy_llm/PROMPT.md"
 description = "LLM guard for policy-compliant deck editing behavior."
 
-[[scenarios]]
-label = "Recipe selection on-ramp tester"
-path = "./scenarios/recipe_selection/PROMPT.md"
-description = "Synthetic user that asks Gambit Bot to build a recipe selection chatbot."
-
-[[scenarios]]
-label = "Recipe selection (no skip)"
-path = "./scenarios/recipe_selection_no_skip/PROMPT.md"
-description = "Synthetic user that completes the question flow without skipping to building."
+[[graders]]
+label = "First deck location guard (turn)"
+path = "./graders/first_deck_root_prompt_guard/PROMPT.md"
+description = "Checks that the first created deck is root PROMPT.md (not a subfolder PROMPT.md)."
 
-[[scenarios]]
-label = "Build tab demo prompt"
-path = "./scenarios/build_tab_demo/PROMPT.md"
-description = "Synthetic user prompt for the build tab demo."
+[[graders]]
+label = "First deck location guard (tools)"
+path = "./graders/first_deck_root_prompt_guard_tools/PROMPT.md"
+description = "Checks first created deck location using tool-call-aware grading context."
 
-[[scenarios]]
-label = "NUX from scratch demo prompt"
-path = "./scenarios/nux_from_scratch_demo/PROMPT.md"
-description = "Synthetic user prompt for the NUX from-scratch build demo."
+[[graders]]
+label = "First deck location guard (tools, conversation)"
+path = "./graders/first_deck_root_prompt_guard_tools_conversation/PROMPT.md"
+description = "Conversation-level check of first created deck location with tool-call-aware context."
 
 [[scenarios]]
-label = "Investor FAQ regression"
-path = "./scenarios/investor_faq_regression/PROMPT.md"
-description = "Replays the investor FAQ build flow that previously produced a non-v1.0 deck format."
+label = "FAQ bot build flow"
+path = "./scenarios/faq_bot_build_flow/PROMPT.md"
+description = "Synthetic user flow that builds an FAQ bot, checks policy alignment, and requests a root-level deck move."
 +++
 
 You are GambitBot, an AI assistant designed to help people build other AI
diff --git a/src/decks/gambit-bot/graders/first_deck_root_prompt_guard/PROMPT.md b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard/PROMPT.md
new file mode 100644
index 00000000..3375a542
--- /dev/null
+++ b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard/PROMPT.md
@@ -0,0 +1,9 @@
++++
+label = "First deck location guard (turn)"
+description = "Deterministic guard that checks whether the first created deck is root PROMPT.md."
+contextSchema = "gambit://schemas/graders/contexts/turn_tools.zod.ts"
+responseSchema = "gambit://schemas/graders/grader_output.zod.ts"
+execute = "./first_deck_root_prompt_guard.deck.ts"
++++
+
+Compute grader that enforces first deck location policy.
diff --git a/src/decks/gambit-bot/graders/first_deck_root_prompt_guard/first_deck_root_prompt_guard.deck.ts b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard/first_deck_root_prompt_guard.deck.ts
new file mode 100644
index 00000000..ff6d2ce7
--- /dev/null
+++ b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard/first_deck_root_prompt_guard.deck.ts
@@ -0,0 +1,94 @@
+import { defineDeck } from "jsr:@bolt-foundry/gambit";
+import { z } from "npm:zod";
+import contextSchema, {
+  type graderMessageWithToolsSchema as messageSchema,
+} from "../../../../../../gambit-core/schemas/graders/contexts/turn_tools.ts";
+
+const responseSchema = z.object({
+  score: z.number().int().min(-3).max(3),
+  reason: z.string(),
+  evidence: z.array(z.string()).optional(),
+});
+
+type DeckWrite = {
+  path: string;
+  messageIndex: number;
+};
+
+export default defineDeck({
+  label: "first_deck_root_prompt_guard",
+  contextSchema,
+  responseSchema,
+  run(ctx) {
+    const messages = ctx.input.session.messages ?? [];
+    const deckWrites = collectDeckPromptWrites(messages);
+
+    if (deckWrites.length === 0) {
+      return {
+        score: 0,
+        reason:
+          "No deck creation write found (no bot_write call targeting PROMPT.md).",
+      };
+    }
+
+    const firstWrite = deckWrites[0];
+    if (firstWrite.path === "PROMPT.md") {
+      return {
+        score: 3,
+        reason: "First created deck is root PROMPT.md.",
+        evidence: [`first deck write path: ${firstWrite.path}`],
+      };
+    }
+
+    return {
+      score: -3,
+      reason:
+        "First created deck is not root PROMPT.md; it was created in a subfolder.",
+      evidence: [
+        `first deck write path: ${firstWrite.path}`,
+        `message index: ${firstWrite.messageIndex}`,
+      ],
+    };
+  },
+});
+
+function collectDeckPromptWrites(
+  messages: Array<z.infer<typeof messageSchema>>,
+): Array<DeckWrite> {
+  const writes: Array<DeckWrite> = [];
+
+  for (let i = 0; i < messages.length; i += 1) {
+    const msg = messages[i];
+    if (msg.role !== "assistant" || !msg.tool_calls?.length) continue;
+
+    for (const tool of msg.tool_calls) {
+      if (tool.function.name !== "bot_write") continue;
+      if (!tool.function.arguments) continue;
+
+      try {
+        const parsed = JSON.parse(tool.function.arguments) as {
+          path?: unknown;
+        };
+        if (typeof parsed.path !== "string") continue;
+
+        const normalizedPath = normalizePath(parsed.path);
+        if (isDeckPromptPath(normalizedPath)) {
+          writes.push({ path: normalizedPath, messageIndex: i });
+        }
+      } catch {
+        // Ignore malformed tool args and continue scanning.
+      }
+    }
+  }
+
+  return writes;
+}
+
+function normalizePath(path: string): string {
+  const withForwardSlashes = path.replaceAll("\\", "/");
+  return withForwardSlashes.replace(/^\.\//, "");
+}
+
+function isDeckPromptPath(path: string): boolean {
+  return path === "PROMPT.md" || path.endsWith("/PROMPT.md");
+}
diff --git a/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/PROMPT.md b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/PROMPT.md
new file mode 100644
index 00000000..b0936e2a
--- /dev/null
+++ b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/PROMPT.md
@@ -0,0 +1,10 @@
++++
+label = "First deck location guard (tools)"
+description = "Deterministic guard that checks whether the first created deck is root PROMPT.md, with tool-call-aware context."
+contextSchema = "gambit://schemas/graders/contexts/turn_tools.zod.ts"
+responseSchema = "gambit://schemas/graders/grader_output.zod.ts"
+execute = "./first_deck_root_prompt_guard_tools.deck.ts"
++++
+
+Compute grader that enforces first deck location policy using tool-call-aware
+context.
diff --git a/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/first_deck_root_prompt_guard_tools.deck.ts b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/first_deck_root_prompt_guard_tools.deck.ts
new file mode 100644
index 00000000..38874a98
--- /dev/null
+++ b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools/first_deck_root_prompt_guard_tools.deck.ts
@@ -0,0 +1,97 @@
+import { defineDeck } from "jsr:@bolt-foundry/gambit";
+import { z } from "npm:zod";
+import contextSchema, {
+  type graderMessageWithToolsSchema as messageSchema,
+} from "../../../../../../gambit-core/schemas/graders/contexts/turn_tools.ts";
+
+const responseSchema = z.object({
+  score: z.number().int().min(-3).max(3),
+  reason: z.string(),
+  evidence: z.array(z.string()).optional(),
+});
+
+type GraderInput = z.infer<typeof contextSchema>;
+type SessionMessage = z.infer<typeof messageSchema>;
+
+type DeckWrite = {
+  path: string;
+  messageIndex: number;
+};
+
+export default defineDeck({
+  label: "first_deck_root_prompt_guard_tools",
+  contextSchema,
+  responseSchema,
+  run(ctx) {
+    const messages = ctx.input.session.messages ?? [];
+    const deckWrites = collectDeckPromptWrites(messages);
+
+    if (deckWrites.length === 0) {
+      return {
+        score: 0,
+        reason:
+          "No deck creation write found (no bot_write call targeting PROMPT.md).",
+      };
+    }
+
+    const firstWrite = deckWrites[0];
+    if (firstWrite.path === "PROMPT.md") {
+      return {
+        score: 3,
+        reason: "First created deck is root PROMPT.md.",
+        evidence: [`first deck write path: ${firstWrite.path}`],
+      };
+    }
+
+    return {
+      score: -3,
+      reason:
+        "First created deck is not root PROMPT.md; it was created in a subfolder.",
+      evidence: [
+        `first deck write path: ${firstWrite.path}`,
+        `message index: ${firstWrite.messageIndex}`,
+      ],
+    };
+  },
+});
+
+function collectDeckPromptWrites(
+  messages: Array<SessionMessage>,
+): Array<DeckWrite> {
+  const writes: Array<DeckWrite> = [];
+
+  for (let i = 0; i < messages.length; i += 1) {
+    const msg = messages[i];
+    if (msg.role !== "assistant" || !msg.tool_calls?.length) continue;
+
+    for (const tool of msg.tool_calls) {
+      if (tool.function.name !== "bot_write") continue;
+      if (!tool.function.arguments) continue;
+
+      try {
+        const parsed = JSON.parse(tool.function.arguments) as {
+          path?: unknown;
+        };
+        if (typeof parsed.path !== "string") continue;
+
+        const normalizedPath = normalizePath(parsed.path);
+        if (isDeckPromptPath(normalizedPath)) {
+          writes.push({ path: normalizedPath, messageIndex: i });
+        }
+      } catch {
+        // Ignore malformed tool args and continue scanning.
+      }
+    }
+  }
+
+  return writes;
+}
+
+function normalizePath(path: string): string {
+  const withForwardSlashes = path.replaceAll("\\", "/");
+  return withForwardSlashes.replace(/^\.\//, "");
+}
+
+function isDeckPromptPath(path: string): boolean {
+  return path === "PROMPT.md" || path.endsWith("/PROMPT.md");
+}
diff --git a/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/PROMPT.md b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/PROMPT.md
new file mode 100644
index 00000000..5a4afb35
--- /dev/null
+++ b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/PROMPT.md
@@ -0,0 +1,10 @@
++++
+label = "First deck location guard (tools, conversation)"
+description = "Conversation-level guard that checks whether the first created deck is root PROMPT.md."
+contextSchema = "gambit://schemas/graders/contexts/conversation_tools.zod.ts"
+responseSchema = "gambit://schemas/graders/grader_output.zod.ts"
+execute = "./first_deck_root_prompt_guard_tools_conversation.deck.ts"
++++
+
+Compute grader that enforces first deck location policy across the whole
+conversation.
diff --git a/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/first_deck_root_prompt_guard_tools_conversation.deck.ts b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/first_deck_root_prompt_guard_tools_conversation.deck.ts
new file mode 100644
index 00000000..21cb5c60
--- /dev/null
+++ b/src/decks/gambit-bot/graders/first_deck_root_prompt_guard_tools_conversation/first_deck_root_prompt_guard_tools_conversation.deck.ts
@@ -0,0 +1,97 @@
+import { defineDeck } from "jsr:@bolt-foundry/gambit";
+import { z } from "npm:zod";
+import contextSchema, {
+  type graderConversationMessageWithToolsSchema as messageSchema,
+} from "../../../../../../gambit-core/schemas/graders/contexts/conversation_tools.ts";
+
+const responseSchema = z.object({
+  score: z.number().int().min(-3).max(3),
+  reason: z.string(),
+  evidence: z.array(z.string()).optional(),
+});
+
+type GraderInput = z.infer<typeof contextSchema>;
+type SessionMessage = z.infer<typeof messageSchema>;
+
+type DeckWrite = {
+  path: string;
+  messageIndex: number;
+};
+
+export default defineDeck({
+  label: "first_deck_root_prompt_guard_tools_conversation",
+  contextSchema,
+  responseSchema,
+  run(ctx) {
+    const messages = ctx.input.session.messages ?? [];
+    const deckWrites = collectDeckPromptWrites(messages);
+
+    if (deckWrites.length === 0) {
+      return {
+        score: 0,
+        reason:
+          "No deck creation write found (no bot_write call targeting PROMPT.md).",
+      };
+    }
+
+    const firstWrite = deckWrites[0];
+    if (firstWrite.path === "PROMPT.md") {
+      return {
+        score: 3,
+        reason: "First created deck is root PROMPT.md.",
+        evidence: [`first deck write path: ${firstWrite.path}`],
+      };
+    }
+
+    return {
+      score: -3,
+      reason:
+        "First created deck is not root PROMPT.md; it was created in a subfolder.",
+      evidence: [
+        `first deck write path: ${firstWrite.path}`,
+        `message index: ${firstWrite.messageIndex}`,
+      ],
+    };
+  },
+});
+
+function collectDeckPromptWrites(
+  messages: Array<SessionMessage>,
+): Array<DeckWrite> {
+  const writes: Array<DeckWrite> = [];
+
+  for (let i = 0; i < messages.length; i += 1) {
+    const msg = messages[i];
+    if (msg.role !== "assistant" || !msg.tool_calls?.length) continue;
+
+    for (const tool of msg.tool_calls) {
+      if (tool.function.name !== "bot_write") continue;
+      if (!tool.function.arguments) continue;
+
+      try {
+        const parsed = JSON.parse(tool.function.arguments) as {
+          path?: unknown;
+        };
+        if (typeof parsed.path !== "string") continue;
+
+        const normalizedPath = normalizePath(parsed.path);
+        if (isDeckPromptPath(normalizedPath)) {
+          writes.push({ path: normalizedPath, messageIndex: i });
+        }
+      } catch {
+        // Ignore malformed tool args and continue scanning.
+      }
+    }
+  }
+
+  return writes;
+}
+
+function normalizePath(path: string): string {
+  const withForwardSlashes = path.replaceAll("\\", "/");
+  return withForwardSlashes.replace(/^\.\//, "");
+}
+
+function isDeckPromptPath(path: string): boolean {
+  return path === "PROMPT.md" || path.endsWith("/PROMPT.md");
+}
diff --git a/src/decks/gambit-bot/policy/deck-format-1.0.md b/src/decks/gambit-bot/policy/deck-format-1.0.md
index b7a382bc..d0f656b3 100644
--- a/src/decks/gambit-bot/policy/deck-format-1.0.md
+++ b/src/decks/gambit-bot/policy/deck-format-1.0.md
@@ -52,7 +52,9 @@ Schema requirements:
     - For plain chat output, `responseSchema` SHOULD be a string schema (for
       example, `gambit://schemas/scenarios/plain_chat_output.zod.ts`).
   - Grader decks MUST be compatible with the built-in grader schemas:
-    `gambit://schemas/graders/contexts/turn.zod.ts` or
+    `gambit://schemas/graders/contexts/turn.zod.ts`,
+    `gambit://schemas/graders/contexts/turn_tools.zod.ts`,
+    `gambit://schemas/graders/contexts/conversation_tools.zod.ts`, or
     `gambit://schemas/graders/contexts/conversation.zod.ts` (context) and
     `gambit://schemas/graders/grader_output.zod.ts` (response).
     - Compatibility rule (deep): base fields MUST be present and unchanged
@@ -124,13 +126,15 @@ Schemas are referenced by path strings in `PROMPT.md` frontmatter (for example
 
 Built-in schemas (v1.0):
 
-| URI                                                     | Purpose                                                         |
-| ------------------------------------------------------- | --------------------------------------------------------------- |
-| `gambit://schemas/graders/respond.zod.ts`               | Shared respond-envelope schema used by decks and graders.       |
-| `gambit://schemas/graders/grader_output.zod.ts`         | Canonical grader output schema (`score`, `reason`, `evidence`). |
-| `gambit://schemas/graders/contexts/turn.zod.ts`         | Schema for per-turn grader context (single exchange).           |
-| `gambit://schemas/graders/contexts/conversation.zod.ts` | Schema for full-conversation grader context.                    |
-| `gambit://schemas/scenarios/plain_chat_output.zod.ts`   | Canonical string output for plain-chat scenario/test decks.     |
+| URI                                                           | Purpose                                                             |
+| ------------------------------------------------------------- | ------------------------------------------------------------------- |
+| `gambit://schemas/graders/respond.zod.ts`                     | Shared respond-envelope schema used by decks and graders.           |
+| `gambit://schemas/graders/grader_output.zod.ts`               | Canonical grader output schema (`score`, `reason`, `evidence`).     |
+| `gambit://schemas/graders/contexts/turn.zod.ts`               | Schema for per-turn grader context (single exchange).               |
+| `gambit://schemas/graders/contexts/turn_tools.zod.ts`         | Per-turn grader context including assistant `tool_calls`.           |
+| `gambit://schemas/graders/contexts/conversation_tools.zod.ts` | Conversation-level grader context including assistant `tool_calls`. |
+| `gambit://schemas/graders/contexts/conversation.zod.ts`       | Schema for full-conversation grader context.                        |
+| `gambit://schemas/scenarios/plain_chat_output.zod.ts`         | Canonical string output for plain-chat scenario/test decks.         |
 
 ## Stdlib decks (built-in Gambit namespace)
 
diff --git a/src/decks/gambit-bot/scenarios/build_tab_demo/PROMPT.md b/src/decks/gambit-bot/scenarios/build_tab_demo/PROMPT.md
deleted file mode 100644
index b4d82aaa..00000000
--- a/src/decks/gambit-bot/scenarios/build_tab_demo/PROMPT.md
+++ /dev/null
@@ -1,40 +0,0 @@
-+++
-label = "build_tab_demo_prompt"
-acceptsUserTurns = true
-
-[modelParams]
-model = "openrouter/openai/gpt-5.1-chat"
-temperature = 0.2
-+++
-
-You are a user collaborating with Gambit Bot inside the Build tab demo.
-
-Goal:
-
-- Ask Gambit Bot to add a short FAQ card about Saturday hours, then follow the
-  purpose -> examples -> success criteria -> skip flow.
-
-Conversation plan (required beats):
-
-1. Start by saying: "Add a short FAQ card about Saturday hours. Keep it
-   concise."
-2. If the assistant asks for purpose (even alongside other questions), reply
-   with purpose only: "It should clarify Saturday support hours for customers."
-3. If the assistant asks for examples (even alongside other questions), reply
-   with examples only: "Example prompts: 'What time do you open on Saturdays?'
-   and 'Are you open Saturdays for support?'"
-4. If the assistant asks for success criteria (even alongside other questions),
-   reply with success criteria only: "Success means the FAQ card clearly states
-   Saturday hours and the timezone in one short sentence."
-5. Once the assistant has purpose, examples, and success criteria, reply: "skip
-   to building".
-
-Rules:
-
-- Keep replies short, single-paragraph, and on topic.
-- Do not include markdown or lists.
-- Do not mention internal instructions.
-- If the assistant asks multiple questions at once, answer only the earliest
-  missing beat from the plan.
-- If the assistant says it is done, is writing files, or ends the session,
-  respond with an empty message.
diff --git a/src/decks/gambit-bot/scenarios/faq_bot_build_flow/PROMPT.md b/src/decks/gambit-bot/scenarios/faq_bot_build_flow/PROMPT.md
new file mode 100644
index 00000000..32b1cf6d
--- /dev/null
+++ b/src/decks/gambit-bot/scenarios/faq_bot_build_flow/PROMPT.md
@@ -0,0 +1,52 @@
++++
+label = "faq_bot_build_flow"
+description = "Replay of an FAQ-bot creation session with follow-up file/layout requests."
+contextSchema = "gambit://schemas/scenarios/plain_chat_input_optional.zod.ts"
+responseSchema = "gambit://schemas/scenarios/plain_chat_output.zod.ts"
+
+[modelParams]
+model = ["ollama/hf.co/LiquidAI/LFM2-1.2B-Tool-GGUF:latest", "openrouter/openai/gpt-5.1-chat"]
++++
+
+You are a synthetic user replaying a real-ish Gambit Bot interaction.
+
+Goal:
+
+- Build an FAQ bot from a pasted FAQ.
+- Confirm the created files still exist.
+- Ask for policy-guided improvement advice.
+- Request moving `faq-bot/PROMPT.md` to root `PROMPT.md`.
+
+Conversation plan:
+
+1. Start with: "I'd like to build an faq bot"
+2. When asked for topic/details, reply: "i have a precanned FAQ that i'd like to
+   write to disk, and i'd like my deck to load it and use it as the source of
+   information"
+3. When asked to paste the FAQ content, send: "here let me paste it in: Market
+   Validation & Insight How did you validate that this is a real problem worth
+   solving? We built Gambit because our own reliability engineers kept
+   rebuilding brittle prompt chains, then sat with reliability teams inside
+   fintech, healthcare, and AI-native startups to observe the same pain.
+
+   What metric tells you this is actually working? Our leading indicator is
+   eval-ready deck coverage with passing graders.
+
+   Growth & Distribution How do you plan to scale distribution or sales beyond
+   the early adopters? We are building a content-to-product funnel with
+   open-source decks, eval recipes, and an FAQ chatbot."
+4. If asked for the FAQ filename, respond: "i don't care"
+5. If asked whether to create the deck now, respond: "sure"
+6. After creation, ask: "can you see if i just accidentally deleted it"
+7. Then ask: "can you look at policy and see if we should change that so it's
+   more compliant"
+8. Then ask: "can we move the faq-bot folder contents up to the root instead of
+   in a subfolder please"
+9. End by returning an empty response.
+
+Rules:
+
+- Stay concise and plain text.
+- Do not use markdown formatting.
+- If the assistant says the move is complete or indicates the workflow is done,
+  return an empty response.
diff --git a/src/decks/gambit-bot/scenarios/investor_faq_regression/PROMPT.md b/src/decks/gambit-bot/scenarios/investor_faq_regression/PROMPT.md
deleted file mode 100644
index 31dcdbb6..00000000
--- a/src/decks/gambit-bot/scenarios/investor_faq_regression/PROMPT.md
+++ /dev/null
@@ -1,54 +0,0 @@
-+++
-label = "investor_faq_regression"
-acceptsUserTurns = true
-
-[modelParams]
-model = "openai/gpt-4o-mini"
-temperature = 0.2
-+++
-
-You are a user recreating a regression run where GambitBot drifted away from
-Deck Format v1.0 and wrote a custom `.deck.md` format.
-
-Goals:
-
-- Ask for an investor FAQ bot.
-- Provide FAQ source material inline.
-- Confirm answer style as "paraphrase but stay close to text."
-- Choose "A" when asked whether to use only provided FAQ vs adding more docs.
-- Continue naturally until the assistant writes files.
-
-Conversation plan:
-
-1. Start with: "hey i'd like to build a bot that reads our FAQ and answers
-   questions that potential investors might have"
-2. If asked whether you can provide the FAQ, answer: "yeah. I can paste it in if
-   you like?"
-3. Paste this FAQ sample when prompted for source material: "Market Validation &
-   Insight How did you validate that this is a real problem worth solving? We
-   built Gambit because our reliability engineers kept rebuilding brittle prompt
-   chains and observed the same pain across fintech, healthcare, and AI-native
-   startups.
-
-   What metric tells you this is actually working? Our leading indicator is
-   eval-ready deck coverage: the share of workflows described as Gambit decks
-   with passing graders.
-
-   What are the next key milestones you’ll hit with this raise? Ship the
-   investor-facing Gambit chatbot + FAQ demo, close three paid design partners,
-   publish the managed grader catalog, and hit self-serve onboarding.
-
-   Why is this the right time in the market for your product? Enterprise buyers
-   now ask for eval evidence before signing, and regulated deployments require
-   an auditable reliability harness."
-4. If asked how answers should be phrased, answer: "it should paraphrase but
-   stay close to the text, given the context"
-5. If asked to choose between only FAQ vs more documents, answer: "A"
-6. End after the assistant indicates it created/wrote deck files.
-
-Rules:
-
-- Keep replies concise and plain text.
-- Do not volunteer extra requirements unless asked.
-- If the assistant says it's done or asks what to do next after writing files,
-  reply with an empty message.
diff --git a/src/decks/gambit-bot/scenarios/nux_from_scratch_demo/PROMPT.md b/src/decks/gambit-bot/scenarios/nux_from_scratch_demo/PROMPT.md
deleted file mode 100644
index cc35ab69..00000000
--- a/src/decks/gambit-bot/scenarios/nux_from_scratch_demo/PROMPT.md
+++ /dev/null
@@ -1,27 +0,0 @@
-+++
-label = "nux_from_scratch_demo_prompt"
-acceptsUserTurns = true
-contextSchema = "../schemas/nux_from_scratch_demo_input.zod.ts"
-
-[modelParams]
-model = "openrouter/openai/gpt-5.1-chat"
-temperature = 0.2
-+++
-
-You are a junior developer trying Gambit for the first time. Be friendly and
-curious. Keep replies short (1-2 sentences). Ask brief questions when needed.
-
-Your goal: build a chatbot that helps startup founders. It should sound like
-Paul Graham without quoting him. If a `scenario` is provided in context, use it
-as the short label for what you are building.
-
-Conversational arc:
-
-1. Describe your goal in one sentence.
-2. Answer 1-2 short questions about scope or tone.
-3. Confirm the scope and ask if it's ready to test.
-4. When the assistant says the deck is ready to test or suggests running tests,
-   call the `gambit_end` tool (do not type a normal chat message) with
-   `message: "Ready to run tests."`.
-
-![end](gambit://snippets/end.md)
diff --git a/src/decks/gambit-bot/scenarios/recipe_selection/PROMPT.md b/src/decks/gambit-bot/scenarios/recipe_selection/PROMPT.md
deleted file mode 100644
index b75f59e8..00000000
--- a/src/decks/gambit-bot/scenarios/recipe_selection/PROMPT.md
+++ /dev/null
@@ -1,33 +0,0 @@
-+++
-label = "recipe_selection_test_bot"
-acceptsUserTurns = true
-[modelParams]
-model = "openai/gpt-4o-mini"
-temperature = 0.2
-+++
-
-You are a user trying to set up a recipe selection chatbot.
-
-Goals:
-
-- Ensure the bot asks a short set of kickoff questions (purpose, example
-  prompts, success criteria).
-- If asked about integrations or data sources, prefer a local MVP first.
-- Ask to "skip to building" once the basics are covered.
-
-Conversation plan:
-
-1. Start by saying you want a chatbot that helps people pick recipes.
-2. If the bot asks for examples, provide two sample prompts:
-   - "I have chicken, spinach, and rice. What can I make in 30 minutes?"
-   - "Suggest a vegetarian dinner under $15 with leftovers."
-3. If the bot asks for success criteria, say:
-   - "It should ask one clarifying question and then recommend 3 recipes with
-     short reasons."
-4. If the bot asks about integrations (e.g., recipe APIs), say:
-   - "Let's start with a local MVP using a small hardcoded list."
-5. After the bot summarizes or proposes a plan, reply: "skip to building".
-6. End the conversation after it writes the deck files.
-
-If the assistant says goodbye or indicates the session is ending, respond with
-an empty message to end the test run.
diff --git a/src/decks/gambit-bot/scenarios/recipe_selection_no_skip/PROMPT.md b/src/decks/gambit-bot/scenarios/recipe_selection_no_skip/PROMPT.md
deleted file mode 100644
index 4daea716..00000000
--- a/src/decks/gambit-bot/scenarios/recipe_selection_no_skip/PROMPT.md
+++ /dev/null
@@ -1,27 +0,0 @@
-+++
-label = "recipe_selection_no_skip_test_bot"
-acceptsUserTurns = true
-[modelParams]
-model = "openai/gpt-4o-mini"
-temperature = 0.2
-+++
-
-You are a user trying to set up a recipe selection chatbot. Do not say "skip to
-building." Complete the question flow instead.
-
-Conversation plan:
-
-1. Start by saying you want a chatbot that helps people pick recipes.
-2. If the bot asks for examples, provide two sample prompts:
-   - "I have chicken, spinach, and rice. What can I make in 30 minutes?"
-   - "Suggest a vegetarian dinner under $15 with leftovers."
-3. If the bot asks for success criteria, say:
-   - "It should ask one clarifying question and then recommend 3 recipes with
-     short reasons."
-4. If the bot asks about integrations (e.g., recipe APIs), say:
-   - "Let's start with a local MVP using a small hardcoded list."
-5. If the bot asks whether to proceed or summarize, confirm and proceed.
-6. End the conversation after it writes the deck files.
-
-If the assistant says goodbye or indicates the session is ending, respond with
-an empty message to end the test run.
diff --git a/src/decks/gambit-bot/scenarios/schemas/nux_from_scratch_demo_input.zod.ts b/src/decks/gambit-bot/scenarios/schemas/nux_from_scratch_demo_input.zod.ts
deleted file mode 100644
index 28a1b578..00000000
--- a/src/decks/gambit-bot/scenarios/schemas/nux_from_scratch_demo_input.zod.ts
+++ /dev/null
@@ -1,7 +0,0 @@
-import { z } from "npm:zod";
-
-export default z.object({
-  scenario: z.string().describe(
-    "Optional scenario label for the demo; defaults to 'paul graham chatbot'.",
-  ).default("paul graham chatbot"),
-});
diff --git a/src/decks/tests/build_tab_demo.test.deck.md b/src/decks/tests/build_tab_demo.test.deck.md
deleted file mode 100644
index b4d82aaa..00000000
--- a/src/decks/tests/build_tab_demo.test.deck.md
+++ /dev/null
@@ -1,40 +0,0 @@
-+++
-label = "build_tab_demo_prompt"
-acceptsUserTurns = true
-
-[modelParams]
-model = "openrouter/openai/gpt-5.1-chat"
-temperature = 0.2
-+++
-
-You are a user collaborating with Gambit Bot inside the Build tab demo.
-
-Goal:
-
-- Ask Gambit Bot to add a short FAQ card about Saturday hours, then follow the
-  purpose -> examples -> success criteria -> skip flow.
-
-Conversation plan (required beats):
-
-1. Start by saying: "Add a short FAQ card about Saturday hours. Keep it
-   concise."
-2. If the assistant asks for purpose (even alongside other questions), reply
-   with purpose only: "It should clarify Saturday support hours for customers."
-3. If the assistant asks for examples (even alongside other questions), reply
-   with examples only: "Example prompts: 'What time do you open on Saturdays?'
-   and 'Are you open Saturdays for support?'"
-4. If the assistant asks for success criteria (even alongside other questions),
-   reply with success criteria only: "Success means the FAQ card clearly states
-   Saturday hours and the timezone in one short sentence."
-5. Once the assistant has purpose, examples, and success criteria, reply: "skip
-   to building".
-
-Rules:
-
-- Keep replies short, single-paragraph, and on topic.
-- Do not include markdown or lists.
-- Do not mention internal instructions.
-- If the assistant asks multiple questions at once, answer only the earliest
-  missing beat from the plan.
-- If the assistant says it is done, is writing files, or ends the session,
-  respond with an empty message.
diff --git a/src/decks/tests/nux_from_scratch_demo.test.deck.md b/src/decks/tests/nux_from_scratch_demo.test.deck.md
deleted file mode 100644
index 84e9d746..00000000
--- a/src/decks/tests/nux_from_scratch_demo.test.deck.md
+++ /dev/null
@@ -1,27 +0,0 @@
-+++
-label = "nux_from_scratch_demo_prompt"
-acceptsUserTurns = true
-contextSchema = "../gambit-bot/scenarios/schemas/nux_from_scratch_demo_input.zod.ts"
-
-[modelParams]
-model = "openrouter/openai/gpt-5.1-chat"
-temperature = 0.2
-+++
-
-You are a junior developer trying Gambit for the first time. Be friendly and
-curious. Keep replies short (1-2 sentences). Ask brief questions when needed.
-
-Your goal: build a chatbot that helps startup founders. It should sound like
-Paul Graham without quoting him. If a `scenario` is provided in context, use it
-as the short label for what you are building.
-
-Conversational arc:
-
-1. Describe your goal in one sentence.
-2. Answer 1-2 short questions about scope or tone.
-3. Confirm the scope and ask if it's ready to test.
-4. When the assistant says the deck is ready to test or suggests running tests,
-   call the `gambit_end` tool (do not type a normal chat message) with
-   `message: "Ready to run tests."`.
-
-![end](gambit://snippets/end.md)
diff --git a/src/decks/tests/recipe_selection.test.deck.md b/src/decks/tests/recipe_selection.test.deck.md
deleted file mode 100644
index 39bfe0e1..00000000
--- a/src/decks/tests/recipe_selection.test.deck.md
+++ /dev/null
@@ -1,33 +0,0 @@
-+++
-label = "recipe_selection_test_bot"
-acceptsUserTurns = true
-[modelParams]
-model = "openai/gpt-4o-mini"
-temperature = 0.2
-+++
-
-You are a user trying to set up a recipe selection chatbot.
-
-Goals:
-
-- Ensure the bot asks a short set of kickoff questions (purpose, example
-  prompts, success criteria).
-- If asked about integrations or data sources, prefer a local MVP first.
-- Ask to “skip to building” once the basics are covered.
-
-Conversation plan:
-
-1. Start by saying you want a chatbot that helps people pick recipes.
-2. If the bot asks for examples, provide two sample prompts:
-   - “I have chicken, spinach, and rice. What can I make in 30 minutes?”
-   - “Suggest a vegetarian dinner under $15 with leftovers.”
-3. If the bot asks for success criteria, say:
-   - “It should ask one clarifying question and then recommend 3 recipes with
-     short reasons.”
-4. If the bot asks about integrations (e.g., recipe APIs), say:
-   - “Let’s start with a local MVP using a small hardcoded list.”
-5. After the bot summarizes or proposes a plan, reply: “skip to building”.
-6. End the conversation after it writes the deck files.
-
-If the assistant says goodbye or indicates the session is ending, respond with
-an empty message to end the test run.
diff --git a/src/decks/tests/recipe_selection_no_skip.test.deck.md b/src/decks/tests/recipe_selection_no_skip.test.deck.md
deleted file mode 100644
index b3d13f98..00000000
--- a/src/decks/tests/recipe_selection_no_skip.test.deck.md
+++ /dev/null
@@ -1,27 +0,0 @@
-+++
-label = "recipe_selection_no_skip_test_bot"
-acceptsUserTurns = true
-[modelParams]
-model = "openai/gpt-4o-mini"
-temperature = 0.2
-+++
-
-You are a user trying to set up a recipe selection chatbot. Do not say “skip to
-building.” Complete the question flow instead.
-
-Conversation plan:
-
-1. Start by saying you want a chatbot that helps people pick recipes.
-2. If the bot asks for examples, provide two sample prompts:
-   - “I have chicken, spinach, and rice. What can I make in 30 minutes?”
-   - “Suggest a vegetarian dinner under $15 with leftovers.”
-3. If the bot asks for success criteria, say:
-   - “It should ask one clarifying question and then recommend 3 recipes with
-     short reasons.”
-4. If the bot asks about integrations (e.g., recipe APIs), say:
-   - “Let’s start with a local MVP using a small hardcoded list.”
-5. If the bot asks whether to proceed or summarize, confirm and proceed.
-6. End the conversation after it writes the deck files.
-
-If the assistant says goodbye or indicates the session is ending, respond with
-an empty message to end the test run.