- Add .claude/ configuration (agents, commands, hooks, get-shit-done workflows) - Add prompts/ directory with development planning documents - Add scripts/setup-tenants/ with tenant configuration - Add docs/screenshots/ - Remove obsolete phase2.2-corrections-report.md - Update pnpm-lock.yaml - Update detect-secrets.sh to ignore setup.sh (env var usage, not secrets) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
28 KiB
Core principle: Claude automates everything with CLI/API. Checkpoints are for verification and decisions, not manual work.
<checkpoint_types>
## checkpoint:human-verify (Most Common - 90%)When: Claude completed automated work, human confirms it works correctly.
Use for:
- Visual UI checks (layout, styling, responsiveness)
- Interactive flows (click through wizard, test user flows)
- Functional verification (feature works as expected)
- Audio/video playback quality
- Animation smoothness
- Accessibility testing
Structure:
<task type="checkpoint:human-verify" gate="blocking">
<what-built>[What Claude automated and deployed/built]</what-built>
<how-to-verify>
[Exact steps to test - URLs, commands, expected behavior]
</how-to-verify>
<resume-signal>[How to continue - "approved", "yes", or describe issues]</resume-signal>
</task>
Key elements:
<what-built>: What Claude automated (deployed, built, configured)<how-to-verify>: Exact steps to confirm it works (numbered, specific)<resume-signal>: Clear indication of how to continue
Example: Vercel Deployment
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to create project and deploy. Capture deployment URL from output.</action>
<verify>vercel ls shows deployment, curl {url} returns 200</verify>
<done>App deployed, URL captured</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Deployed to Vercel at https://myapp-abc123.vercel.app</what-built>
<how-to-verify>
Visit https://myapp-abc123.vercel.app and confirm:
- Homepage loads without errors
- Login form is visible
- No console errors in browser DevTools
</how-to-verify>
<resume-signal>Type "approved" to continue, or describe issues to fix</resume-signal>
</task>
Example: UI Component
<task type="auto">
<name>Build responsive dashboard layout</name>
<files>src/components/Dashboard.tsx, src/app/dashboard/page.tsx</files>
<action>Create dashboard with sidebar, header, and content area. Use Tailwind responsive classes for mobile.</action>
<verify>npm run build succeeds, no TypeScript errors</verify>
<done>Dashboard component builds without errors</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Responsive dashboard layout at /dashboard</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Verify sidebar left, content right, header top
4. Tablet (768px): Verify sidebar collapses to hamburger
5. Mobile (375px): Verify single column, bottom nav
6. Check: No layout shift, no horizontal scroll
</how-to-verify>
<resume-signal>Type "approved" or describe layout issues</resume-signal>
</task>
Example: Xcode Build
<task type="auto">
<name>Build macOS app with Xcode</name>
<files>App.xcodeproj, Sources/</files>
<action>Run `xcodebuild -project App.xcodeproj -scheme App build`. Check for compilation errors in output.</action>
<verify>Build output contains "BUILD SUCCEEDED", no errors</verify>
<done>App builds successfully</done>
</task>
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Built macOS app at DerivedData/Build/Products/Debug/App.app</what-built>
<how-to-verify>
Open App.app and test:
- App launches without crashes
- Menu bar icon appears
- Preferences window opens correctly
- No visual glitches or layout issues
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
## checkpoint:decision (9%)
When: Human must make choice that affects implementation direction.
Use for:
- Technology selection (which auth provider, which database)
- Architecture decisions (monorepo vs separate repos)
- Design choices (color scheme, layout approach)
- Feature prioritization (which variant to build)
- Data model decisions (schema structure)
Structure:
<task type="checkpoint:decision" gate="blocking">
<decision>[What's being decided]</decision>
<context>[Why this decision matters]</context>
<options>
<option id="option-a">
<name>[Option name]</name>
<pros>[Benefits]</pros>
<cons>[Tradeoffs]</cons>
</option>
<option id="option-b">
<name>[Option name]</name>
<pros>[Benefits]</pros>
<cons>[Tradeoffs]</cons>
</option>
</options>
<resume-signal>[How to indicate choice]</resume-signal>
</task>
Key elements:
<decision>: What's being decided<context>: Why this matters<options>: Each option with balanced pros/cons (not prescriptive)<resume-signal>: How to indicate choice
Example: Auth Provider Selection
<task type="checkpoint:decision" gate="blocking">
<decision>Select authentication provider</decision>
<context>
Need user authentication for the app. Three solid options with different tradeoffs.
</context>
<options>
<option id="supabase">
<name>Supabase Auth</name>
<pros>Built-in with Supabase DB we're using, generous free tier, row-level security integration</pros>
<cons>Less customizable UI, tied to Supabase ecosystem</cons>
</option>
<option id="clerk">
<name>Clerk</name>
<pros>Beautiful pre-built UI, best developer experience, excellent docs</pros>
<cons>Paid after 10k MAU, vendor lock-in</cons>
</option>
<option id="nextauth">
<name>NextAuth.js</name>
<pros>Free, self-hosted, maximum control, widely adopted</pros>
<cons>More setup work, you manage security updates, UI is DIY</cons>
</option>
</options>
<resume-signal>Select: supabase, clerk, or nextauth</resume-signal>
</task>
Example: Database Selection
<task type="checkpoint:decision" gate="blocking">
<decision>Select database for user data</decision>
<context>
App needs persistent storage for users, sessions, and user-generated content.
Expected scale: 10k users, 1M records first year.
</context>
<options>
<option id="supabase">
<name>Supabase (Postgres)</name>
<pros>Full SQL, generous free tier, built-in auth, real-time subscriptions</pros>
<cons>Vendor lock-in for real-time features, less flexible than raw Postgres</cons>
</option>
<option id="planetscale">
<name>PlanetScale (MySQL)</name>
<pros>Serverless scaling, branching workflow, excellent DX</pros>
<cons>MySQL not Postgres, no foreign keys in free tier</cons>
</option>
<option id="convex">
<name>Convex</name>
<pros>Real-time by default, TypeScript-native, automatic caching</pros>
<cons>Newer platform, different mental model, less SQL flexibility</cons>
</option>
</options>
<resume-signal>Select: supabase, planetscale, or convex</resume-signal>
</task>
## checkpoint:human-action (1% - Rare)
When: Action has NO CLI/API and requires human-only interaction, OR Claude hit an authentication gate during automation.
Use ONLY for:
- Authentication gates - Claude tried to use CLI/API but needs credentials to continue (this is NOT a failure)
- Email verification links (account creation requires clicking email)
- SMS 2FA codes (phone verification)
- Manual account approvals (platform requires human review before API access)
- Credit card 3D Secure flows (web-based payment authorization)
- OAuth app approvals (some platforms require web-based approval)
Do NOT use for pre-planned manual work:
- Manually deploying to Vercel (use
vercelCLI - auth gate if needed) - Manually creating Stripe webhooks (use Stripe API - auth gate if needed)
- Manually creating databases (use provider CLI - auth gate if needed)
- Running builds/tests manually (use Bash tool)
- Creating files manually (use Write tool)
Structure:
<task type="checkpoint:human-action" gate="blocking">
<action>[What human must do - Claude already did everything automatable]</action>
<instructions>
[What Claude already automated]
[The ONE thing requiring human action]
</instructions>
<verification>[What Claude can check afterward]</verification>
<resume-signal>[How to continue]</resume-signal>
</task>
Key principle: Claude automates EVERYTHING possible first, only asks human for the truly unavoidable manual step.
Example: Email Verification
<task type="auto">
<name>Create SendGrid account via API</name>
<action>Use SendGrid API to create subuser account with provided email. Request verification email.</action>
<verify>API returns 201, account created</verify>
<done>Account created, verification email sent</done>
</task>
<task type="checkpoint:human-action" gate="blocking">
<action>Complete email verification for SendGrid account</action>
<instructions>
I created the account and requested verification email.
Check your inbox for SendGrid verification link and click it.
</instructions>
<verification>SendGrid API key works: curl test succeeds</verification>
<resume-signal>Type "done" when email verified</resume-signal>
</task>
Example: Credit Card 3D Secure
<task type="auto">
<name>Create Stripe payment intent</name>
<action>Use Stripe API to create payment intent for $99. Generate checkout URL.</action>
<verify>Stripe API returns payment intent ID and URL</verify>
<done>Payment intent created</done>
</task>
<task type="checkpoint:human-action" gate="blocking">
<action>Complete 3D Secure authentication</action>
<instructions>
I created the payment intent: https://checkout.stripe.com/pay/cs_test_abc123
Visit that URL and complete the 3D Secure verification flow with your test card.
</instructions>
<verification>Stripe webhook receives payment_intent.succeeded event</verification>
<resume-signal>Type "done" when payment completes</resume-signal>
</task>
Example: Authentication Gate (Dynamic Checkpoint)
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json</files>
<action>Run `vercel --yes` to deploy</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
<!-- If vercel returns "Error: Not authenticated", Claude creates checkpoint on the fly -->
<task type="checkpoint:human-action" gate="blocking">
<action>Authenticate Vercel CLI so I can continue deployment</action>
<instructions>
I tried to deploy but got authentication error.
Run: vercel login
This will open your browser - complete the authentication flow.
</instructions>
<verification>vercel whoami returns your account email</verification>
<resume-signal>Type "done" when authenticated</resume-signal>
</task>
<!-- After authentication, Claude retries the deployment -->
<task type="auto">
<name>Retry Vercel deployment</name>
<action>Run `vercel --yes` (now authenticated)</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
Key distinction: Authentication gates are created dynamically when Claude encounters auth errors during automation. They're NOT pre-planned - Claude tries to automate first, only asks for credentials when blocked. </checkpoint_types>
<execution_protocol>
When Claude encounters type="checkpoint:*":
- Stop immediately - do not proceed to next task
- Display checkpoint clearly using the format below
- Wait for user response - do not hallucinate completion
- Verify if possible - check files, run tests, whatever is specified
- Resume execution - continue to next task only after confirmation
For checkpoint:human-verify:
╔═══════════════════════════════════════════════════════╗
║ CHECKPOINT: Verification Required ║
╚═══════════════════════════════════════════════════════╝
Progress: 5/8 tasks complete
Task: Responsive dashboard layout
Built: Responsive dashboard at /dashboard
How to verify:
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Sidebar visible, content fills remaining space
4. Tablet (768px): Sidebar collapses to icons
5. Mobile (375px): Sidebar hidden, hamburger menu appears
────────────────────────────────────────────────────────
→ YOUR ACTION: Type "approved" or describe issues
────────────────────────────────────────────────────────
For checkpoint:decision:
╔═══════════════════════════════════════════════════════╗
║ CHECKPOINT: Decision Required ║
╚═══════════════════════════════════════════════════════╝
Progress: 2/6 tasks complete
Task: Select authentication provider
Decision: Which auth provider should we use?
Context: Need user authentication. Three options with different tradeoffs.
Options:
1. supabase - Built-in with our DB, free tier
Pros: Row-level security integration, generous free tier
Cons: Less customizable UI, ecosystem lock-in
2. clerk - Best DX, paid after 10k users
Pros: Beautiful pre-built UI, excellent documentation
Cons: Vendor lock-in, pricing at scale
3. nextauth - Self-hosted, maximum control
Pros: Free, no vendor lock-in, widely adopted
Cons: More setup work, DIY security updates
────────────────────────────────────────────────────────
→ YOUR ACTION: Select supabase, clerk, or nextauth
────────────────────────────────────────────────────────
For checkpoint:human-action:
╔═══════════════════════════════════════════════════════╗
║ CHECKPOINT: Action Required ║
╚═══════════════════════════════════════════════════════╝
Progress: 3/8 tasks complete
Task: Deploy to Vercel
Attempted: vercel --yes
Error: Not authenticated. Please run 'vercel login'
What you need to do:
1. Run: vercel login
2. Complete browser authentication when it opens
3. Return here when done
I'll verify: vercel whoami returns your account
────────────────────────────────────────────────────────
→ YOUR ACTION: Type "done" when authenticated
────────────────────────────────────────────────────────
</execution_protocol>
<authentication_gates>
Critical: When Claude tries CLI/API and gets auth error, this is NOT a failure - it's a gate requiring human input to unblock automation.
Pattern: Claude tries automation → auth error → creates checkpoint → you authenticate → Claude retries → continues
Gate protocol:
- Recognize it's not a failure - missing auth is expected
- Stop current task - don't retry repeatedly
- Create checkpoint:human-action dynamically
- Provide exact authentication steps
- Verify authentication works
- Retry the original task
- Continue normally
Example execution flow (Vercel auth gate):
Claude: Running `vercel --yes` to deploy...
Error: Not authenticated. Please run 'vercel login'
╔═══════════════════════════════════════════════════════╗
║ CHECKPOINT: Action Required ║
╚═══════════════════════════════════════════════════════╝
Progress: 2/8 tasks complete
Task: Deploy to Vercel
Attempted: vercel --yes
Error: Not authenticated
What you need to do:
1. Run: vercel login
2. Complete browser authentication
I'll verify: vercel whoami returns your account
────────────────────────────────────────────────────────
→ YOUR ACTION: Type "done" when authenticated
────────────────────────────────────────────────────────
User: done
Claude: Verifying authentication...
Running: vercel whoami
✓ Authenticated as: user@example.com
Retrying deployment...
Running: vercel --yes
✓ Deployed to: https://myapp-abc123.vercel.app
Task 3 complete. Continuing to task 4...
Key distinction:
- Pre-planned checkpoint: "I need you to do X" (wrong - Claude should automate)
- Auth gate: "I tried to automate X but need credentials" (correct - unblocks automation)
</authentication_gates>
<automation_reference>
The rule: If it has CLI/API, Claude does it. Never ask human to perform automatable work.
| Service | CLI/API | Key Commands | Auth Gate |
|---|---|---|---|
| Vercel | vercel |
--yes, env add, --prod, ls |
vercel login |
| Railway | railway |
init, up, variables set |
railway login |
| Fly | fly |
launch, deploy, secrets set |
fly auth login |
| Stripe | stripe + API |
listen, trigger, API calls |
API key in .env |
| Supabase | supabase |
init, link, db push, gen types |
supabase login |
| Upstash | upstash |
redis create, redis get |
upstash auth login |
| PlanetScale | pscale |
database create, branch create |
pscale auth login |
| GitHub | gh |
repo create, pr create, secret set |
gh auth login |
| Node | npm/pnpm |
install, run build, test |
N/A |
| Xcode | xcodebuild |
-project, -scheme, build, test |
N/A |
| Convex | npx convex |
dev, deploy, import |
npx convex login |
Env files: Use Write/Edit tools. Never ask human to create .env manually.
Quick reference:
| Action | Automatable? | Claude does it? |
|---|---|---|
| Deploy to Vercel | Yes (vercel) |
YES |
| Create Stripe webhook | Yes (API) | YES |
| Write .env file | Yes (Write tool) | YES |
| Create Upstash DB | Yes (upstash) |
YES |
| Run tests | Yes (npm test) |
YES |
| Click email verification link | No | NO |
| Enter credit card with 3DS | No | NO |
| Complete OAuth in browser | No | NO |
</automation_reference>
<writing_guidelines>
DO:
- Automate everything with CLI/API before checkpoint
- Be specific: "Visit https://myapp.vercel.app" not "check deployment"
- Number verification steps: easier to follow
- State expected outcomes: "You should see X"
- Provide context: why this checkpoint exists
- Make verification executable: clear, testable steps
DON'T:
- Ask human to do work Claude can automate (deploy, create resources, run builds)
- Assume knowledge: "Configure the usual settings" ❌
- Skip steps: "Set up database" ❌ (too vague)
- Mix multiple verifications in one checkpoint (split them)
- Make verification impossible (Claude can't check visual appearance without user confirmation)
Placement:
- After automation completes - not before Claude does the work
- After UI buildout - before declaring phase complete
- Before dependent work - decisions before implementation
- At integration points - after configuring external services
Bad placement:
- Before Claude automates (asking human to do automatable work) ❌
- Too frequent (every other task is a checkpoint) ❌
- Too late (checkpoint is last task, but earlier tasks needed its result) ❌ </writing_guidelines>
Example 1: Deployment Flow (Correct)
<!-- Claude automates everything -->
<task type="auto">
<name>Deploy to Vercel</name>
<files>.vercel/, vercel.json, package.json</files>
<action>
1. Run `vercel --yes` to create project and deploy
2. Capture deployment URL from output
3. Set environment variables with `vercel env add`
4. Trigger production deployment with `vercel --prod`
</action>
<verify>
- vercel ls shows deployment
- curl {url} returns 200
- Environment variables set correctly
</verify>
<done>App deployed to production, URL captured</done>
</task>
<!-- Human verifies visual/functional correctness -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Deployed to https://myapp.vercel.app</what-built>
<how-to-verify>
Visit https://myapp.vercel.app and confirm:
- Homepage loads correctly
- All images/assets load
- Navigation works
- No console errors
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
Example 2: Database Setup (No Checkpoint Needed)
<!-- Claude automates everything -->
<task type="auto">
<name>Create Upstash Redis database</name>
<files>.env</files>
<action>
1. Run `upstash redis create myapp-cache --region us-east-1`
2. Capture connection URL from output
3. Write to .env: UPSTASH_REDIS_URL={url}
4. Verify connection with test command
</action>
<verify>
- upstash redis list shows database
- .env contains UPSTASH_REDIS_URL
- Test connection succeeds
</verify>
<done>Redis database created and configured</done>
</task>
<!-- NO CHECKPOINT NEEDED - Claude automated everything and verified programmatically -->
Example 3: Stripe Webhooks (Correct)
<!-- Claude automates everything -->
<task type="auto">
<name>Configure Stripe webhooks</name>
<files>.env, src/app/api/webhooks/route.ts</files>
<action>
1. Use Stripe API to create webhook endpoint pointing to /api/webhooks
2. Subscribe to events: payment_intent.succeeded, customer.subscription.updated
3. Save webhook signing secret to .env
4. Implement webhook handler in route.ts
</action>
<verify>
- Stripe API returns webhook endpoint ID
- .env contains STRIPE_WEBHOOK_SECRET
- curl webhook endpoint returns 200
</verify>
<done>Stripe webhooks configured and handler implemented</done>
</task>
<!-- Human verifies in Stripe dashboard -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Stripe webhook configured via API</what-built>
<how-to-verify>
Visit Stripe Dashboard > Developers > Webhooks
Confirm: Endpoint shows https://myapp.com/api/webhooks with correct events
</how-to-verify>
<resume-signal>Type "yes" if correct</resume-signal>
</task>
Example 4: Full Auth Flow Verification (Correct)
<task type="auto">
<name>Create user schema</name>
<files>src/db/schema.ts</files>
<action>Define User, Session, Account tables with Drizzle ORM</action>
<verify>npm run db:generate succeeds</verify>
</task>
<task type="auto">
<name>Create auth API routes</name>
<files>src/app/api/auth/[...nextauth]/route.ts</files>
<action>Set up NextAuth with GitHub provider, JWT strategy</action>
<verify>TypeScript compiles, no errors</verify>
</task>
<task type="auto">
<name>Create login UI</name>
<files>src/app/login/page.tsx, src/components/LoginButton.tsx</files>
<action>Create login page with GitHub OAuth button</action>
<verify>npm run build succeeds</verify>
</task>
<!-- ONE checkpoint at end verifies the complete flow -->
<task type="checkpoint:human-verify" gate="blocking">
<what-built>Complete authentication flow (schema + API + UI)</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/login
3. Click "Sign in with GitHub"
4. Complete GitHub OAuth flow
5. Verify: Redirected to /dashboard, user name displayed
6. Refresh page: Session persists
7. Click logout: Session cleared
</how-to-verify>
<resume-signal>Type "approved" or describe issues</resume-signal>
</task>
<anti_patterns>
❌ BAD: Asking human to automate
<task type="checkpoint:human-action" gate="blocking">
<action>Deploy to Vercel</action>
<instructions>
1. Visit vercel.com/new
2. Import Git repository
3. Click Deploy
4. Copy deployment URL
</instructions>
<verification>Deployment exists</verification>
<resume-signal>Paste URL</resume-signal>
</task>
Why bad: Vercel has a CLI. Claude should run vercel --yes.
✅ GOOD: Claude automates, human verifies
<task type="auto">
<name>Deploy to Vercel</name>
<action>Run `vercel --yes`. Capture URL.</action>
<verify>vercel ls shows deployment, curl returns 200</verify>
</task>
<task type="checkpoint:human-verify">
<what-built>Deployed to {url}</what-built>
<how-to-verify>Visit {url}, check homepage loads</how-to-verify>
<resume-signal>Type "approved"</resume-signal>
</task>
❌ BAD: Too many checkpoints
<task type="auto">Create schema</task>
<task type="checkpoint:human-verify">Check schema</task>
<task type="auto">Create API route</task>
<task type="checkpoint:human-verify">Check API</task>
<task type="auto">Create UI form</task>
<task type="checkpoint:human-verify">Check form</task>
Why bad: Verification fatigue. Combine into one checkpoint at end.
✅ GOOD: Single verification checkpoint
<task type="auto">Create schema</task>
<task type="auto">Create API route</task>
<task type="auto">Create UI form</task>
<task type="checkpoint:human-verify">
<what-built>Complete auth flow (schema + API + UI)</what-built>
<how-to-verify>Test full flow: register, login, access protected page</how-to-verify>
<resume-signal>Type "approved"</resume-signal>
</task>
❌ BAD: Asking for automatable file operations
<task type="checkpoint:human-action">
<action>Create .env file</action>
<instructions>
1. Create .env in project root
2. Add: DATABASE_URL=...
3. Add: STRIPE_KEY=...
</instructions>
</task>
Why bad: Claude has Write tool. This should be type="auto".
❌ BAD: Vague verification steps
<task type="checkpoint:human-verify">
<what-built>Dashboard</what-built>
<how-to-verify>Check it works</how-to-verify>
<resume-signal>Continue</resume-signal>
</task>
Why bad: No specifics. User doesn't know what to test or what "works" means.
✅ GOOD: Specific verification steps
<task type="checkpoint:human-verify">
<what-built>Responsive dashboard at /dashboard</what-built>
<how-to-verify>
1. Run: npm run dev
2. Visit: http://localhost:3000/dashboard
3. Desktop (>1024px): Sidebar visible, content area fills remaining space
4. Tablet (768px): Sidebar collapses to icons
5. Mobile (375px): Sidebar hidden, hamburger menu in header
6. Check: No horizontal scroll at any size
</how-to-verify>
<resume-signal>Type "approved" or describe layout issues</resume-signal>
</task>
</anti_patterns>
Checkpoints formalize human-in-the-loop points. Use them when Claude cannot complete a task autonomously OR when human verification is required for correctness.
The golden rule: If Claude CAN automate it, Claude MUST automate it.
Checkpoint priority:
- checkpoint:human-verify (90% of checkpoints) - Claude automated everything, human confirms visual/functional correctness
- checkpoint:decision (9% of checkpoints) - Human makes architectural/technology choices
- checkpoint:human-action (1% of checkpoints) - Truly unavoidable manual steps with no API/CLI
When NOT to use checkpoints:
- Things Claude can verify programmatically (tests pass, build succeeds)
- File operations (Claude can read files to verify)
- Code correctness (use tests and static analysis)
- Anything automatable via CLI/API