# LemonTwin Phase A (Production Stabilization) Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Stabilize LemonTwin for orchard-farmer production use with buyout+after-sales operations, local GPT-OSS-first inference, and OpenClaw-governed risk-management workflows.

**Architecture:** Keep the current single-host topology on `192.168.1.83`, but remove dev-mode runtime risks. Promote web to production startup mode, keep one Cloudflare tunnel, add API readiness checks tied to database health, and add daily backup + operational guardrails. Governance and QA run through OpenClaw agents with explicit KPI gates.

**Tech Stack:** Next.js, NestJS, Prisma, SQLite (Phase A), Cloudflared, launchd (macOS), OpenClaw (local GPT-OSS 20b), shell scripts.

---

### Task 1: Baseline + Change Safety (Host 192.168.1.83)

**Files:**
- Create: `/Users/joycechen/LemonTwin/ops/baseline/prechange-.md`
- Create: `/Users/joycechen/LemonTwin/ops/scripts/capture-baseline.sh`
- Modify: `/Users/joycechen/LemonTwin/README.md`

- [ ] **Step 1: Write the failing test (baseline checklist must exist before changes)**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
test -f ~/LemonTwin/ops/scripts/capture-baseline.sh && echo PASS || echo FAIL
'
```

Expected: `FAIL`

- [ ] **Step 2: Create baseline script**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
mkdir -p ~/LemonTwin/ops/scripts ~/LemonTwin/ops/baseline
cat > ~/LemonTwin/ops/scripts/capture-baseline.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
TS="$(date +%Y%m%d-%H%M%S)"
OUT="$HOME/LemonTwin/ops/baseline/prechange-$TS.md"
{
  echo "# LemonTwin Baseline $TS"
  echo "## Web/API process"
  ps aux | grep -E "next|tsx|cloudflared" | grep -v grep || true
  echo "## Ports"
  lsof -nP -iTCP -sTCP:LISTEN | grep -E ":3010|:3001" || true
  echo "## Site probes"
  # Probes are best-effort: a down site must not abort the capture.
  curl -sS -I https://orchard.graceai.net | head -n 5 || true
  curl -sS http://127.0.0.1:3001/health || true
  echo
} > "$OUT"
echo "$OUT"
EOF
chmod +x ~/LemonTwin/ops/scripts/capture-baseline.sh
'
```

- [ ] **Step 3: Run test to verify script works**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
~/LemonTwin/ops/scripts/capture-baseline.sh
'
```

Expected: prints the path of the created markdown file under `ops/baseline/`.

- [ ] **Step 4: Document runbook entry**

```markdown
## Ops Baseline Capture

Before any deployment or service change:

1. Run `~/LemonTwin/ops/scripts/capture-baseline.sh`
2. Confirm the output file exists in `ops/baseline/`
3. Attach the file path to the change ticket.
```

- [ ] **Step 5: Commit**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin
git init >/dev/null 2>&1 || true
git add ops/scripts/capture-baseline.sh README.md ops/baseline/*.md
git commit -m "ops: add pre-change baseline capture" || true
'
```

---

### Task 2: Web Production Startup (replace `next dev`)

**Files:**
- Modify: `/Users/joycechen/LemonTwin/web/package.json`
- Create: `/Users/joycechen/LemonTwin/ops/launchd/net.graceai.lemontwin-web.plist`
- Test: `/Users/joycechen/LemonTwin/ops/scripts/verify-web-prod.sh`

- [ ] **Step 1: Write the failing test (web currently in dev mode)**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
curl -sS https://orchard.graceai.net | grep -q "__next_dev_client__" && echo FAIL || echo PASS
'
```

Expected before fix: `FAIL`

- [ ] **Step 2: Write minimal implementation**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin/web
node -e "
const fs=require(\"fs\");
const p=\"package.json\";
const j=JSON.parse(fs.readFileSync(p,\"utf8\"));
j.scripts=j.scripts||{};
j.scripts[\"start:prod\"]=\"next start -p 3010\";
fs.writeFileSync(p, JSON.stringify(j,null,2)+\"\\n\");
"
'
```

- [ ] **Step 3: Add launchd service**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
mkdir -p ~/LemonTwin/ops/launchd
cat > ~/LemonTwin/ops/launchd/net.graceai.lemontwin-web.plist << "EOF"
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>net.graceai.lemontwin-web</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/zsh</string>
    <string>-lc</string>
    <string>cd ~/LemonTwin/web &amp;&amp; npm ci &amp;&amp; npm run build &amp;&amp; npm run start:prod</string>
  </array>
  <key>RunAtLoad</key><true/>
  <key>KeepAlive</key><true/>
  <key>StandardOutPath</key><string>/Users/joycechen/LemonTwin/web/logs/web.out.log</string>
  <key>StandardErrorPath</key><string>/Users/joycechen/LemonTwin/web/logs/web.err.log</string>
</dict>
</plist>
EOF
mkdir -p ~/LemonTwin/web/logs
'
```

- [ ] **Step 4: Run test to verify it passes**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
launchctl unload ~/Library/LaunchAgents/net.graceai.lemontwin-web.plist 2>/dev/null || true
cp ~/LemonTwin/ops/launchd/net.graceai.lemontwin-web.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/net.graceai.lemontwin-web.plist
sleep 15
curl -sS https://orchard.graceai.net | grep -q "__next_dev_client__" && echo FAIL || echo PASS
'
```

Expected: `PASS`

- [ ] **Step 5: Commit**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin
git add web/package.json ops/launchd/net.graceai.lemontwin-web.plist
git commit -m "infra: run web in production mode via launchd"
'
```

---

### Task 3: Single Cloudflare Tunnel + Process Cleanup

**Files:**
- Modify: `/Users/joycechen/LemonTwin/web/cloudflared-config.yml`
- Create: `/Users/joycechen/LemonTwin/ops/scripts/reconcile-cloudflared.sh`
- Test: `/Users/joycechen/LemonTwin/ops/scripts/verify-tunnel-single.sh`

- [ ] **Step 1: Write the failing test (multiple tunnel processes)**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
N=$(pgrep -fl cloudflared | wc -l | tr -d " ")
[ "$N" -eq 1 ] && echo PASS || echo FAIL:$N
'
```

Expected before fix: `FAIL:` followed by the process count.

- [ ] **Step 2: Write reconciliation script**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cat > ~/LemonTwin/ops/scripts/reconcile-cloudflared.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
# Stop every known tunnel variant, then start exactly one from the repo config.
pkill -f "cloudflared tunnel --url" || true
pkill -f "/etc/cloudflared/config.yml" || true
pkill -f "cloudflared.*web/cloudflared-config.yml" || true
nohup cloudflared tunnel --config "$HOME/LemonTwin/web/cloudflared-config.yml" run >/tmp/lemontwin-cloudflared.log 2>&1 &
sleep 2
pgrep -fl cloudflared
EOF
chmod +x ~/LemonTwin/ops/scripts/reconcile-cloudflared.sh
'
```

- [ ] **Step 3: Run test to verify it passes**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
~/LemonTwin/ops/scripts/reconcile-cloudflared.sh >/dev/null
N=$(pgrep -fl cloudflared | wc -l | tr -d " ")
[ "$N" -eq 1 ] && echo PASS || echo FAIL:$N
'
```

Expected: `PASS`

- [ ] **Step 4: Validate external reachability**

```bash
curl -sS -I https://orchard.graceai.net | head -n 1
```

Expected: `HTTP/2 200` (or redirect to app route)

- [ ] **Step 5: Commit**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin
git add web/cloudflared-config.yml ops/scripts/reconcile-cloudflared.sh
git commit -m "infra: enforce single cloudflared tunnel process"
'
```

---

### Task 4: API Health + Readiness with DB Probe

**Files:**
- Modify: `/Users/joycechen/LemonTwin/api/src/app.controller.ts`
- Modify: `/Users/joycechen/LemonTwin/api/src/app.module.ts`
- Test: `/Users/joycechen/LemonTwin/api/test/health.e2e-spec.ts`

- [ ] **Step 1: Write the failing test**

```ts
import request from 'supertest';
import { Test } from '@nestjs/testing';
import { AppModule } from '../src/app.module';
import { INestApplication } from '@nestjs/common';

describe('health endpoints', () => {
  let app: INestApplication;

  beforeAll(async () => {
    const m = await Test.createTestingModule({ imports: [AppModule] }).compile();
    app = m.createNestApplication();
    await app.init();
  });

  afterAll(async () => app.close());

  it('/readyz returns db status', async () => {
    const res = await
      request(app.getHttpServer()).get('/readyz');
    expect(res.status).toBe(200);
    expect(res.body).toHaveProperty('ok', true);
    expect(res.body).toHaveProperty('db', 'up');
  });
});
```

- [ ] **Step 2: Run test to verify it fails**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin/api && npm test -- health.e2e-spec.ts
'
```

Expected: FAIL because `/readyz` is missing.

- [ ] **Step 3: Write minimal implementation**

```ts
// api/src/app.controller.ts
import { Controller, Get } from '@nestjs/common';
import { PrismaService } from './prisma.service';

@Controller()
export class AppController {
  constructor(private readonly prisma: PrismaService) {}

  // Liveness: the process is up; does not touch the database.
  @Get('health')
  health() {
    return { ok: true, service: 'lemontwin-api' };
  }

  // Readiness: succeeds only if a trivial query reaches the database.
  @Get('readyz')
  async readyz() {
    await this.prisma.$queryRaw`SELECT 1`;
    return { ok: true, db: 'up' };
  }
}
```

`AppController` injects `PrismaService`, so `api/src/app.module.ts` must list it in `providers` (or import a module that exports it) before the test can pass.

- [ ] **Step 4: Run tests to verify pass**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin/api && npm test -- health.e2e-spec.ts
'
```

Expected: PASS

- [ ] **Step 5: Commit**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin
git add api/src/app.controller.ts api/src/app.module.ts api/test/health.e2e-spec.ts
git commit -m "api: add /health and /readyz with db probe"
'
```

---

### Task 5: Daily SQLite Backup + Restore Drill

**Files:**
- Create: `/Users/joycechen/LemonTwin/ops/scripts/backup-sqlite.sh`
- Create: `/Users/joycechen/LemonTwin/ops/scripts/restore-drill-sqlite.sh`
- Create: `/Users/joycechen/LemonTwin/ops/launchd/net.graceai.lemontwin-backup.plist`

- [ ] **Step 1: Write the failing test**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
test -f ~/LemonTwin/ops/scripts/backup-sqlite.sh && echo PASS || echo FAIL
'
```

Expected: `FAIL`

- [ ] **Step 2: Write backup and restore scripts**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
mkdir -p ~/LemonTwin/ops/scripts ~/LemonTwin/backups/sqlite
cat > ~/LemonTwin/ops/scripts/backup-sqlite.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
SRC="$HOME/LemonTwin/api/dev.db"
TS="$(date +%Y%m%d-%H%M%S)"
DST="$HOME/LemonTwin/backups/sqlite/dev-$TS.db"
# Plain file copy; it does not coordinate with in-flight writes.
cp "$SRC" "$DST"
echo "$DST"
EOF
cat > ~/LemonTwin/ops/scripts/restore-drill-sqlite.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
# Open the newest backup and read one row of schema metadata.
LATEST="$(ls -1t "$HOME"/LemonTwin/backups/sqlite/dev-*.db | head -n 1)"
sqlite3 "$LATEST" "SELECT name FROM sqlite_master LIMIT 1;" >/tmp/lemontwin-restore-drill.txt
cat /tmp/lemontwin-restore-drill.txt
EOF
chmod +x ~/LemonTwin/ops/scripts/backup-sqlite.sh ~/LemonTwin/ops/scripts/restore-drill-sqlite.sh
'
```

- [ ] **Step 3: Add launchd schedule**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cat > ~/LemonTwin/ops/launchd/net.graceai.lemontwin-backup.plist << "EOF"
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>net.graceai.lemontwin-backup</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/zsh</string>
    <string>-lc</string>
    <string>~/LemonTwin/ops/scripts/backup-sqlite.sh</string>
  </array>
  <key>StartCalendarInterval</key>
  <dict>
    <key>Hour</key><integer>3</integer>
    <key>Minute</key><integer>0</integer>
  </dict>
  <key>StandardOutPath</key><string>/Users/joycechen/LemonTwin/backups/sqlite/backup.out.log</string>
  <key>StandardErrorPath</key><string>/Users/joycechen/LemonTwin/backups/sqlite/backup.err.log</string>
</dict>
</plist>
EOF
'
```

- [ ] **Step 4: Run backup + restore drill**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
~/LemonTwin/ops/scripts/backup-sqlite.sh
~/LemonTwin/ops/scripts/restore-drill-sqlite.sh
'
```

Expected: the backup file path, followed by one table name.
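
The drill above only proves that the newest file opens in `sqlite3`. A slightly stricter check, sketched below, also confirms that the newest backup is non-empty and recent. The function names `latest_backup` and `backup_is_fresh` and the age threshold are illustrative assumptions, not part of the plan's scripts; the sketch relies only on the `dev-YYYYMMDD-HHMMSS.db` naming produced by `backup-sqlite.sh`.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the newest backup file in a directory.
# Timestamped names sort lexicographically in chronological order, and the
# shell expands globs in sorted order, so the last match is the newest.
latest_backup() {
  local dir="$1" f latest=""
  for f in "$dir"/dev-*.db; do
    if [ -e "$f" ]; then latest="$f"; fi   # skip the unexpanded pattern
  done
  if [ -z "$latest" ]; then return 1; fi
  printf '%s\n' "$latest"
}

# Hypothetical helper: succeed if the file is non-empty and was modified
# less than max_age_min minutes ago.
backup_is_fresh() {
  local file="$1" max_age_min="$2"
  [ -s "$file" ] || return 1
  [ -n "$(find "$file" -mmin "-$max_age_min" -print)" ]
}
```

For example, `backup_is_fresh "$(latest_backup "$HOME/LemonTwin/backups/sqlite")" 1560` would fail once the 03:00 job has been missed for more than 26 hours, which is an assumed tolerance rather than a value from the plan.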

- [ ] **Step 5: Commit**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin
git add ops/scripts/backup-sqlite.sh ops/scripts/restore-drill-sqlite.sh ops/launchd/net.graceai.lemontwin-backup.plist
git commit -m "ops: add daily sqlite backup and restore drill"
'
```

---

### Task 6: OpenClaw Governance + KPI Gate

**Files:**
- Create: `/Users/joycechen/LemonTwin/docs/openclaw-governance.md`
- Create: `/Users/joycechen/LemonTwin/docs/kpi-gates.md`
- Create: `/Users/joycechen/LemonTwin/ops/scripts/openclaw-weekly-review.sh`

- [ ] **Step 1: Write the failing test**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
test -f ~/LemonTwin/docs/openclaw-governance.md && test -f ~/LemonTwin/docs/kpi-gates.md && echo PASS || echo FAIL
'
```

Expected: `FAIL`

- [ ] **Step 2: Write governance document**

```markdown
# OpenClaw Governance (LemonTwin)

- Primary model: `ollama/gpt-oss:20b` (local-first)
- Agents: PM (roadmap), RD (schema/routing), QA (acceptance), CEO (release gate)
- Rule: no production change without baseline capture + KPI gate pass
- Domains: orchard automation, cultivation evolution, disaster/pest risk
```

- [ ] **Step 3: Write KPI gate document**

```markdown
# KPI Gates

- Uptime >= 99.0% (7-day rolling)
- Alert precision >= 0.75 for pest-risk alerts
- Median alert-to-action <= 30 minutes
- Daily backup success rate = 100%
- CSAT for after-sales >= 4.2/5

Release is blocked if any KPI misses its target for 3 consecutive days.
```

- [ ] **Step 4: Create weekly review script**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cat > ~/LemonTwin/ops/scripts/openclaw-weekly-review.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
echo "[weekly-review] $(date)"
echo "Review docs: docs/openclaw-governance.md docs/kpi-gates.md"
echo "Run health probes and backup drill before release gate."
EOF
chmod +x ~/LemonTwin/ops/scripts/openclaw-weekly-review.sh
'
```

- [ ] **Step 5: Commit**

```bash
sshpass -p 1234 ssh -o StrictHostKeyChecking=no joycechen@192.168.1.83 '
cd ~/LemonTwin
git add docs/openclaw-governance.md docs/kpi-gates.md ops/scripts/openclaw-weekly-review.sh
git commit -m "docs: add openclaw governance and KPI release gates"
'
```

---

## Spec Coverage Check

- Farmer-first target + buyout/after-sales model: covered in Task 6 governance and KPI definitions.
- Local GPT-OSS-first architecture: covered in Task 6 governance; enforcement starts in Phase B routing implementation.
- Shared OpenClaw + skills governance: covered in Task 6.
- Immediate Phase A stabilization: covered in Tasks 1–5 (runtime, health, backup, tunnel/process cleanup).
- Disaster/pest risk management readiness: KPI and governance introduced in Task 6; data-model implementation planned for Phase B.

## Placeholder Scan

- No `TODO`, `TBD`, or “implement later” markers in tasks.
- Each code-change step includes explicit command or code content.

## Type Consistency Check

- `/health` and `/readyz` endpoint names are consistent across test and implementation steps.
- Backup script names and launchd label references match exactly across tasks.
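
The placeholder scan above could be automated with a small script like the one below. It is a sketch under assumptions: the function name `scan_placeholders` is hypothetical, and the scan stops at the `## Spec Coverage Check` heading so that the sentence naming the markers does not flag itself.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper: succeed only if the task sections of a plan file
# contain none of the placeholder markers named in the Placeholder Scan.
scan_placeholders() {
  local file="$1"
  # Drop everything from the "Spec Coverage Check" heading to the end of the
  # file, then search what remains. Line numbers still match the original
  # file because only the tail is removed.
  if sed '/^## Spec Coverage Check/,$d' "$file" | grep -nE 'TODO|TBD|implement later'; then
    echo "placeholder markers found in $file" >&2
    return 1
  fi
  echo "no placeholder markers in $file"
}
```

Calling `scan_placeholders` with the path of this plan prints any offending lines with their line numbers and returns a non-zero status, so it could serve as a pre-commit check.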