2026-02-25 Session Log

lingua-rag planning + F1 implementation (competitive-agents), planning-interview v2.0 integration, business-avengers v2.1 audit, truth-checker idea, plugin-tester design...

Tags: lingua-rag, business-avengers, planning-interview, plugin-tester, competitive-agents, lingua-rag-planning-docs, truth-checker

Topics Worked On Today

lingua-rag (01:34)

Completed planning for LinguaRAG, the AI Product Engineer killer project — Lean Canvas → Product Brief (Startup mode) → reflected Dokdok A1 structure

Done

  • docs: created projects/lingua-rag/lean-canvas-lingua-rag-20260225.md (planning-interview Solo mode)
    • Key insight: user value = pattern-based repetitive listening/speaking practice (not simple Q&A)
    • North Star Metric: number of practice sentence set repetition completions
  • docs: created projects/lingua-rag/product-brief-lingua-rag-v01-20260225.md (Startup mode)
    • Scope: v0.1 (Week 1-4) — German Q&A without PDF + unit dashboard + level mapping
    • Feature list: F1~F8 (Must/Should/Could), API design, DB schema, 3 beta tester survey hypotheses
  • refactor(product-brief): full replacement from Netzwerk A1 → Dokdok A1 structure — 56 units / 8 Bands, reflecting Jay’s actual progress (A1-1~A1-20 completed)
  • refactor: updated v0.2 section to focus on Dokdok PDF integration (copyright handling: user upload → delete after processing)

Key Decisions

  • Adopted Dokdok unit structure as v0.1 baseline: progressive strategy — v0.1 hardcoded → v0.2 replaced with actual PDF
  • Introduced unit type classification: grammar/vocabulary/conversation/practice — optimize system prompts by type (see the sketch after this list)
  • Expanded v0.1 scope: Streaming Q&A only → includes unit navigation + level mapping. Week 1-2 Q&A first, Week 3-4 dashboard
  • Copyright strategy: user uploads their own purchased PDF, no permanent server storage
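
A sketch of what the unit-type decision above could look like in the v0.1 backend; the prompt fragments and function name are hypothetical, not taken from the actual codebase:

```python
# Hypothetical prompt fragments; the real v0.1 prompts live in the backend.
SYSTEM_PROMPTS_BY_UNIT_TYPE = {
    "grammar": "Explain the rule briefly, then drill it with short examples.",
    "vocabulary": "Quiz words in context; correct gently and add mnemonics.",
    "conversation": "Role-play the dialogue; keep replies at A1 level.",
    "practice": "Generate pattern-based sentence sets for repetition.",
}

def system_prompt_for(unit_type: str) -> str:
    # Fall back to a generic tutor prompt for unclassified units.
    return SYSTEM_PROMPTS_BY_UNIT_TYPE.get(
        unit_type, "Act as a patient A1 German tutor.")
```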

Next

  • Write detailed feature spec for F1 (Claude Streaming Q&A) with spec-interview → commission implementation to competitive-agents
  • Resolve Open Questions (before starting v0.1 development)
  • Create GitHub repo lingua-rag and make initial commit

business-avengers (14:51)

Completed remaining 37 v2.1 audit items + added Phase 0 idea document input feature

Done

  • feat: completed all remaining v2.1 audit items
    • I2: added GTM strategy context to Phase 9 CS Manager
    • I3: separated sprint_context variables for frontend/backend in Phase 5 (sprint_context_frontend, sprint_context_backend)
    • I5: included sprint_goal in Phase 11 sprint_context
    • I7: integrated sprint-review.md template-based review generation into Step 20 (Sprint Completion)
    • I8: fixed undefined workflow variable bug in RESUME mode
    • I10: added explicit phase execution loop mapping (Phase 0→Step 4, …, Phase 12→Step 16)
    • I11: added completion routing (Sprint→Step 20, Orchestra→Step 21)
    • M1-M9+UX4: synced org-structure.yaml and minor cleanup
  • feat: added Phase 0 idea document input feature
    • CPO first asks a branching question: “Do you have an existing document?”
    • If file path provided: loads via Glob+Read / if text pasted: uses as-is
    • If file not found: text fallback + guidance message

Key Decisions

  • Document input method: adopted natural branching within Phase 0 CPO conversation instead of --from-doc command flag (UX-first)
  • File path detection: contains / or \ AND extension check (.md, .txt, .pdf, .docx)
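
As a minimal sketch, the detection rule above in code form, assuming a single hypothetical helper (the plugin itself encodes this rule as prose instructions for the CPO agent, not executable code):

```python
from pathlib import PurePath

# Extensions accepted by the Phase 0 document-input branch.
DOC_EXTENSIONS = {".md", ".txt", ".pdf", ".docx"}

def looks_like_file_path(text: str) -> bool:
    """Heuristic from the decision above: the input must contain a path
    separator AND end in an allowed document extension."""
    text = text.strip()
    has_separator = "/" in text or "\\" in text
    has_doc_extension = PurePath(text).suffix.lower() in DOC_EXTENSIONS
    return has_separator and has_doc_extension
```

So `looks_like_file_path("docs/idea.md")` takes the Glob+Read path, while pasted prose without a separator falls through to the text branch.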

Next

  • E2E full flow test in a new session (Phase 0→1→2)
    • Verify document input branching (both file path / text paste paths)
    • Run 3 parallel Phase 1 agents and verify output

lingua-rag (14:52)

Completed full F1 Competitive Agents pipeline — Alpha vs Beta 2-round competition followed by Fuse, initial push to lingua-rag repo

Done

  • feat: completed F1 spec-interview → created projects/lingua-rag/docs/f1-streaming-qa-spec.md
  • feat: ran full 7-phase Competitive Agents pipeline
    • Phase 1: Alpha (Pragmatist) v1, Beta (Architect) v1 generated in parallel
    • Phase 2~5: 2 rounds of cross-review + improvements — Alpha 82.5/100, Beta 82.5/100 (tied)
    • Phase 6: Judge — Alpha 74.0 / Beta 67.0 (Alpha wins; decisive difference: Beta lacks frontend)
    • Phase 7: selected “Fuse A + B” → created tempo/competitive-agents/lingua-rag-f1/fused/
  • feat: Fused implementation (39 files)
    • Beta backend layered architecture + Alpha frontend adopted
    • backend/app/routers/chat.py: moved history fetch + user persist inside session_lock, LRU 1,000-entry cap
    • frontend/hooks/useChat.ts: added truncated event handling + warning UI
  • feat: cloned GitHub repo lingua-rag → copied 40 fused files → pushed
  • docs: wrote README.md (tech stack, local dev guide, API endpoints, design decisions)

Key Decisions

  • Fuse strategy: Alpha won but Beta’s layered architecture quality was superior → backend from Beta, frontend from Alpha
  • asyncio.Lock scope: both history fetch and user message persist inside lock — prevents race condition (Critical issue from cross-review)
  • Pinned --workers 1: asyncio.Lock is only valid within a single process
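
A minimal sketch of the locking pattern these two decisions describe, with an in-memory stand-in for the real DB calls; names like get_session_lock and handle_chat are hypothetical, not the actual chat.py:

```python
import asyncio
from collections import OrderedDict

MAX_LOCKS = 1000  # LRU cap from the fused chat.py
_session_locks: OrderedDict[str, asyncio.Lock] = OrderedDict()
_history: dict[str, list[str]] = {}  # stand-in for the real DB

def get_session_lock(session_id: str) -> asyncio.Lock:
    # Reuse the session's lock (marking it recently used), or create one
    # and evict the least recently used entry once past the cap.
    if session_id in _session_locks:
        _session_locks.move_to_end(session_id)
    else:
        _session_locks[session_id] = asyncio.Lock()
        if len(_session_locks) > MAX_LOCKS:
            _session_locks.popitem(last=False)
    return _session_locks[session_id]

async def fetch_history(session_id: str) -> list[str]:
    return list(_history.get(session_id, []))

async def persist_message(session_id: str, message: str) -> None:
    _history.setdefault(session_id, []).append(message)

async def handle_chat(session_id: str, user_message: str) -> list[str]:
    # History fetch AND user-message persist both sit inside the lock, so
    # two concurrent requests on one session cannot interleave (the
    # Critical race condition flagged in cross-review).
    async with get_session_lock(session_id):
        history = await fetch_history(session_id)
        await persist_message(session_id, user_message)
    return history
```

Because the locks live in process memory, they only serialize requests inside one worker, hence the --workers 1 constraint.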

Next

  • Set up local dev environment and verify operation (backend + frontend connection test)
  • Deploy backend to Railway + deploy frontend to Vercel
  • Write beta tester recruitment post (Naver Cafe “German Study” + Reddit r/German)

planning-interview (15:06)

Implemented planning-interview v2.0 integrated planning plugin — 4-Phase single flow + added Context Import step

Done

  • feat: finalized planning-interview + spec-interview integration design — Mode only determines interview depth, document selection is a separate step
  • feat(planning-interview): full rewrite of SKILL.md v2.0
    • Phase 1: PRD (Lean Canvas / Product Brief / Full PRD)
    • Phase 2: User Journey Map (new)
    • Phase 3: Technical Specification (ported from spec-interview)
    • Phase 4: Wireframe Specification (new)
    • Mode×Phase question count matrix: Solo(4/3/4/3), Startup(6/5/6/4), Team(9/7/9/6)
    • Session state v2.0: phases_selected, phase_state[1-4], shared_context
    • Cross-phase context: Phase 1 answers referenced in Phases 2-4
  • feat(planning-interview): created 3 new templates — user-journey-map.md, tech-spec.md, wireframe-spec.md
  • chore: updated 3 existing templates to v2.0 (lean-canvas, product-brief, full-prd)
  • chore: updated plugin.json v1.0.0 → v2.0.0
  • fix: re-ran link-local.sh to reconnect symlinks (replaced file copies with symbolic links)
  • feat(planning-interview): added Step 2.5 Context Import
    • Auto-detects when trigger contains content (~50+ words), extracts without asking questions
    • Extracted items: problem, solution, users, features, tech_stack → connected to interview question skip logic
    • Expected question reduction: 6Q → 2-3Q (Solo/Startup), 9Q → 4-5Q (Team)
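
A rough sketch of the Step 2.5 behavior, assuming the ~50-word threshold above; the real logic is prose in SKILL.md, so both function names are illustrative:

```python
CONTEXT_FIELDS = ("problem", "solution", "users", "features", "tech_stack")
MIN_TRIGGER_WORDS = 50  # rough threshold from the design above

def should_import_context(trigger: str) -> bool:
    # Auto-detect: a long trigger likely already carries planning content,
    # so extract CONTEXT_FIELDS from it instead of asking upfront.
    return len(trigger.split()) >= MIN_TRIGGER_WORDS

def should_skip_question(question_field: str, shared_context: dict) -> bool:
    # Skip an interview question when Context Import already filled its field.
    return bool(shared_context.get(question_field))
```

With a well-filled context this is what drops Solo/Startup from 6 questions to 2-3.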

Key Decisions

  • Dropped over-engineered documents: SDD → absorbed into Tech Spec Architecture section, standalone IA file → absorbed into Wireframe Spec
  • Context Import UX: skip questions when trigger contains content, only ask upfront when trigger is short

Next

  • Audit all AskUserQuestion freeform question options for minimum 2-option requirement
  • E2E test in a new session (verify Context Import behavior)

planning-interview (15:19)

Audited and fixed AskUserQuestion constraints — clarified freeform question handling, fixed Step 5 option count overflow bug

Done

  • fix(planning-interview): fixed Step 5 Document Selection option count overflow — 5 options → 4 options (complies with AskUserQuestion max 4 constraint)
  • fix(planning-interview): added Interview Question Convention section
    • Clarified that allow_freeform=true notation in Steps 7–14 interview questions is pseudo-code
    • Added rule: actual AskUserQuestion tool calls prohibited, handle as plain text output
    • Clarified two question types (structured choice vs freeform response) in a table

Key Decisions

  • Freeform question handling strategy: declared rules in a single Convention section instead of modifying 60+ individual question blocks — prioritizes maintainability

Next

  • Run /planning-interview E2E validation in a new session

plugin-tester (17:16)

Refined plugin-tester plugin idea — designed 5-agent architecture and wrote design document

Done

  • docs: analyzed limits of deterministic testing for LLM-driven plugins — defined two layers: “document consistency verification” and “execution verification”
  • docs: finalized verification scope — SKILL.md, agent files, README.md, plugin.json + cross-validation items
  • docs: designed 5-agent architecture
    • doc-collector (haiku) → static-linter (sonnet) → behavior-reviewer (sonnet) → scenario-runner (sonnet) → test-reporter (haiku)
    • Phase 1 (serial) → Phase 2 (parallel) → Phase 3 (simulation) → Phase 4 (report)
  • docs: designed scoring system — 10-point checklist-based, PASS / WARN / FAIL grades (see the sketch after this list)
  • docs: completed docs/plugin-tester-design.md
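
A sketch of the checklist-based scoring; the 10-item checklist comes from the design above, but the exact PASS/WARN cutoffs below are illustrative placeholders, not values from the design doc:

```python
from dataclasses import dataclass

@dataclass
class ChecklistItem:
    name: str
    passed: bool

def grade(items: list[ChecklistItem]) -> tuple[int, str]:
    """Score a 10-item checklist and map the total to PASS / WARN / FAIL.

    Cutoffs (9+ PASS, 7+ WARN) are assumptions for illustration only.
    """
    score = sum(item.passed for item in items)
    if score >= 9:
        return score, "PASS"
    if score >= 7:
        return score, "WARN"
    return score, "FAIL"
```

Explicit per-item criteria are what make reruns reproducible, versus asking an LLM for a holistic score.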

Key Decisions

  • v1.0 scope reduction: start with static-linter + behavior-reviewer only — highest reliability, lowest cost. scenario-runner added in v1.1
  • Checklist-based scoring: explicit criteria scoring instead of LLM subjective judgment → ensures reproducibility

Next

  • Begin actual plugin-tester plugin implementation
  • Write v1.0 agent files (static-linter, behavior-reviewer)
  • Write execution algorithm in skills/plugin-tester/SKILL.md
  • Run pilot test against existing plugins

lingua-rag (17:16)

Added Mermaid diagrams to the lingua-rag repo README, moved planning docs into the repo's docs/, and verified the competitive-agents SKILL.md Step 11.5 improvement

Done

  • docs: added 3 Mermaid diagrams to README.md
    • System architecture (graph TB): browser → Vercel → Railway → DB → Claude full stack
    • SSE chat flow (sequenceDiagram): complete streaming flow
    • DB schema (erDiagram): sessions ↔ conversations ↔ messages relationship diagram
  • docs: moved 3 files from projects/lingua-rag/docs/ to lingua-rag repo /docs/ and pushed
    • decisions.md (ADR-001~004), f1-streaming-qa-spec.md, dev-log.md
  • chore: confirmed competitive-agents SKILL.md Step 11.5 improvements applied (4 edits verified via grep)

Next

  • Record first dev session in dev-log.md (write when starting local environment setup)

competitive-agents (17:16)

Added competitive-agents SKILL.md Step 11.5 — auto-generates project docs (ADR, dev-log, spec) in pipeline final output

Done

  • feat: added “Project Docs” question to Step 1 (presented alongside number of rounds)
    • “Yes — generate docs/ (Recommended)”: generates decisions.md, dev-log.md, spec.md in final/docs/
    • “No — skip docs”: skip for quick experiments or plugin-only cases
  • feat: added new Step 11.5 (between Step 11 Execute Decision and Step 12 Completion Summary)
    • final/docs/decisions.md: auto-extracts 3~6 ADRs from judge-report Strengths to Preserve + per-criterion scores + mission
    • final/docs/dev-log.md: empty dev log template
    • final/docs/spec.md: copied only when spec file is provided
    • Claude generates inline without subagents (mission and judge-report already in context; see the sketch after this list)
  • docs: updated Step 12 completion summary + Quick Reference output tree
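
The Step 11.5 routing above, reduced to a pseudocode-style sketch; the skill executes this as prose instructions (Claude writes decisions.md inline), so every name here is illustrative:

```python
from pathlib import Path
import shutil

def extract_adrs(judge_report: str, mission: str) -> str:
    # In the real skill this is Claude's inline job: distill 3~6 ADRs from
    # "Strengths to Preserve", per-criterion scores, and the mission.
    return "# Decisions\n\n<!-- ADR-001 ... -->\n"

def generate_project_docs(final_dir: Path, judge_report: str,
                          mission: str, spec_file: Path | None) -> None:
    docs = final_dir / "docs"
    docs.mkdir(parents=True, exist_ok=True)
    (docs / "decisions.md").write_text(extract_adrs(judge_report, mission))
    (docs / "dev-log.md").write_text("# Dev Log\n")  # empty template only
    if spec_file is not None:                        # copied only when provided
        shutil.copy(spec_file, docs / "spec.md")
```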

Key Decisions

  • Adopted inline generation: judge-report and mission already in memory, so generate directly without separate agents. Faster with no context loss
  • ADR source = judge-report: “Strengths to Preserve”, “Why Winner”, per-criterion analysis naturally captures architecture decisions
  • Opt-in approach: recommended by default but selectable → accommodates quick experiment cases
  • Trigger: LinguaRAG F1 fused output had no decisions.md, requiring manual writing during repo migration → automated

Next

  • Verify Step 11.5 behavior after competitive-agents run (select generate_docs = Yes)
    • Confirm decisions.md ADR quality
    • Confirm dev-log.md template format
    • Confirm spec.md copy behavior when spec file is provided
  • Standardize decisions.md ADR format

lingua-rag-planning-docs (18:15)

planning-interview v2.0 E2E test — generated 4 planning documents for LinguaRAG (PRD, User Journey Map, Tech Spec, Wireframe Spec)

Done

  • docs(lingua-rag): generated 4 planning documents using planning-interview v2.0 Startup mode
    • projects/lingua-rag/prd.md — Product Brief (imported existing Product Brief → added only 1 GTM question)
    • projects/lingua-rag/user-journey-map.md — 3 Journeys (first visit/onboarding, regular Q&A, unit switching), 6-step onboarding ~2 min, retention loop
    • projects/lingua-rag/tech-spec.md — architecture (Next.js 15 + FastAPI + PostgreSQL + Railway/Vercel), 6 FRs, 3-Phase implementation plan, NFR
    • projects/lingua-rag/wireframe-spec.md — 4 screen ASCII layouts, responsive/accessibility

Key Decisions

  • Unit panel UI: tree structure (textbook > Band > unit expand/collapse) — better unit context visibility than dropdown
  • Input field character counter: real-time display at bottom-right (“48/500”) — more predictable than showing only on overflow
  • v0.1 Auth strategy: none (single user private) → plan to add NextAuth.js email Magic Link in v0.3
  • Context window: sliding window of last 10 messages — consider summary compression in v0.2
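
A minimal sketch of the sliding-window decision, assuming messages are stored oldest-first:

```python
WINDOW_SIZE = 10  # last 10 messages, per the decision above

def build_context(messages: list[dict]) -> list[dict]:
    # Keep only the most recent WINDOW_SIZE messages for the Claude call;
    # older turns are simply dropped (summary compression deferred to v0.2).
    return messages[-WINDOW_SIZE:]
```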

Next

  • Start Phase 1 implementation based on tech-spec.md (FastAPI POST /api/chat SSE)
  • Implement chat UI based on wireframe-spec.md (Next.js 15)

planning-interview (18:15)

Completed full planning-interview E2E validation — successfully generated 4 documents after importing LinguaRAG Product Brief

Done

  • test(planning-interview): validated full E2E flow (Startup Mode, 4-Phase, existing document import path)
    • Step 2.5 Context Import: selected “has existing document” → LinguaRAG Product Brief import working correctly ✅
    • shouldSkipQuestion: 5 of 6 Phase 1 questions auto-skipped (pre-filled), only 1 GTM question asked ✅
    • Phase 2 interview: plain text questions output correctly (AskUserQuestion not called), 2 questions ✅
    • Phase 3/4 interview: skip logic + remaining questions handled correctly ✅
  • feat(lingua-rag): completed 4 planning documents via planning-interview

Key Decisions

  • Phase 4 unit panel: adopted tree structure — easier to grasp unit context than dropdown
  • Input counter: always show real-time “48/500” at bottom-right

Next

  • Consider SKILL.md improvements based on real-world planning-interview test feedback
    • Clarity of handoff messages between phases
    • Improve naturalness of user notification text when shouldSkipQuestion skips

truth-checker (20:30)

Developed the Truth Checker misinformation-analysis service idea and wrote a comprehensive document — Korean-specialized fact-checking service

Done

  • feat: developed the Truth Checker service idea
    • Referenced and analyzed Goorm DeepDive, confirmed AI Hub clickbait article detection dataset (71,338 entries)
    • Matched user’s tech stack (React, Claude API, Recharts, Supabase)
  • docs: wrote comprehensive idea document (truth-checker-service-idea.md, 30+ sections)
    • Problem: misinformation flood, lack of fact-checking tools, market size
    • Solution: 5-step verification pipeline
    • Core Features: input (URL/text/image), claim extraction (Claude API), source credibility analysis (A-D grades), clickbait detection (AI Hub data), similar misinformation search, credibility score (0-100), visualization dashboard
    • Business Model: Freemium ($0/$10/$30), B2B ($500-2k)

Key Decisions

  • Service name: Truth Checker
  • MVP duration: 4 weeks (Week 1: claim extraction + source verification, Week 2: Fact Checking engine, Week 3: visualization, Week 4: AI Hub data integration)
  • Core differentiation: Korean-specialized + supports all content types (URL/text/image) + fast verification (10-30 sec vs SNU FactCheck’s days)
  • Credibility score formula: source (30%) + cross-validation (40%) + logic (30%) = 100 points (see the sketch after this list)
  • Final tech stack: React 18 + TypeScript + Vite, Recharts + React Flow, Node.js + Supabase, Claude API + fine-tuned KoBERT
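
The credibility formula above as a worked sketch, assuming each component is already normalized to a 0-100 scale:

```python
# Weights per the decision above: source 30%, cross-validation 40%, logic 30%.
WEIGHTS = {"source": 0.30, "cross_validation": 0.40, "logic": 0.30}

def credibility_score(source: float, cross_validation: float,
                      logic: float) -> float:
    """Weighted sum; inputs are assumed to be 0-100 component scores."""
    return (WEIGHTS["source"] * source
            + WEIGHTS["cross_validation"] * cross_validation
            + WEIGHTS["logic"] * logic)

# Example: strong sources (80), solid cross-validation (90), decent logic (70)
# -> 0.3*80 + 0.4*90 + 0.3*70 = 81.0
```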

Next

  • Apply for AI Hub clickbait article dataset download
  • Initialize React project
  • Start MVP Week 1 development (URL input form, Cheerio crawling, Claude API claim extraction)
  • Detail MVP spec (consider using planning-interview skill)