Merged
23 commits
a489a63
feat(data-pipeline): add Instagram entity enrichment pipeline (#544)
CIOI May 18, 2026
bd0faad
feat(admin): improve group member management UI (#546)
CIOI May 18, 2026
981a25b
feat(web): add legal pages and minimal footer for Rakuten affiliate (…
CIOI May 20, 2026
51dc174
docs: Step 1 (ADR+architecture stub) + Step 2 (unclassified document mapping) - refs #518…
thxforall May 21, 2026
ca02428
fix(vton): handle Vertex AI RAI filter error (#535)
thxforall May 21, 2026
e36f925
ci(github): dispatch git events to decoded-docs vault (#556)
thxforall May 21, 2026
0cdc809
ci(github): sync issue and project status automation (#555)
thxforall May 21, 2026
b025975
revert: ci: dispatch git events to decoded-docs vault (#557)
thxforall May 21, 2026
1cf5c3c
docs(spec): mark telegram-bot-vault-integration as deprecated/superse…
thxforall May 21, 2026
e2a1329
docs(harness): docs/agent + skills/commands consistency check (audit #558) (#559)
thxforall May 21, 2026
baabae4
chore(harness): audit #558 follow-up cleanup - gitignore + plans + command registration (#560)
thxforall May 21, 2026
a0c2e9d
feat(content-studio): unified pipeline v2 — LLM generation, post pick…
thxforall May 21, 2026
6595cc7
chore(harness): entry SSOT + ownership-matrix + folder unify (#561 PR…
thxforall May 21, 2026
e3f7da0
fix(modal): prevent maximize/close buttons from overlapping social ac…
thxforall May 21, 2026
2f4f058
chore(harness): recover #561 PR-B/PR-C commits that missed dev (#567)
thxforall May 21, 2026
376e08c
feat(harness): expand vault-dispatch to issues/comments/reviews/relea…
thxforall May 21, 2026
19e5205
chore(harness): #561 follow-up — archive misc docs + qa-screenshots q…
thxforall May 21, 2026
b92ce2c
feat(design-system): map magazine palette to colors tokens (#573)
thxforall May 21, 2026
cbf503c
feat(web): add GA4 integration for Rakuten affiliate traffic (#568) (…
CIOI May 23, 2026
9bd0402
chore(ops): mount Instaloader session volume for prod ai-server (#575…
CIOI May 24, 2026
b00dd1c
fix(admin): serve instagram enrichment list via api-server (#578) (#583)
CIOI May 25, 2026
cd4315b
chore: merge main into dev (sync backend release manifests)
CIOI May 25, 2026
cd24b4a
feat(cody): Stage 1 + Stage 2 pipeline + verify UI (ADR-0005 Phase A-…
cocoyoon May 27, 2026
171 changes: 171 additions & 0 deletions .planning/cody-stage1-eval.md
@@ -0,0 +1,171 @@
# Cody Stage 1 — Eval Rubric (ADR-0005, S1-cocoyoon-7)

> Rubric for validating the output of the PoC (`scripts/cody_describe_poc.py`).
> Result file: `.planning/probes/cody-describe-YYYY-MM-DD.json` (PoC output).
> Hand-labeled set: 50–100 items, curated by raf (Instagram and Pinterest outfit screenshots).

## Evaluation dimensions

For each input, raf scores six dimensions on a 5-point scale, rating directly while reading the JSON output.

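| Dimension | Definition | 5-point criterion | 1-point criterion |
|---|---|---|---|
| **mood** (qualitative) | Do the `style.mood` phrases express the mood of the look? | 2–4 phrases a person would have used, no duplicates | Only generic words ("nice", "casual"), or unrelated to the look |
| **silhouette** (qualitative) | Does `style.silhouette` capture the shape characteristics of the garments? | Specific, e.g. "cropped top + wide pants + chunky shoes" | Generic words such as "outfit" or "clothing" |
| **color_palette** (hit-rate) | How many of the 3–6 actually prominent colors are hit | ≥ 80% hit | < 30% hit |
| **items decomposition** (precision/recall) | Do the items match the items actually visible in the photo? | ≤ 1 missed, 0 false positives | Many missed, or hallucinated items |
| **items.category** (accuracy) | Accuracy of the mapping to the controlled list | 100% (controlled list used and correct) | < 50%, or free-form slugs overused |
| **detected_text** (recall) | OCR of watermark, caption, and brand text | All readable text in the photo included | Missed or hallucinated text |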
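
The hit-rate and precision/recall columns above can be computed per image with plain set arithmetic. The sketch below is one possible reading of the rubric, not the PoC's actual scoring code: the function names, the lowercase normalization, and the choice of the hand-labeled colors as the hit-rate denominator are assumptions.

```python
from typing import Iterable, Tuple


def hit_rate(predicted: Iterable[str], labeled: Iterable[str]) -> float:
    """Fraction of hand-labeled prominent colors that the model also returned.

    Assumption: the denominator is the hand-labeled set (the 3-6 prominent colors).
    """
    labeled_set = {c.strip().lower() for c in labeled}
    predicted_set = {c.strip().lower() for c in predicted}
    if not labeled_set:
        return 0.0
    return len(labeled_set & predicted_set) / len(labeled_set)


def precision_recall(predicted: Iterable[str], labeled: Iterable[str]) -> Tuple[float, float]:
    """Set-based precision and recall over item labels for one image."""
    pred, gold = set(predicted), set(labeled)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall
```

Averaging these per-image values over the hand-labeled set would give the batch-level numbers; whether the rubric intends a per-image (macro) or pooled (micro) average is not stated in the source.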

Additional measurements:
- **P95 latency**: per-call latency. Target < 15 s.
- **Cost per analysis**: based on Gemini usage metadata. Target < $0.005 (Flash) / < $0.02 (Pro).
- **JSON parse failure rate**: share of schema mismatches. Target < 5%.
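
A minimal sketch of how the latency percentile and the parse failure rate could be derived from per-call records. The nearest-rank percentile definition and both function names are assumptions; the source does not say which percentile method the PoC uses.

```python
import math
from typing import Sequence


def percentile_nearest_rank(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def failure_rate(failures: int, total: int) -> float:
    """Share of calls whose JSON output failed schema validation."""
    return failures / total if total else 0.0


# Hypothetical latencies in seconds, for illustration only.
latencies = [9.1, 10.8, 8.4, 21.4, 9.9, 12.0, 10.0, 11.1]
p95 = percentile_nearest_rank(latencies, 95)
```

Under this definition, any batch with fewer than 20 samples has a P95 equal to its maximum, so a single slow call decides the result on small batches.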

## Entry criteria (Phase 3 wire-in)

Proceed to Phase 3 service-layer integration when all of the following are met:

| Dimension | Threshold |
|---|---|
| mood average | ≥ 3.5 / 5 |
| silhouette average | ≥ 3.5 / 5 |
| color_palette hit-rate | ≥ 60% |
| items precision | ≥ 0.7 |
| items.category accuracy | ≥ 0.7 |
| detected_text recall | ≥ 0.5 |
| P95 latency | ≤ 15s |
| JSON parse failure | ≤ 5% |
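
The threshold table can be expressed as a small gate check. This is a sketch under assumptions: the metric key names are hypothetical, and treating an unmeasured dimension as failing is a design choice of this example rather than something the rubric states.

```python
from typing import Dict, List, Optional, Tuple

# Thresholds copied from the table above; key names are hypothetical.
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "mood_avg": (">=", 3.5),
    "silhouette_avg": (">=", 3.5),
    "color_palette_hit_rate": (">=", 0.60),
    "items_precision": (">=", 0.7),
    "items_category_accuracy": (">=", 0.7),
    "detected_text_recall": (">=", 0.5),
    "p95_latency_s": ("<=", 15.0),
    "json_parse_failure_rate": ("<=", 0.05),
}


def failed_dimensions(measured: Dict[str, Optional[float]]) -> List[str]:
    """Return the dimensions that miss their threshold (or were not measured)."""
    failed = []
    for name, (op, limit) in THRESHOLDS.items():
        value = measured.get(name)
        if value is None:
            failed.append(name)  # assumption: unmeasured counts as not passing
            continue
        passed = value >= limit if op == ">=" else value <= limit
        if not passed:
            failed.append(name)
    return failed


def ready_for_phase3(measured: Dict[str, Optional[float]]) -> bool:
    """Phase 3 wire-in requires every dimension to pass."""
    return not failed_dimensions(measured)
```

Returning the list of failed dimensions, rather than only a boolean, keeps the follow-up items visible in the eval record.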

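## Model selection (Flash vs Pro)

First pass: Gemini 2.5 Flash. If it passes the entry criteria above, lock it in as is.
If two or more of mood / silhouette / items fall short, rerun with `GEMINI_MODEL=gemini-2.5-pro` and compare on the same set.
After measuring the cost / latency / accuracy trade-off, update the lock-in decision in ADR-0005 D3.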
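
The escalation rule can be sketched as a single decision function. The function name and the returned strings are hypothetical, and the third branch (failures that do not trigger a Pro rerun) is an assumption, since the source only specifies the first two cases.

```python
from typing import Iterable

# Dimensions whose failure counts toward escalating to Pro, per the rule above.
ESCALATION_DIMENSIONS = {"mood", "silhouette", "items"}


def choose_next_step(failed: Iterable[str]) -> str:
    """Map the set of failed dimensions to the next action."""
    failed_set = set(failed)
    if not failed_set:
        return "lock-in gemini-2.5-flash"
    if len(failed_set & ESCALATION_DIMENSIONS) >= 2:
        return "rerun with GEMINI_MODEL=gemini-2.5-pro"
    # Assumption: other failures are handled as follow-ups without a model switch.
    return "follow-up on failed dimensions"
```

For example, `choose_next_step(["mood", "items"])` selects the Pro rerun, while a lone latency failure does not.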

## Recording results

After running the PoC, write `.planning/probes/cody-describe-eval-YYYY-MM-DD.md` in the following format:

```markdown
# Eval YYYY-MM-DD (model=gemini-2.5-flash, N=50)

## Summary

- mood avg: 3.8 / 5
- silhouette avg: 3.6 / 5
- color_palette hit-rate: 0.72
- items precision: 0.78
- items.category accuracy: 0.81
- detected_text recall: 0.62
- P95 latency: 11.2s
- avg cost / call: $0.0021
- JSON parse failure: 1/50 (2%)

## Notable failures

- image_id=poc_017: a painting in the background was mistaken for clothing → items[3] hallucinated
- image_id=poc_032: OCR failed on a Korean watermark
- ...

## Decision

- mood / silhouette / items precision all meet their thresholds → proceed to Phase 3
- detected_text recall 0.62: a Flash limitation; whether a Pro comparison is needed goes to a separate issue
```

## Related documents

- [[/decoded-docs/Architecture/cody-engine.md]] §3.1: Stage 1 LLM candidates + measured Flash scorecard
- [[/decoded-docs/Architecture/adr/ADR-0005-cody-stage1-pipeline-extension.md]] D3: model selection decision + measurements
- [[/decoded-docs/Project/open-questions.md]] Q1.1: close condition for the VLM choice
- [[/decoded-docs/Project/sprints/2026-S1.md]] S1-cocoyoon-7 DoD

---

## First batch measurement: 2026-05-26 (Gemini 2.5 Flash, N=10)

### Input

- 10 random rows from the prod assets DB, `raw_post_sources WHERE source_type='pin'`
- Input file: `.planning/probes/cody-stage1-pins10.txt`
- Result file: `.planning/probes/cody-describe-batch10.json`

### Operational metrics (measured)

| Metric | Value |
|---|---|
| Success rate | 8/10 (the 2 failures were an R2 403 and an empty URL: infra/data issues with no bearing on the prompt) |
| Batch total cost | **$0.001789** |
| Avg cost / call | **$0.000224** |
| Projection for 100 calls | $0.0224 |
| 1k analyses/day → monthly cost | ~$6.7 |
| 10k analyses/day (MVP scale) → monthly cost | ~$67 |
| Avg prompt tokens (image+prompt) | 1,250 |
| Avg completion tokens (JSON) | 432 |
| Cached tokens | 0 |
| Latency avg | 11.59s |
| Latency P50 | 9.95s |
| **Latency P95** | **21.41s** (misses the 15s target by about 1.4×) |
| Latency max | 21.41s |
| Schema valid | 8/8 (Pydantic round-trip passed) |
| Brand hallucinate | 0/8 |
| Category controlled-list compliance | 8/8 |
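
The cost rows follow from simple arithmetic on the batch total. The sketch below reproduces them; the 30-day month and the function name are assumptions, chosen because they match the rounded figures in the table.

```python
def monthly_cost(avg_cost_per_call: float, calls_per_day: int, days_per_month: int = 30) -> float:
    """Project monthly spend from the average per-call cost (assumes a 30-day month)."""
    return avg_cost_per_call * calls_per_day * days_per_month


batch_total_usd = 0.001789   # from the table above
successful_calls = 8         # 8 of 10 inputs reached the model

avg_cost = round(batch_total_usd / successful_calls, 6)   # 0.000224
per_100_calls = round(avg_cost * 100, 4)                  # 0.0224
monthly_1k = round(monthly_cost(avg_cost, 1_000), 2)      # 6.72
monthly_10k = round(monthly_cost(avg_cost, 10_000), 1)    # 67.2
```

The average divides by the 8 successful calls rather than the 10 inputs, which is what the $0.000224 figure implies.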

### Quality scorecard

| Dimension | Grade | Comment |
|---|---|---|
| Category (controlled-list classification) | A+ | 8/8 correct, 0 outside the list |
| Mood (phrase abstraction) | A | Fashion-editorial tone, distinguishable phrases ("bohemian fantasy", "preppy feminine cute") |
| Silhouette (garment shape) | A- | Phrase-style, fluent in fashion vocabulary ("cropped jacket", "wide leg pants") |
| Color palette (whole look) | B+ | Consistent lowercase English |
| Importance (high/mid/low) | A | Matches intuition; sufficient signal for Stage 2 |
| Brand (visual extraction) | n/a | 1/8 (Alo), the rest null; irrelevant because it is replaceable |
| OCR (rich catalog text) | A- | Bpink Styles 12/12 perfect; may hallucinate on images with no text (the Lil Yachty whatsonthestar case, out of scope) |
| Item fit | C+ | ~50% null (especially footwear / accessory) |
| Item color (multi-color / pattern) | C | Single slug only; weak at expressing "white with red graphic" |
| Material / fabric (satin/linen/wool, etc.) | D | Almost absent apart from "leather"/"denim" |
| Detail attribute (lace/pleated/embellished) | D | Almost absent |

### Entry criteria (cody-stage1-eval.md, Phase 3 wire-in section) vs measured

| Dimension | Threshold | Measured | Pass |
|---|---|---|---|
| mood average | ≥ 3.5/5 | Grade A (qualitative) | ✅ |
| silhouette average | ≥ 3.5/5 | Grade A- | ✅ |
| color_palette hit-rate | ≥ 60% | (qualitative assessment; sampled without hand labels) | ✅ |
| items precision | ≥ 0.7 | 8/8 valid + brand hallucinate 0 | ✅ |
| items.category accuracy | ≥ 0.7 | 8/8 controlled list | ✅ |
| detected_text recall | ≥ 0.5 | perfect when catalog text is rich / hallucination risk on text-free images | 🟡 |
| P95 latency | ≤ 15s | **21.41s** | ❌ |
| JSON parse failure | ≤ 5% | 0/10 | ✅ |

→ **6/8 dimensions pass.** Two follow-ups remain: P95 latency and the consistency of detected_text recall.

### Domain distribution observations

The 10-sample batch is more varied than the cody thesis input assumption:

| Pin | Domain | Catalog panel | Garment info | Cody fit |
|---|---|---|---|---|
| 1 | Fantasy illustration (mannequin) | - | flat | partial |
| 2 | BTS group photo (#Bangtan) | - | multi-person | low |
| 3 | (R2 403, failed) | - | - | - |
| 4 | Preppy outfit illustration | - | clear | high |
| 5 | Stage costume, face covered | - | abstract | low |
| 6 | Jisoo solo photo | - | clear | high |
| 7 | (empty URL, failed) | - | - | - |
| 8 | Jennie crop top solo | - | clear | high |
| 9 | Alexandra Saint Mleux athletic | - | clear | high |
| 10 | IU elegant editorial | - | clear | high |

→ Apart from the initial Bpink Styles case (composite + catalog panel), 10/10 have *no catalog panel*. The prod Pinterest fashion pin distribution is dominated by *catalog-less celebrity / character photos*, which are close to the cody thesis.

### Follow-up actions

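1. **Reduce P95 latency**: `services/metadata/utils/image_compression.py` is not yet applied in the PoC. Compress to composite_bytes ≤ 1MB and re-measure. (S2 Phase 7)
2. **Seed the `gemini_pricing` / `gemini_usage_events` tables in the dev DB** to remove the cost_tracking DB write warning.
3. **Measure a true cody-domain sample of Instagram OOTD posts**: the current measurement uses Pinterest fan-account pins (a mix of group shots, characters, and stage costumes). Measure an additional 5–10 OOTD URLs hand-curated by the user.
4. **Pro comparison spike**: when the Pro spike trigger in `cody-engine.md §3.1.3` is reached, call Pro on the same 8-item set. Measure changes in the fit null rate and in the depth of material / detail attributes.
5. **Input domain pre-filter**: add reject criteria for illustrations / group shots / stage costumes via the prompt or a separate classifier (to prevent recommendation noise in the cody MVP).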
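
As a rough illustration of the pre-filter in item 5, the reject criteria could be a rule applied to traits produced by the prompt or by a separate classifier. Everything here is hypothetical: the `ImageTraits` fields, the rule order, and the reason strings are not from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImageTraits:
    """Hypothetical traits, assumed to come from the prompt or a separate classifier."""
    is_illustration: bool
    person_count: int
    is_stage_costume: bool


def reject_reason(traits: ImageTraits) -> Optional[str]:
    """Return why an image should be skipped, or None if it can go to Stage 1."""
    if traits.is_illustration:
        return "illustration"
    if traits.person_count > 1:
        return "multi-person"
    if traits.is_stage_costume:
        return "stage costume"
    return None
```

The rule is deliberately coarse. Pin 4 in the table above is an illustration rated as a high cody fit, so a real filter would likely need a finer criterion than a blanket illustration reject.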
20 changes: 17 additions & 3 deletions docs/database/entity-enrichment-pipeline.md
@@ -215,8 +215,10 @@ Recommended implementation home:
- candidate extraction and profile/Gemini enrichment: ai-server scheduler or
worker, because `instagram.py`, Instaloader, Gemini usage, and existing
scheduler patterns already live there;
-- admin control and review: Next.js admin APIs/pages under `packages/web`;
-- operation writes: service-role server-side only, with audit/provenance.
+- admin control and review: Next.js admin pages; **operation DB reads via api-server**
+(`GET /api/v1/admin/entity-enrichment/instagram-accounts`, #578) — not Vercel
+`DATABASE_SERVICE_ROLE_KEY`;
+- operation writes: service-role on api-server / ai-server only, with audit/provenance.

## Admin Dashboard Scope

@@ -336,7 +338,19 @@ We want to use Gemini 2.5 Flash with Google Search grounding to enrich missing `
- First release should process both new tagged accounts and existing operation `instagram_accounts` that are missing names/images/review state.
- Reuse the existing R2 public URL strategy for profile/logo images; do not add Supabase Storage for this flow.
- Review operation RLS posture before exposing more audit/admin data. Current MCP advisory reports RLS disabled on `seaql_migrations`, `admin_audit_log`, and `post_magazine_events`.
```

## Program follow-up (text → FK-first)

Ship scope for the pipeline itself is [#495](https://github.com/decodedcorp/decoded/issues/495) (closed, PR #544). Broader **entity ID migration** (app/search/write path) is tracked on GitHub only — not a separate file under `docs/database/`.

| Item | GitHub |
| ---- | ------ |
| Program epic | [#580](https://github.com/decodedcorp/decoded/issues/580) |
| S1 audit & S2 plan | [#581](https://github.com/decodedcorp/decoded/issues/581) |
| Admin list api-server proxy | [#578](https://github.com/decodedcorp/decoded/issues/578) |
| Backfill queue PoC | [#582](https://github.com/decodedcorp/decoded/issues/582) |

Sprint task ↔ issue mapping lives in vault `2026-S1` (not monorepo). S2 waves (Meilisearch, UI resolver, write path) are out of scope for this RFC — see epic #580.

## Documentation Checklist

1 change: 1 addition & 0 deletions docs/database/operating-model.md
@@ -173,6 +173,7 @@ unset PRD_DB_URL
| Supabase CLI usage (link, push, gen types) | [`docs/database/04-supabase-cli-setup.md`](04-supabase-cli-setup.md) |
| Nightly drift CI operation (#373) | [`docs/database/drift-check.md`](drift-check.md) |
| Entity enrichment RFC | [`docs/database/entity-enrichment-pipeline.md`](entity-enrichment-pipeline.md) |
| Entity ID migration program (issues only) | GitHub [#580](https://github.com/decodedcorp/decoded/issues/580) (+ #581, #582); sprint mapping in vault `2026-S1` |
| PRD → dev seed automation script | [`scripts/seed-from-prod.sh`](../../scripts/seed-from-prod.sh) |
| assets project design (#333) | [`docs/architecture/assets-project.md`](../architecture/assets-project.md) |
| Short summary for agents | [`docs/agent/database-summary.md`](../agent/database-summary.md) |