DEVLOG - AI Job Application Agent

This document tracks notable implementation milestones and technical decisions.

Historical note:

  • earlier entries reflect the product and architecture assumptions at the time they were written
  • later entries supersede earlier history when the product direction changed, especially around LinkedIn import, workflow history, session-level quota UX, and persistence structure

Day 1: Project Setup and Resume Parsing

  • Initialized the repository, virtual environment, license, and Streamlit shell.
  • Added an MVP navigation flow covering resume upload, LinkedIn import, job search, and manual JD input.
  • Chose lightweight parsing dependencies first:
    • pypdf
    • python-docx

Day 2: Demo Inputs and Unified File Parsing

  • Added sample resumes and sample job descriptions under static/.
  • Updated parsing code so both uploaded files and local demo files work through the same logic.
  • Added basic job-description cleaning and simple extraction for title, location, experience, and skills.

Day 3: LinkedIn Import and Session Persistence

  • Added LinkedIn data-export ZIP ingestion instead of direct LinkedIn API access.
  • Parsed summary, education, skills, preferences, publications, and position history where present.
  • Stored parsed payloads in st.session_state so the UI survives navigation and reruns.

Day 4: Repo Structure Alignment With GitHub Agent

  • Moved the active application logic into src/.
  • Refactored app.py into a cleaner Streamlit entrypoint with section-level render functions.
  • Added parser-focused tests under tests/.
  • Added docs/architecture.md, ADR files under docs/adr/, a roadmap, and a real README.
  • Improved parsing behavior:
    • TXT resumes are now supported
    • job-description cleanup preserves line breaks
    • JD source persistence now stores parsed text instead of the raw uploaded file object

Day 5: Modular UI and Defensive Parser Refactor

  • Reduced root app.py to a thin entrypoint and moved UI composition into src/ui/.
  • Split the codebase into clearer layers:
    • src/parsers/ for raw ingestion and extraction
    • src/services/ for normalization and deterministic workflow helpers
    • src/ui/ for Streamlit theme, components, navigation, and pages
  • Kept top-level parser modules as compatibility wrappers so existing imports and tests continue to work.
  • Added ResumeDocument to the shared schemas and started using typed objects more consistently in the UI.
  • Hardened parser behavior with more defensive checks:
    • explicit empty-file handling
    • clearer unsupported-format failures
    • safer PDF and DOCX open failures
    • better LinkedIn ZIP validation and normalization
  • Verified the refactor with:
    • uv run pytest
    • uv run python -m compileall app.py src tests
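
The defensive checks listed above can be sketched as a small pre-parse guard. This is a minimal illustration, not the project's parser code: `ResumeParseError`, `validate_upload`, and the `SUPPORTED_SUFFIXES` set are hypothetical names chosen for the example.

```python
from pathlib import Path


class ResumeParseError(ValueError):
    """Hypothetical typed failure raised before any parsing starts."""


# Assumed set of formats; TXT support is mentioned in the Day 4 entry.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".txt"}


def validate_upload(filename: str, data: bytes) -> str:
    """Return the normalized file suffix, or raise a clear error."""
    if not data:
        raise ResumeParseError(f"{filename!r} is empty")
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        shown = suffix or "(no extension)"
        raise ResumeParseError(f"unsupported format {shown} for {filename!r}")
    return suffix
```

Failing before the PDF or DOCX library is opened keeps the error message about the upload itself rather than about a library internals failure.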

Day 6: Deterministic Fit Analysis Foundation

  • Expanded the typed schema layer with:
    • FitAnalysis
    • TailoredResumeDraft
    • richer CandidateProfile source signals
  • Added shared keyword taxonomy in src/taxonomy.py so resume and JD matching use the same vocabulary.
  • Improved profile normalization in src/services/profile_service.py:
    • basic name and location inference from resumes
    • keyword extraction from resume text
    • merge logic for resume and LinkedIn candidate sources
    • candidate-context text assembly for downstream analysis
  • Improved job normalization in src/services/job_service.py:
    • empty-input validation
    • must-have and nice-to-have signal extraction
    • cleaner requirement deduplication
  • Added new deterministic workflow services:
    • src/services/fit_service.py
    • src/services/tailoring_service.py
  • Updated the Streamlit JD page to render:
    • merged candidate readiness
    • fit score and gap analysis
    • first-pass tailored resume guidance
  • Extended test coverage for:
    • profile normalization
    • job normalization
    • fit scoring
    • tailoring output
  • Verified the new workflow layer with:
    • uv run pytest
    • uv run python -m compileall app.py src tests

Day 7: Supervised Agent Workflow Layer

  • Added the first supervised multi-agent stack under src/agents/:
    • profile_agent.py
    • job_agent.py
    • fit_agent.py
    • tailoring_agent.py
    • review_agent.py
    • orchestrator.py
  • Added src/prompts.py for centralized grounded prompt construction.
  • Added src/openai_service.py as a thin OpenAI wrapper with JSON-response validation and typed failure handling.
  • Expanded schemas with typed agent outputs and AgentWorkflowResult.
  • Kept the system defensive:
    • the agent workflow only runs when the user explicitly clicks a button
    • OpenAI usage is optional
    • if model execution is unavailable or fails, orchestration falls back to deterministic output
  • Updated the JD page so it can now:
    • run supervised orchestration on demand
    • cache workflow results against the current candidate/JD signature
    • render profile positioning, fit narrative, tailoring output, and review notes
  • Added orchestrator tests covering:
    • deterministic fallback mode
    • successful AI-assisted mode with a fake service
    • graceful fallback when AI execution fails
  • Verified the agent layer with:
    • uv run pytest
    • uv run python -m compileall app.py src tests
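
The fallback rule described above (assisted output when available, deterministic output otherwise) can be sketched as follows. The function name and the `mode` key are assumptions made for this example; the real orchestrator returns a typed AgentWorkflowResult rather than a dict.

```python
from typing import Callable, Optional


def run_with_fallback(
    deterministic: Callable[[], dict],
    assisted: Optional[Callable[[], dict]] = None,
) -> dict:
    """Prefer the assisted result, but never fail the workflow because of it."""
    if assisted is not None:
        try:
            return {"mode": "ai_assisted", **assisted()}
        except Exception:
            # A real implementation would catch a narrower typed error and log it.
            pass
    return {"mode": "deterministic", **deterministic()}
```

The three orchestrator test cases listed above map directly onto this shape: no assisted callable, a working fake, and a fake that raises.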

Day 8: Report Builder and Export Layer

  • Added src/report_builder.py to assemble a deterministic application package from:
    • candidate profile
    • job description
    • fit analysis
    • tailored draft
    • optional supervised agent output
  • Added src/exporters.py for initial package export handling.
  • Updated the JD page to:
    • render an application-package preview
    • expose Markdown download
    • automatically upgrade the package when agent output is available
  • Updated top-level UI copy so the app now reflects package/export readiness.
  • Added tests covering:
    • report construction
    • export byte formatting
  • Verified the report/export layer with:
    • uv run pytest
    • uv run python -m compileall app.py src tests

Day 9: Playwright-First PDF Export

  • Upgraded the export layer to support polished PDF output.
  • Chose the same pattern used in the GitHub agent:
    • Playwright/Chromium as the primary PDF renderer
    • ReportLab as the fallback backend
  • Updated the JD page so users can:
    • prepare a PDF package explicitly
    • download a polished PDF once it is generated
  • Kept Markdown export as the editable output format for users who want to make manual changes before sharing.
  • Added export tests covering:
    • HTML report generation
    • ReportLab fallback when the Playwright backend fails
    • typed failure handling when both PDF backends fail
  • Installed and validated local PDF dependencies, including Chromium for Playwright.
  • Verified PDF export behavior with:
    • uv run pytest
    • uv run python -m compileall app.py src tests
    • direct Playwright and fallback PDF smoke checks
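
The primary-plus-fallback renderer pattern above can be sketched without touching either library. Backends are passed in as plain callables so the chain can be exercised with fakes; `PdfExportError` and `render_pdf` are hypothetical names, and the real exporter may be structured differently.

```python
from typing import Callable, List, Sequence, Tuple


class PdfExportError(RuntimeError):
    """Hypothetical typed failure raised when every backend fails."""


Backend = Tuple[str, Callable[[str], bytes]]


def render_pdf(html: str, backends: Sequence[Backend]) -> Tuple[str, bytes]:
    """Try each backend in order and return (backend_name, pdf_bytes)."""
    errors: List[str] = []
    for name, render in backends:
        try:
            return name, render(html)
        except Exception as exc:
            errors.append(f"{name}: {exc}")
    raise PdfExportError("all PDF backends failed: " + "; ".join(errors))
```

In the real app the list would presumably hold a Playwright-based renderer first and a ReportLab-based renderer second.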

Day 10: Codebase Hardening and CI

  • Fixed the pytest configuration so tests can resolve src imports:
    • added [tool.pytest.ini_options] with pythonpath = ["."] to pyproject.toml
  • Expanded the skill taxonomy in src/taxonomy.py:
    • hard skills went from 20 to ~140 entries covering programming languages, data/ML, frameworks, databases, cloud, web/API, and DevOps
    • soft skills went from 10 to 30 entries
  • Extracted shared utility functions into src/utils.py:
    • dedupe_strings and match_keywords were duplicated across four service files and the JD parser
    • replaced all copies with imports from the shared module
  • Removed Streamlit coupling from the parser layer:
    • removed @st.cache_data decorators and import streamlit as st from src/parsers/resume.py and src/parsers/jd.py
    • parsers are now pure functions that work in any context (FastAPI, CLI, tests)
  • Updated the GitHub Actions CI workflow:
    • scoped triggers to the main branch
    • switched to the official astral-sh/setup-uv action
    • added a python -m compileall check step
    • enabled verbose test output
  • Removed obsolete requirements.txt and requirements-dev.txt export files since pyproject.toml and uv.lock are the dependency source of truth.
  • Normalized all DEVLOG verification commands from venv\Scripts\python.exe to uv run.
  • Verified the hardened codebase with:
    • uv run pytest
    • uv run python -m compileall app.py src tests
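
For reference, the two shared helpers named above could look roughly like this. Only the function names come from the log; the signatures and matching rules are assumptions, and the actual src/utils.py may differ.

```python
import re
from typing import Iterable, List


def dedupe_strings(values: Iterable[str]) -> List[str]:
    """Drop blanks and case-insensitive duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        cleaned = value.strip()
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result


def match_keywords(text: str, keywords: Iterable[str]) -> List[str]:
    """Return the keywords that appear in text as whole tokens, ignoring case."""
    matches = []
    for keyword in dedupe_strings(keywords):
        # Lookarounds instead of \b so terms such as "C++" still match.
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            matches.append(keyword)
    return matches
```

Whole-token matching keeps "Java" from matching inside "JavaScript", which matters once the taxonomy grows to around 140 hard skills.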

Day 11: Scope Tightening Around Resume + JD Workflow

  • Removed LinkedIn import from the active product and codebase.
  • Simplified candidate-profile handling so the working profile comes directly from resume parsing.
  • Deleted LinkedIn parser modules and their test coverage.
  • Updated UI navigation and copy to reflect the narrower, lower-friction intake flow.
  • Added an ADR documenting why LinkedIn export ingestion was removed from the product scope.
  • Verified the removal pass with targeted search, compile checks, and focused tests.

Day 12: Review-Driven Iteration, Strategy, and Observability

  • Added a bounded review-revision loop in the orchestrator so rejected tailoring output is revised before finalizing the workflow result.
  • Added the StrategyAgent and integrated it into the supervised workflow.
  • Added structured JSON logging for workflow and OpenAI request lifecycle events.
  • Added session-level OpenAI usage tracking and budget guards.
  • Refactored Streamlit state access behind src/ui/state.py and moved UI workflow orchestration into src/ui/workflow.py.
  • Verified the architecture pass with:
    • uv run pytest
    • uv run python -m compileall app.py src tests

Day 13: Tailored Resume Artifact and Export Expansion

  • Added a dedicated ResumeGenerationAgent after review in the supervised pipeline.
  • Added src/resume_builder.py to build a direct-use tailored resume artifact from grounded workflow state.
  • Added resume themes:
    • classic_ats
    • modern_professional
  • Extended export support so both report and tailored resume can be exported as Markdown and PDF.
  • Added a combined ZIP export bundle for the resume and report together.
  • Added resume diff support in src/resume_diff.py and exposed original-vs-tailored comparison in the UI.
  • Verified the new artifact and export flow with:
    • uv run pytest
    • uv run python -m compileall app.py src tests

Day 14: Grounded Assistant, Model Routing, and Responses API Migration

  • Added a shared two-mode assistant panel with:
    • Using the App
    • About My Resume
  • Implemented the assistant as one service with explicit grounded modes instead of creating more orchestrator agents.
  • Added per-task model routing so high-trust tasks can use stronger models while lower-risk tasks stay on cheaper tiers.
  • Migrated the OpenAI wrapper from Chat Completions to the Responses API.
  • Extended usage tracking to retain per-model totals internally while keeping only session-capacity messaging in the UI.
  • Added the model sizing and routing reference in docs/model-latency-and-cost-estimates.md.
  • Added Google sign-in architecture planning in docs/google-signin-implementation-plan.md.
  • Added ADRs for:
    • the two-mode assistant decision
    • Google sign-in via Supabase as the persistent identity direction
  • Verified the current integrated state with:
    • uv run pytest
    • successful commit and push to origin/main

Day 15: Google Sign-In Foundation

  • Added src/auth_service.py as a Supabase-backed auth wrapper for:
    • Google OAuth start
    • auth-code exchange
    • session restore
    • sign-out
  • Added Supabase auth configuration in src/config.py and example environment variables for local setup.
  • Extended src/ui/state.py with authenticated user, token, and auth-error state helpers.
  • Bootstrapped auth callback handling and session restoration in the Streamlit app shell.
  • Added a sidebar account panel for sign-in and sign-out.
  • Gated the AI-assisted workflow behind authenticated account state while keeping deterministic resume and JD flows available without login.
  • Added focused auth tests and verified the integration with:
    • uv run pytest

Persistent per-user usage storage, saved artifact history, and quotas are intentionally left for later Supabase-backed phases.

Day 16: Persistent App User Record

  • Added src/user_store.py to sync a lightweight app_users record after Google sign-in and on session restore.
  • Added AppUserRecord to the shared schema layer for plan-tier and account-status state.
  • Extended config and environment examples for the app_users table name and default account metadata.
  • Updated the sidebar account panel to surface persisted plan and account status when the sync succeeds.
  • Kept login resilient: auth still works even if the Supabase table or RLS policy is not ready yet.

Day 17: External Usage Persistence on Supabase Postgres

  • Added src/usage_store.py to persist assisted usage events in Supabase Postgres for authenticated users.
  • Extended src/openai_service.py with an optional usage-event callback so persistence stays transport-agnostic.
  • Wired authenticated usage-event recording from src/ui/workflow.py without leaking Streamlit concerns into the service layer.
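
The optional usage-event callback can be illustrated with a stand-in client. Everything here is hypothetical (class names, fields, the placeholder token count); the point is only that the wrapper emits an event to whatever callable it was given and knows nothing about Supabase or the UI.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class UsageEvent:
    task: str
    model: str
    total_tokens: int


class AssistedClient:
    """Stand-in for the OpenAI wrapper in this sketch."""

    def __init__(self, on_usage: Optional[Callable[[UsageEvent], None]] = None) -> None:
        self._on_usage = on_usage

    def run(self, task: str, model: str) -> str:
        reply, tokens = f"reply for {task}", 42  # placeholder for the real model call
        self._report(UsageEvent(task, model, tokens))
        return reply

    def _report(self, event: UsageEvent) -> None:
        if self._on_usage is None:
            return
        try:
            self._on_usage(event)
        except Exception:
            # A real implementation would log here; the reply still reaches the user.
            pass
```

The caller decides where events go, for example `AssistedClient(on_usage=store.record)` with a hypothetical store object, or no callback at all for unauthenticated use.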

Day 18: Deterministic JD Parsing Re-baseline

  • Reverted the experimental resume/JD parser-verifier agent layer and returned intake parsing to the deterministic path.
  • Kept the deterministic JD parsing improvements that were useful on their own:
    • broader extraction for Required Experience: style phrasing
    • filtering Location: lines out of preferred / nice-to-have buckets
    • real fixture coverage for JD PDF and DOCX samples already stored under static/demo_job_description/
  • Verified the rollback plus retained parser improvements with:
    • uv run pytest tests/test_profile_service.py tests/test_job_service.py tests/test_jd_parser.py tests/test_resume_parser.py tests/test_orchestrator.py

Day 18 (parallel track): Single-Pass Review-Correction Workflow

  • Removed the live ProfileAgent and JobAgent stages from the supervised workflow because they were mostly restating deterministic inputs without adding enough value for the latency cost.
  • Simplified the active orchestrator path to:
    • fit
    • tailoring
    • strategy
    • review
    • resume generation
  • Removed the bounded rerun loop that previously sent tailoring and strategy back through another full pass after review.
  • Changed Review so it can return direct corrections for tailoring and strategy, and the orchestrator now feeds those corrected outputs straight into final resume generation.
  • Removed interview-theme style outputs that were adding contract size without being core to the current product output.
  • Updated the UI, payload layer, and report rendering so they reflect the smaller workflow and the direct-correction review model.
  • Verified the redesign with focused workflow, prompt, builder, and UI test coverage.

Day 19: Model Routing And Output Budget Tuning

  • Rebalanced reasoning effort by task based on real runtime logs instead of keeping one default posture for every agent.
  • Changed the active routing defaults to:
    • fit: gpt-5-mini-2025-08-07 with low reasoning
    • tailoring: gpt-5-mini-2025-08-07 with medium reasoning
    • strategy: gpt-5-mini-2025-08-07 with low reasoning
    • review: gpt-5.4 with medium reasoning
    • resume_generation: gpt-5.4 with medium reasoning
  • Increased the Review output budget to start at 4000 tokens so the stage does not immediately fall into retry-on-truncation for corrected JSON payloads.
  • Reduced oversized output caps where observed usage made the previous limits unnecessary:
    • fit: 1600
    • strategy: 1500
    • resume_generation: 3000
  • Kept tailoring at 3200 and review at 4000 because they still carry the heaviest grounded payloads in the current flow.
  • Verified the new routing and cap defaults with targeted orchestration and OpenAI-service tests.
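
The routing and cap defaults above fit naturally into one lookup table. The model names, reasoning levels, and token caps below are copied from this entry; the `TaskRoute` dataclass and `route_for` helper are assumptions about how such a table might be expressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRoute:
    model: str
    reasoning_effort: str
    max_output_tokens: int


# Values taken from the Day 19 entry; the table shape itself is hypothetical.
TASK_ROUTES = {
    "fit": TaskRoute("gpt-5-mini-2025-08-07", "low", 1600),
    "tailoring": TaskRoute("gpt-5-mini-2025-08-07", "medium", 3200),
    "strategy": TaskRoute("gpt-5-mini-2025-08-07", "low", 1500),
    "review": TaskRoute("gpt-5.4", "medium", 4000),
    "resume_generation": TaskRoute("gpt-5.4", "medium", 3000),
}


def route_for(task: str) -> TaskRoute:
    """Look up the route for a task, failing loudly on unknown task names."""
    try:
        return TASK_ROUTES[task]
    except KeyError:
        raise ValueError(f"no route configured for task {task!r}") from None
```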

Day 20: Review Approval Semantics And Backward Compatibility

  • Clarified Review semantics so approved now means the final corrected output is safe to use, not that the incoming tailoring or strategy draft was perfect before correction.
  • Added unresolved_issues to the review contract so the app can distinguish between:
    • issues found in the incoming draft
    • blockers that still remain after correction
  • Updated UI and report labels to show Approved After Corrections when Review repaired the output successfully.
  • Added backward-compatible access patterns so older saved or in-memory ReviewAgentOutput objects without unresolved_issues do not crash the app.
  • Logged PDF-output quality as a follow-up documentation item because export aesthetics still need a dedicated pass even though workflow runtime is now much healthier.
  • Added the usage_events SQL schema and RLS policies in docs/supabase-usage-events.sql.
  • Kept assisted requests resilient: usage persistence failures are logged but do not break the user-facing AI response.
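
One way to get the backward-compatible access described for unresolved_issues is a small accessor that tolerates older objects. The helper name is hypothetical, and `SimpleNamespace` stands in for ReviewAgentOutput in this sketch.

```python
from types import SimpleNamespace
from typing import Any, List


def unresolved_issues_of(review: Any) -> List[str]:
    """Read unresolved_issues, treating a missing or None field as an empty list."""
    return list(getattr(review, "unresolved_issues", None) or [])


# Older object saved before the field existed, and a newer one that carries it.
old_review = SimpleNamespace(approved=True)
new_review = SimpleNamespace(approved=True, unresolved_issues=["missing metric"])

print(unresolved_issues_of(old_review))
print(unresolved_issues_of(new_review))
```

Routing every read through one accessor means UI and report code never touch the attribute directly, so older in-memory objects cannot raise AttributeError.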

Day 18 (parallel track 2): Daily Quotas From Persisted Usage

  • Added src/quota_service.py to compute per-user daily assisted limits from persisted usage_events.
  • Extended src/usage_store.py with daily usage aggregation for the current UTC day.
  • Wired quota checks into src/openai_service.py as a preflight hook so assisted requests stop cleanly when the daily cap is exhausted.
  • Updated the JD workflow UI to show daily remaining assisted capacity alongside the existing session-level view.
  • Added plan-tier daily quota configuration through environment variables for free and paid tiers.
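
The daily-quota calculation can be sketched as a count of events on the current UTC day. The limits below are placeholder numbers (the log only says they come from environment variables), the function name is hypothetical, and timestamps are assumed to be timezone-aware.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional

# Placeholder limits for illustration only.
DAILY_LIMITS = {"free": 5, "paid": 50}


def remaining_today(
    plan: str,
    event_times: Iterable[datetime],
    now: Optional[datetime] = None,
) -> int:
    """Subtract events on the current UTC day from the plan's daily limit."""
    now = now or datetime.now(timezone.utc)
    today = now.astimezone(timezone.utc).date()
    used = sum(1 for ts in event_times if ts.astimezone(timezone.utc).date() == today)
    return max(DAILY_LIMITS[plan] - used, 0)
```

Converting every timestamp to UTC before comparing dates keeps the day boundary consistent regardless of where the event was recorded.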

Day 19 (parallel track 2): Workflow History and Artifact Metadata

  • Added src/history_store.py to persist authenticated workflow runs and artifact metadata in Supabase Postgres.
  • Wired supervised workflow completion to create workflow_runs records.
  • Wired export preparation to create artifacts records for generated PDFs and ZIP bundles.
  • Added recent workflow and artifact history to the sidebar account panel.
  • Added Supabase schema and RLS setup in docs/supabase-workflow-history.sql.

Day 20 (parallel track 2): History Page and Supabase Bootstrap

  • Added a dedicated History page in the Streamlit navigation.
  • Centralized authenticated history refresh so sign-in and session restore load the same recent workflow and artifact state.
  • Added docs/supabase-bootstrap.sql as a one-shot setup path for app_users, usage_events, workflow_runs, and artifacts.
  • Updated README setup guidance to reflect the working Supabase-backed auth, quota, and history path.

Day 21: Saved-Run Regeneration and History-State Separation

  • Extended workflow_runs to persist saved reconstruction payloads:
    • workflow_signature
    • workflow_snapshot_json
    • report_payload_json
    • tailored_resume_payload_json
  • Added historical regeneration helpers so saved reports, tailored resumes, PDFs, and ZIP bundles can be rebuilt from persisted payloads without re-running OpenAI.
  • Separated history selection state from the active current workflow run so new exports do not attach to an older historical run by mistake.
  • Added additive Supabase migration support in docs/supabase-workflow-history-payloads-migration.sql.
  • Verified the history-regeneration path with focused tests and a passing full suite.

Day 22: Documentation Re-Baselining

  • Rewrote the architecture, strategy, roadmap, and README narrative so the published repo matches the implemented product.
  • Documented the current operating model more clearly:
    • Streamlit-first UI shell
    • supervised specialist-agent workflow
    • Supabase-backed auth, quotas, and history
    • saved-payload historical regeneration instead of blob storage
  • Cleaned up stale product-copy references that still described persistence and history as future work.

Day 23: Payload Versioning and History UX Tightening

  • Wrapped new saved workflow payloads in a versioned JSON envelope while keeping the historical reader backward-compatible with the earlier unversioned payload format.
  • Added compatibility inspection so unsupported or malformed saved payloads fail visibly in the History page instead of silently producing incorrect downloads.
  • Clarified quota UX by separating account-level daily quota messaging from browser-session safeguards.
  • Made the History page more explicit that browsing old runs is read-only and does not retarget new exports away from the current active workflow run.
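
A versioned envelope with a backward-compatible reader might look like the sketch below. The key names (`envelope_version`, `payload`), the version number, and the error class are assumptions for illustration; the log does not specify the envelope layout.

```python
import json


class PayloadCompatibilityError(ValueError):
    """Hypothetical error surfaced to the UI for unreadable saved payloads."""


CURRENT_VERSION = 1
SUPPORTED_VERSIONS = (1,)


def wrap_payload(payload: dict) -> str:
    """Serialize a payload inside a versioned envelope."""
    return json.dumps({"envelope_version": CURRENT_VERSION, "payload": payload})


def read_payload(raw: str) -> dict:
    """Read a versioned envelope, or a legacy unversioned payload object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise PayloadCompatibilityError("saved payload is not valid JSON") from exc
    if not isinstance(data, dict):
        raise PayloadCompatibilityError("saved payload is not a JSON object")
    if "envelope_version" not in data:
        return data  # legacy format: the object itself is the payload
    if data["envelope_version"] not in SUPPORTED_VERSIONS:
        raise PayloadCompatibilityError(f"unsupported version {data['envelope_version']!r}")
    if not isinstance(data.get("payload"), dict):
        raise PayloadCompatibilityError("envelope has no payload object")
    return data["payload"]
```

Raising a dedicated error instead of returning a best-effort result is what lets the caller fail visibly rather than produce an incorrect download.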

Day 24: Pre-Deployment Hardening and Hosting Decision

  • Split the remaining large UI workflow and page boundaries behind stable facades while preserving the public entrypoints.
  • Extracted duplicated builder helpers into shared utilities and centralized UI-side AuthService access.
  • Cleaned the remaining pre-launch hygiene items:
    • removed unused helper code
    • consolidated duplicate string-list normalization logic
    • replaced datetime.utcnow() with a timezone-aware UTC clock path
    • documented the ReportLab md5 compatibility patch
  • Expanded boundary coverage for fit, job, tailoring, strategy, and logging modules.
  • Added .streamlit/config.toml and reworked deployment docs so the app can be deployed before Supabase is provisioned.
  • Chose Streamlit Community Cloud as the first deployment target while keeping the existing Playwright/Chromium-first PDF path and retaining ReportLab as the runtime fallback.
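
For the datetime.utcnow() replacement noted above, the timezone-aware equivalent is short. The helper name `utc_now` is an assumption; the underlying call is standard library.

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)
```

Unlike datetime.utcnow(), which returns a naive value and is deprecated as of Python 3.12, the result carries tzinfo, so it compares safely against other aware timestamps.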

Day 25: Supabase Auth Stabilization and Local Operator Setup

  • Added repo-root .env loading for local development while preserving hosted secret-manager compatibility through os.getenv(...).
  • Added docs/supabase-setup-checklist.md as the canonical fresh-project operator guide for Supabase setup.
  • Stabilized Supabase Google sign-in for the Streamlit rerun model by preserving PKCE verifier state across the OAuth redirect and callback exchange.
  • Fixed the sidebar navigation handoff so JD-page transitions no longer mutate current_menu after the radio widget is instantiated.
  • Removed stale fresh-install guidance that still pointed at the earlier workflow-history migration file.
  • Verified the auth and navigation changes with focused tests and a passing full suite.

Day 26: OpenAI Runtime Hardening and Reasoning Routing

  • Traced the assisted-workflow fallback to a GPT-5 compatibility issue in the Responses API path: the routed models were rejecting the temperature parameter.
  • Updated src/openai_service.py to retry without temperature when the routed model rejects that parameter.
  • Added a retry path for incomplete Responses API outputs caused by exhausted max_output_tokens.
  • Increased the OpenAI client timeout and enabled SDK retries to reduce transient "read operation timed out" failures in the assistant path.
  • Added per-task GPT-5 reasoning routing:
    • medium effort for normal workflow tasks
    • high effort for review, resume generation, and grounded application-QA tasks
  • Extended .env.example and config helpers so reasoning effort can be tuned without editing code.
  • Verified the stabilized OpenAI path with targeted service tests, live local probes, and a passing full suite.
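
The retry-without-temperature behavior can be sketched independently of the SDK. Here `send` is any callable that performs the request, and `UnsupportedParameterError` is a hypothetical exception standing in for however the real wrapper detects the rejection; neither is an OpenAI SDK name.

```python
from typing import Any, Callable, Dict, Sequence


class UnsupportedParameterError(Exception):
    """Hypothetical error carrying the name of the rejected parameter."""

    def __init__(self, param: str) -> None:
        super().__init__(f"unsupported parameter: {param}")
        self.param = param


def call_with_param_fallback(
    send: Callable[[Dict[str, Any]], Any],
    request: Dict[str, Any],
    optional_params: Sequence[str] = ("temperature",),
) -> Any:
    """Send the request; if an optional parameter is rejected, retry once without it."""
    try:
        return send(request)
    except UnsupportedParameterError as exc:
        if exc.param not in optional_params or exc.param not in request:
            raise
        retry_request = {k: v for k, v in request.items() if k != exc.param}
        return send(retry_request)
```

Limiting the retry to a known list of optional parameters avoids silently dropping something the request actually depends on.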

Day 27: Saved Workspace Retention Hardening

  • Removed the legacy workflow_runs and artifacts persistence path so the product now stores only one latest saved_workspaces snapshot per user.
  • Simplified runtime config, state, exports, and tests around the latest-only saved workspace model.
  • Updated the Supabase bootstrap so expired saved workspaces become unreadable exactly at expires_at through RLS.
  • Added a Supabase scheduled cleanup job that deletes expired saved-workspace rows every 5 minutes, even if the user never returns.
  • Kept the app-side save/load purge as a backup cleanup path in case the scheduled job is temporarily unavailable.
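
The app-side backup purge reduces to a filter on expires_at. This sketch works on plain dicts with timezone-aware timestamps; the function name and row shape are assumptions, and the real path deletes rows in Supabase rather than filtering a list.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def readable_workspaces(rows: Iterable[dict], now: Optional[datetime] = None) -> List[dict]:
    """Keep only rows whose expires_at is still strictly in the future."""
    now = now or datetime.now(timezone.utc)
    return [row for row in rows if row["expires_at"] > now]
```

The strict comparison mirrors the rule above that a snapshot becomes unreadable exactly at expires_at.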

Day 28: Saved Workspace UX Simplification And Doc Re-Baseline

  • Removed the dead dedicated saved-workspace page and its unused test path because the product now restores the latest saved snapshot only through the sidebar Reload Workspace action.
  • Removed history-only helper paths that were left behind from the earlier workflow_runs era.
  • Updated the assistant and retrieved product knowledge so they describe the current reload flow accurately and no longer mention the removed page or the old live ProfileAgent / JobAgent path.
  • Re-baselined the active docs around the real shipped state:
    • login-first resume intake
    • no separate history tab
    • one latest reloadable saved workspace per user
    • current Render + Docker + Supabase deployment path
  • Removed broken README/checklist references to docs that are no longer in the repo.

Day 29: Render Auth Stabilization And Saved Usage Refresh

  • Stabilized Google sign-in on Render across the Supabase callback flow.
  • Fixed PKCE callback persistence issues caused by the hosted Streamlit redirect/runtime model.
  • Fixed the sign-in button navigation regression after the callback hardening changes.
  • Added a server-side fallback path so the auth code exchange no longer depends only on a returned custom query parameter.
  • Fixed same-session quota refresh behavior after successful assisted workflow runs so the sidebar reflects updated account state without requiring a fresh login.

Day 30: Login-Required AI Features And Quota Simplification

  • Made the in-app assistant login-required in the active product.
  • Simplified quota UX so the product now presents account-level daily quota as the main user-facing assisted limit.
  • Removed browser-session assisted budget from the live UI and current product copy.
  • Re-aligned assistant fallback behavior, product knowledge, prompt guidance, and README language around the authenticated quota model.

Day 31: Public Repo Cleanup And README Rebuild

  • Rebuilt the public README around the live product:
    • badges
    • Render app link
    • screenshot story
    • sample rendered PDF links
  • Added the screenshot and rendered-PDF assets to tracked repo content for GitHub presentation.
  • Moved tracked demo resume PDFs under static/demo_resume/ so demo assets live with the app inputs instead of under old PDF-template docs.
  • Re-baselined the roadmap around:
    • finishing the job-application product
    • hardening the current Render-hosted Streamlit stack
    • later FastAPI + Docker backend extraction

Day 32: FastAPI Job Backend Foundation

  • Added an in-repo FastAPI backend skeleton under backend/ with:
    • /api/health
    • /api/jobs/search
    • /api/jobs/resolve
  • Added shared job-search schemas and provider boundaries for backend-owned job discovery.
  • Wired the first real provider path through Greenhouse board/job resolution.
  • Added deterministic Greenhouse normalization and imported-job review rendering in the JD flow.

Day 33: Multi-Provider Search And JD Review Expansion

  • Added Lever as provider #2 behind the same adapter contract as Greenhouse.
  • Turned Job Search into a real backend-powered search surface instead of a placeholder.
  • Added recent-first search ordering and stronger role-family matching for technical roles.
  • Extended the JD review panel so manual and imported JD flows both render readable summaries before analysis.
  • Persisted imported job metadata inside the saved-workspace snapshot so Reload Workspace restores the full imported-job context.

Day 34: Saved Jobs Shortlist And Search UX Polish

  • Added a Supabase-backed saved_jobs persistence layer for shortlisted jobs.
  • Added save/remove actions directly on job-search result cards for authenticated users.
  • Added a Saved Jobs panel on the Job Search page so shortlisted roles can be revisited and loaded back into the JD workflow later.
  • Added deterministic in-card job preview rendering so users can inspect skills, compensation, location, and structured summary before import.
  • Polished result-card clarity around remote/location signals and saved-state visibility.

Day 35: Assistant Session Memory And Latency Reduction

  • Kept one visible in-app assistant chat while splitting the internal task routing between:
    • lighter product-help questions
    • stronger grounded application-QA questions
  • Reduced assistant prompt weight by replacing oversized workflow payloads with a compact package-context summary for application questions.
  • Added short-lived assistant session memory on top of the OpenAI Responses API:
    • prewarm assistant context when the panel opens
    • store the latest response_id
    • reuse that conversation state for follow-up questions during the same session
  • Added assistant session signatures so the app clears stale assistant memory automatically when the relevant workflow context changes.
  • Kept the behavior defensive:
    • single chat UX remains unchanged
    • deterministic fallback still works when assisted execution is unavailable
    • clearing chat also clears the short-lived assistant session memory
  • Verified the assistant pass with focused assistant-service and assistant-panel tests.
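
The signature-guarded session memory can be sketched as a hash of the relevant context plus the latest stored response_id. The class and function names are hypothetical. The Responses API does accept a previous_response_id parameter for continuing a conversation, but whether the project passes the stored id that way is an assumption here.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def context_signature(context: dict) -> str:
    """Stable hash of the workflow context, independent of key order."""
    canonical = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class AssistantMemory:
    signature: Optional[str] = None
    response_id: Optional[str] = None

    def previous_response_id(self, context: dict) -> Optional[str]:
        """Return the stored id, clearing it first if the context changed."""
        current = context_signature(context)
        if current != self.signature:
            self.signature = current
            self.response_id = None
        return self.response_id

    def remember(self, response_id: str) -> None:
        self.response_id = response_id

    def clear(self) -> None:
        self.signature = None
        self.response_id = None
```

Follow-up questions reuse the stored id only while the signature matches, so switching to a different job or resume starts a fresh conversation automatically.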

Day 36: Next.js + FastAPI Re-Baseline

  • Completed the architecture migration from the old Streamlit runtime to the live Next.js + FastAPI split stack.
  • Removed the retired Streamlit shell, deployment files, and Streamlit-only tests from the active repo.
  • Moved the product onto the Vercel frontend plus VPS backend deployment shape.
  • Reworked the workspace UI around the real product flow:
    • upload profile
    • search job
    • review job description
    • run the workflow
  • Simplified the visible outputs so the workspace now centers on:
    • tailored resume
    • cover letter
  • Removed the visible application report from the workspace.
  • Removed the strategy stage from the active agentic workflow.
  • Simplified resume export to one standard ATS-friendly format and removed the old modern resume theme path from the active backend and frontend.
  • Re-baselined the README and architecture docs so they reflect the shipped Vercel + FastAPI product rather than the earlier Streamlit stages.

Day 37: Workspace "Workbench" Redesign

  • Rebuilt the workspace UI as Direction B "Workbench":
    • top bar with brand + ⌘K command-palette trigger + account popover
    • four-step rail (Resume → Job Search → Job Detail → Analysis) with explicit gating and done/active visual states
    • hero band with dynamic per-tab title, sub, and status pill
    • vertical canvas of regions instead of the previous left-sidebar split
    • floating assistant FAB replacing the side-mounted assistant column
  • Added a ⌘K command palette overlay for fast navigation between workspace surfaces.
  • Pulled all three resume-builder review steps into auto-growing textareas inside collapsible sections so editing long responses no longer scrolls inside a tiny box.
  • Reset the resume-builder intake mode on commit and added name-pending fallbacks so the first auto-summary doesn't crash when the LLM hasn't extracted a full name yet.
  • Lifted workspace base font sizes and chip / button metrics for readability on standard 1080p displays.
  • Capped canvas width and restructured the analysis pipeline grid so the workspace stays comfortable on wide screens; collapsed the resume intake card on parse so the next step gets the focus.
  • Tightened the assistant FAB surface (deeper black-blue background + full-width replies) so the chat panel stops fighting the underlying canvas.
  • Polished the "honest hero" copy, vertical skills/experience layout, friendlier pipeline labels, empty-state hints, and a next-step pulse.
  • Fixed two regressions caught during the redesign:
    • workflow-completion notice no longer leaves the analysis card stuck on "Running"
    • landing "Sign out" button no longer flips to "Signing out…" when the user clicks "Enter workspace"

Day 38: Atmospheric Polish, Mobile Responsive Pass, And Parser Quality Lift

  • De-boxified the editorial document treatment for the Draft profile + JD body: tighter type pairing, removed the per-section card chrome, treated the canvas like one document.
  • Atmospheric polish across the workspace — page grain texture, layered surfaces, richer hover and focus states.
  • Motion + delight pass: per-region entrance animations, subtle micro-interactions, count-up animations on quota and saved-job stats.
  • Replaced the four-button rail with a unified pill nav that ships a progress connector and per-step lock-reason tooltips.
  • Single 540px-breakpoint mobile responsive pass covering the topbar, hero, rail, regions, account popover, intake mode toggle, pipeline cards, and chip wrap. Brand text reflows correctly on narrow screens.
  • Resume rendering robustness:
    • drop empty resume sections cleanly (Experience can drop for student / early-career profiles, only Certifications drops in the standard case)
    • added Projects + Publications as first-class resume sections; un-clipped page-2 overflow on the PDF render
    • matched the resume PDF + parsed-view typography to the cover-letter family
    • added a switchable professional_neutral theme alongside classic_ats (Georgia body, neutral grays — pure black/white aesthetic for editorial-leaning profiles)
  • Resume parser quality lift:
    • hardened the deterministic resume parser; routed TXT through the LLM hybrid
    • expanded the parser-quality test set from 6 → 15 fixtures across unseen formats
    • simplified the hybrid to a pure LLM source-of-truth with a full deterministic fallback when the LLM is unavailable or schema-fails
    • deterministic polish lifted the average score from 0.81 to 0.92 across the 15 fixtures
  • Added a Tier-2 renderer-fidelity quality runner; fixed a double-escape on experience meta lines along the way.
  • JD parser quality lift:
    • Tier-1 baseline: 15 fixtures, deterministic 0.78 average
    • LLM JD parser: 0.78 → 0.99 across the same fixture set
  • Added skill canonicalization so Postgres / PostgreSQL synonyms collapse during fit matching — stops the false-negative skill gaps users were seeing.
  • Workflow narrowing:
    • removed FitAgent, the application-package report, and the bundle endpoint from the active workflow
    • TailoringAgent battle-test: 0.99 average across 6 (resume, JD) pairs
    • ReviewAgent battle-test: 1.00 LLM, 0.69 deterministic across 6 scenarios
  • Per-profile resume section ordering: students lead with Education, academics with Publications, seniors with Experience after Skills. Drives both the HTML and the PDF templates.
  • Fixed a resume-builder review-progress bug where a re-uploaded basics block over-captured roles + dropped review progress.
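
The skill canonicalization mentioned above (Postgres / PostgreSQL collapsing during fit matching) can be sketched as follows. This is a minimal illustration, not the project's actual code: the alias table and the function names `canonicalize_skill` and `missing_skills` are hypothetical.

```python
from typing import List

# Hypothetical alias table; the project's real synonym list is not shown in this log.
_SKILL_ALIASES = {
    "postgres": "postgresql",
    "postgresql": "postgresql",
    "js": "javascript",
    "javascript": "javascript",
}


def canonicalize_skill(raw: str) -> str:
    """Lowercase, collapse whitespace, then map known synonyms to one key."""
    key = " ".join(raw.strip().lower().split())
    return _SKILL_ALIASES.get(key, key)


def missing_skills(resume_skills: List[str], jd_skills: List[str]) -> List[str]:
    """Return JD skills with no canonical match on the resume, in JD order."""
    have = {canonicalize_skill(skill) for skill in resume_skills}
    seen = set()
    missing = []
    for skill in jd_skills:
        canon = canonicalize_skill(skill)
        if canon not in have and canon not in seen:
            seen.add(canon)
            missing.append(skill)
    return missing
```

Comparing canonical keys instead of raw strings is what removes the false-negative gap: a resume listing "Postgres" satisfies a JD asking for "PostgreSQL".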

Day 39: DOCX Export, Conversational Resume Builder, And Cached-Jobs Foundation

  • Workspace auth gate: signed-out visitors hitting /workspace are now redirected to the landing page; cross-origin host strip mirrors the existing app-subdomain middleware (no new env var).
  • Resume-builder durability:
    • 7-day TTL on resume_builder_sessions with active-user refresh, mirroring the saved_workspaces TTL pattern; cron + RLS expires-at filter both wired
    • tri-state persistence indicator (saved / skipped / unauthenticated) in the field-completeness rail so the user knows whether their progress will survive a reload
  • DOCX-first artifact export pipeline (six phases):
    • Phase 1: python-docx-based exporter for the classic_ats theme; mirrors the existing structured PDF render (header, summary, skills, experience, projects, education, publications, certifications) and honours artifact.section_order
    • Phase 2: artifact-export route now dispatches on pdf | docx; the markdown branch is removed from backend/services/artifact_export_service.py
    • Phase 3: frontend cleanup sweep — removed every Markdown export button, hook, and type; download buttons now offer PDF and DOCX side-by-side
    • Phase 4: professional_neutral DOCX theme with a shared palette resolver across PDF and DOCX so both themes read consistently in Word, Google Docs, and the PDF preview
    • Phase 5: POST /workspace/resume-builder/export synthesizes a TailoredResumeArtifact from the builder session's draft profile (no JD, empty target_role, section_order from compute_section_order(candidate_profile)); auth-gated like the other resume-builder routes
    • Phase 6: download row UI under "Generate base resume" — theme picker + Download PDF / Download DOCX
  • Conversational LLM resume builder:
    • shipped the 14-item punch list (DB migrations, lazy-load, thread-bound state, all three battle tests, adversarial coverage, signature hash, dead-code cleanup)
    • end-to-end LLM chat: 5/8 fields extracted in one turn, backtracking works, 100% completion on the smoke fixture; "Generate base resume" produces clean DOCX/PDF
    • workspace chat-bubble experiment shipped + reverted; transcript style retained as the chosen direction
  • Resume-builder content quality:
    • LLM-first structuring pass with a deterministic regex fallback, plus header alignment so the rendered name matches the structured schema
    • skills are bucketed into named categories (Languages & Tools, ML/DL Frameworks, etc.) so the rendered resume groups skills by family instead of a flat pipe-separated list
    • structuring output cached across exports + persistence so re-rendering doesn't re-run the LLM
    • recovers a full name when the LLM intake drops a surname mid-conversation
    • thin one-liner summaries get expanded to full paragraphs by the structuring pass; bumped the structuring model + token budget for the expanded contract
    • Projects + Publications sections rendered through the same Draft profile / DOCX / PDF path as Experience
    • Tier-3 quality runner for the resume-builder structuring pass
  • ResumeGenerationAgent battle-test: LLM 1.00, deterministic 0.94 across 6 (resume, JD) pairs.
  • CoverLetterAgent battle-test: LLM 0.97, deterministic 0.95 across 6 pairs.
  • Cached-jobs foundation (Phases 2 + 3 of the seven-phase plan):
    • Phase 2: cached_jobs Supabase table + refresh_cached_jobs worker; POST /admin/refresh-cache endpoint protected by a constant-time bearer compare. Worker bulk-upserts every Greenhouse + Lever posting and runs the smart cleanup (tombstone if a user has saved the listing, hard-delete otherwise) per source — only sources whose refresh actually succeeded are eligible for cleanup.
    • Phase 3: /jobs/search defaults to the cached path through JobSearchService.search_cached(...); ?live=true keeps a live-fan-out escape hatch for diagnostics. Surfaces cache: ok | not_configured | error in source_status so monitoring can see when the cache misses.
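
For the constant-time bearer compare on POST /admin/refresh-cache, one common approach in Python is `hmac.compare_digest` from the standard library. The sketch below assumes the token arrives in an `Authorization: Bearer ...` header; the function name `is_authorized` is hypothetical and the real route code is not shown in this log.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only when the header carries 'Bearer <expected_token>'."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking, through timing, where the inputs first differ.
    return hmac.compare_digest(
        presented.strip().encode("utf-8"),
        expected_token.encode("utf-8"),
    )
```

A plain `==` comparison can return early on the first mismatching byte, which is the timing side channel this check is meant to close.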

Day 40: Multi-ATS Coverage, Postgres-RPC Ranked Search, And Dropdown Filters

  • Phase 4: bumped the source pool to ~117 Greenhouse boards + 30 Lever sites and validated the slug list against the live APIs. The first refresh after deploy populates the cache, so users no longer each pay the live fan-out cost.
  • Phase 5: enabled pg_net in the Supabase project + documented the cron schedule that POSTs to /admin/refresh-cache every ~30 min (committed under docs/job_cache_cron_setup.sql). Frontend gets an "Expired" badge on saved-job cards whose listings the cleanup pass has tombstoned.
  • Phase 5b: relevance-ranked cache search via a new Supabase RPC (search_cached_jobs_ranked):
    • PostgREST's text_search() chain returns a terminating builder that doesn't compose with .order(), so a single round-trip ranked search needs a function. The RPC owns the FTS + filter + sort logic and CachedJobsStore.search() calls it with a stable kwarg dict.
    • Warm cache: ~360 ms; cold: ~5.5 s; vs ~25 s for the live fan-out — the cache layer paid for itself on the first user query.
    • Post-flight fixes for cleanup eligibility and the report shape.
  • Phase 6: re-validated and expanded the source list — final Greenhouse pool of 79 verified boards + Ashby adapter (36 boards). Composite job IDs (source:tenant:job_id) avoid cross-tenant collisions when one company runs multiple Ashby boards.
  • Phase 7: Workday adapter for 11 Fortune 500 tenants (NVIDIA, Adobe, Walmart, Disney, HP, HPE, Boeing, Citi, Micron, BlackRock, Workday itself). Per-board page delay + reduced concurrency to stay under the anti-bot threshold; production cadence (one refresh per ~30 min) sits well below the rate limit. Fixed a status-reporting bug along the way: an all-failed provider used to land in the report as status: ok because the only path that set status away from ok assumed boards_succeeded > 0.
  • Phase 8: dropdown filters + sort for job search:
    • schema: work_mode and employment_type_norm GENERATED STORED columns on cached_jobs (with partial indexes on removed_at IS NULL); intern detection uses Postgres word-boundary regex (\mintern(s|ship|ships)?\M) so "Internal" / "International" don't false-match
    • RPC v2: extends search_cached_jobs_ranked with p_work_modes, p_employment_types, p_sort_by; ORDER BY branches on the sort key (relevance → ts_rank when there's a query else recency, newest → posted_at DESC, oldest → posted_at ASC, company_az → LOWER(company))
    • Python plumbing: JobSearchQuery + JobSearchRequestModel + CachedJobsStore.search() extended with the new args; Pydantic validators normalize input + coerce unknown sort values to relevance
    • Frontend: replaced the lone "Remote only" checkbox with five dropdowns — Source / Work mode / Type (multi-select), Posted within (single-select, retained), Sort (single-select, new). Multi-select chips built on native <details>/<summary> for keyboard accessibility plus an extra mousedown outside-click + Escape dismiss handler so the popover behaves like a native menu.
    • Verified end-to-end against the live cache: filtering by Source = greenhouse + lever, Work mode = remote, Sort = company A → Z returned 12 alphabetically-sorted Pinterest-then-Affirm matches, all remote-friendly.
  • Total active cache after Day 40: ~11,877 jobs across four ATS providers.
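
The word-boundary intern detection from Phase 8 can be illustrated with a Python analogue of the Postgres pattern. This is a sketch only: Python's `\b` stands in for the Postgres `\m` / `\M` anchors, case-insensitive matching is an assumption, and `looks_like_internship` is a hypothetical name (the real logic lives in a generated column, not in Python).

```python
import re

# Python analogue of the Postgres pattern \mintern(s|ship|ships)?\M.
# IGNORECASE is an assumption about how the column treats casing.
_INTERN_RE = re.compile(r"\bintern(s|ship|ships)?\b", re.IGNORECASE)


def looks_like_internship(title: str) -> bool:
    """True when the title contains intern / interns / internship(s) as a whole word."""
    return _INTERN_RE.search(title) is not None
```

Because the pattern must end on a word boundary, "Internal Tools Engineer" and "International Sales Manager" do not match, while "Software Engineering Intern" and "Summer Internships" do.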

Day 41: Landing Polish, Independent Step Navigation, Assistant State-Awareness, And Multi-Layer LLM Retry

Landing redesign — final polish pass

  • Workbench scroll narrative iteration: shrunk the sticky visual stage from a stretched 480 × 853 to a square 480 × 480 (aspect-ratio 1/1) with center-pinning so empty space inside the stage stops at ~60–100 px instead of the previous 300+ px.
  • Each of the four mock cards now mirrors the real workspace page rather than a generic data card:
    • Step 01 Resume: parsed-profile hero (Aria Patel · Staff ML Engineer · San Francisco) + 3-up stats grid (12 roles · 27 skills · 9 yrs) + skills chip cluster + filename pill with a green PARSED tag.
    • Step 02 Job Search: search bar with location, four filter chips, "47 MATCHES · BY RELEVANCE" header, three result cards with a gold "★ TOP MATCH" badge on the leader.
    • Step 03 JD: three big metric tiles (Match score 87%, Hard skills 12, Years 5+) with a blue-tinted accent on the match-score card, plus hard/soft skills chip rows.
    • Step 04 Analysis: four agent pipeline cards (Matchmaker ✓, Forge ✓, Gatekeeper running 62% with progress bar, Cover letter agent ○ standby).
  • Step text is now justify-content: center inside each 48vh block so step 01 reads at viewport center on first scroll-in, aligning with the centered visual stage.
  • Bento carousel tiles + workbench mock card surface dropped the previous blue corner-glow radial in favor of a flat rgba(0, 0, 0, 0.40) overlay that matches the workspace's .b-jd-block treatment — landing and workspace now read as one surface family.
  • Topbar consolidated to Workflow · Features · [Auth] — dropped the third GitHub link (already covered by the hero CTA + footer link).
  • Extracted the landing page into a design-system reference at frontend_redesign/redesign/landing/ (README + 5 specs covering chrome, hero, workbench, bento, final CTA) — peer to the existing workspace handoff/ so future passes have a same-shape context bundle.

Independent step navigation in the workspace

  • Removed the resume-parse gate on Job Search and Job Detail. A user can now paste a JD they're curious about before they have a resume, or browse listings without uploading anything. The "Upload a resume to unlock" tooltips on the rail are gone.
  • Only Analysis stays gated (it can't run without both inputs). The page-level "Upload a resume to proceed" affordance inside AnalysisRunner.tsx already enforces this honestly.
  • Cleaned up the now-dead "Upload a resume first" fallback sub text on the nav-jobs and nav-jd command-palette entries.

Assistant chat — ungated and state-aware

  • Removed the analysis-required gate that locked the assistant chat until a workspace had been analyzed. Users can now ask product-help questions ("how do I use this?", "what's step 03 for?") from the very first visit.
    • Three gates lifted in one pass: the panel's footer "Assistant unlocks after your first workspace run" lockup, the submitAssistantQuestion early-return + warning notice, and the assistantUnlocked prop on the command palette (now always true).
    • Renamed the cosmetic prop from requiresWorkspaceRun → hasWorkspaceContext so the panel adapts copy (header sub, empty state, textarea placeholder) based on whether a workspace exists, not whether the chat is locked.
  • Added a WorkspaceStateContext projection that rides on every assistant query — current_step, has_resume, small resume_summary, has_jd, small jd_summary, has_analysis, saved_jobs_count, last_search_query. Counts only, no raw resume text. Backend's WorkspaceStateContextModel validates it; service layer folds it into the app_context dict that reaches AssistantService.
  • Added a 9-rule _WORKSPACE_STATE_GUIDANCE block to both the JSON-contract (build_assistant_prompt) and the streaming prose (build_assistant_text_prompt) system prompts so the LLM knows the shape of the new field, the step-number mapping (01=Resume, 02=Job Search, 03=Job Detail, 04=Analysis), the auth contract (signed-out users get redirected to landing — there's no "use feature X without signing in" answer), and the field semantics (e.g. experience_entries_count is the count of jobs held, NOT years).
  • Battle-tested across three personas (cold start / mid-flow / ready-to-run) over three rounds:
    • Round 1: 22/24 passes; surfaced two bugs (entry-count read as years, step-03 mismatch).
    • Round 2: 13/15 passes after the first two fixes; surfaced a product-knowledge gap (the "assistant builder" mode wasn't in the retrieval index) and a "yes you can analyze signed-out" mistake.
    • Round 3: 12/12 passes after refreshing src/product_knowledge.py to ground truth (12 documents covering auth, the 4-step flow, resume intake modes, all four ATS sources, supervised pipeline agents, exports, saved workspace, command palette, the assistant FAB, cover letter, quotas).
    • Combined: 47/51 (92%) with 0 outstanding correctness failures.
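
To make the counts-only WorkspaceStateContext projection concrete, here is a minimal Python sketch of its shape. The top-level field names and `experience_entries_count` come from the entry above; the input `workspace` dict, the function name `project_workspace_state`, and the `skills_count` / `hard_skills_count` summary fields are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class WorkspaceStateContext:
    """Counts-only snapshot; field names follow the list above."""
    current_step: str
    has_resume: bool
    resume_summary: Optional[Dict[str, int]]
    has_jd: bool
    jd_summary: Optional[Dict[str, int]]
    has_analysis: bool
    saved_jobs_count: int
    last_search_query: Optional[str]


def project_workspace_state(workspace: Dict[str, Any]) -> Dict[str, Any]:
    """Project a (hypothetical) workspace dict into the payload sent with a query."""
    resume = workspace.get("resume")
    jd = workspace.get("jd")
    resume_summary = None
    if resume:
        resume_summary = {
            # Number of jobs held, not years of experience.
            "experience_entries_count": len(resume.get("experience", [])),
            "skills_count": len(resume.get("skills", [])),  # hypothetical field
        }
    jd_summary = None
    if jd:
        # hypothetical field
        jd_summary = {"hard_skills_count": len(jd.get("hard_skills", []))}
    context = WorkspaceStateContext(
        current_step=workspace.get("current_step", "01"),
        has_resume=bool(resume),
        resume_summary=resume_summary,
        has_jd=bool(jd),
        jd_summary=jd_summary,
        has_analysis=bool(workspace.get("analysis")),
        saved_jobs_count=len(workspace.get("saved_jobs", [])),
        last_search_query=workspace.get("last_search_query"),
    )
    return asdict(context)
```

Only lengths and booleans cross the boundary, so raw resume text never reaches the prompt even if the source object carries it.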

LLM resilience — three-layer retry stack + per-agent fallback isolation

The orchestrator's previous behavior was all-or-nothing: any single agent failure (after the SDK's built-in retries) cascaded to "downgrade the WHOLE pipeline to deterministic." A single bad packet during the Forge agent meant Gatekeeper, Builder, and Cover letter all ran deterministic too. Reworked the resilience layer:

  • Layer 1 (existing): OpenAI Python SDK retries up to 2 times on transient HTTP / 5xx / 429-with-Retry-After (we set max_retries=2 on the client).
  • Layer 2 (new): App-level retry on top of the SDK. After the SDK exhausts its 2, we try ONE more time on a tight allow-list — APIConnectionError, APITimeoutError, InternalServerError. NOT for 4xx / auth / persistent rate-limit / content-policy (deterministic problems). 400 ms delay between attempts. New openai_request_app_retry log event for production observability.
  • Layer 3 (new): Per-agent retry inside the orchestrator. If an agent's .run(...) raises AgentExecutionError (e.g. all OpenAI-call retries exhausted, or the response was semantically broken even after the existing budget retry), we wait 400 ms and retry that agent's full run once. Only fires in mode="openai"; no-op in deterministic.
  • Per-agent fallback isolation (new): When an agent's two LLM attempts both fail, the orchestrator runs that agent's deterministic fallback (via AgentClass(None).run(...)) for THAT agent only — downstream agents still try the LLM path. Forge failing no longer affects Gatekeeper.
    • Each call site now passes a deterministic_fallback_runner lambda alongside the assisted runner.
    • The whole-pipeline fallback is now a safety net that fires only if a per-agent deterministic fallback ITSELF errors out (very unusual — would mean our own deterministic code is broken).
    • Added a mode-reconciliation pass: if a pipeline started as mode="openai" but every agent ended up falling back per-agent (zero LLM successes), result.mode flips honestly to deterministic_fallback and the first LLM error's user_message becomes the fallback_reason.
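
The Layer 3 retry, per-agent fallback isolation, and mode reconciliation described above could look roughly like this. It is a simplified sketch under stated assumptions: `AgentExecutionError` is redefined locally as a stand-in, and `AgentOutcome`, `run_agent_with_isolation`, and `reconcile_mode` are hypothetical names rather than the orchestrator's real API.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


class AgentExecutionError(Exception):
    """Stand-in for the project's error type; its real definition is not shown here."""


@dataclass
class AgentOutcome:
    value: object
    used_llm: bool
    error: Optional[str] = None


def run_agent_with_isolation(
    assisted_runner: Callable[[], object],
    deterministic_fallback_runner: Callable[[], object],
    retry_delay_s: float = 0.4,
    sleep: Callable[[float], None] = time.sleep,
) -> AgentOutcome:
    """Try the LLM path twice, then fall back for this agent only."""
    last_error: Optional[Exception] = None
    for attempt in range(2):  # first run plus one per-agent retry
        try:
            return AgentOutcome(assisted_runner(), used_llm=True)
        except AgentExecutionError as exc:
            last_error = exc
            if attempt == 0:
                sleep(retry_delay_s)
    # If the fallback itself raises, the error propagates so a
    # whole-pipeline safety net can handle it.
    return AgentOutcome(
        deterministic_fallback_runner(), used_llm=False, error=str(last_error)
    )


def reconcile_mode(outcomes: List[AgentOutcome]) -> str:
    """Report the LLM mode only if at least one agent actually used the LLM."""
    if any(outcome.used_llm for outcome in outcomes):
        return "openai"
    return "deterministic_fallback"
```

Each agent is wrapped separately, so one agent ending up on its fallback does not change which runner the next agent tries first.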

Worst-case retry budget for a transient failure: SDK 2 + app 1 + per-agent 1 = up to 4 effective LLM attempts before an agent gives up. After that, that agent's deterministic fallback runs and the rest of the pipeline keeps using the LLM.

Coverage check: every responses.create call in the codebase routes through the new _create_response_with_app_retry helper now (run_json_prompt, run_text_stream, and the existing output-budget retry helper). By extension, the resume parser, JD parser, JD summary, all four workflow agents, AND the assistant chat all inherit the new retry layer for free.
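
As an illustration of what an app-level retry helper of this kind might do, here is a minimal sketch. It is not the body of `_create_response_with_app_retry`: the function name `call_with_app_retry` and the `on_retry` hook are hypothetical, and the exception classes are local stand-ins for the OpenAI SDK types named in Layer 2 so the example runs without the SDK installed.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


# Local stand-ins for the SDK exception types named in Layer 2.
class APIConnectionError(Exception):
    pass


class APITimeoutError(Exception):
    pass


class InternalServerError(Exception):
    pass


class AuthenticationError(Exception):
    pass


# Tight allow-list: only transient failures earn the extra attempt.
RETRYABLE = (APIConnectionError, APITimeoutError, InternalServerError)


def call_with_app_retry(
    create: Callable[[], T],
    delay_s: float = 0.4,
    sleep: Callable[[float], None] = time.sleep,
    on_retry: Optional[Callable[[Exception], None]] = None,
) -> T:
    """Call create(); on an allow-listed error, wait and try exactly once more."""
    try:
        return create()
    except RETRYABLE as exc:
        if on_retry is not None:
            on_retry(exc)  # e.g. emit a log event for observability
        sleep(delay_s)
        return create()  # a second failure propagates to the caller
```

Errors outside the allow-list, such as an authentication failure, are never caught here, so they surface immediately without a second attempt.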

Tests: 17 resilience tests pin the contracts (12 new, 5 pre-existing):

  • 9 in tests/test_openai_app_retry.py: retries on the 3 allow-listed types, does NOT retry on 4xx/auth, returns success after retry, raises on double-failure.
  • 8 in tests/test_orchestrator.py (5 existing + 3 new): per-agent retry recovers, per-agent fallback isolates a single failing agent, full-pipeline mode flips to deterministic when no agent succeeded with LLM.

ADRs added