Skip to content

Add docker flow#262

Draft
yangm2 wants to merge 4 commits intocodeforpdx:mainfrom
yangm2:add-docker-flow
Draft

Add docker flow#262
yangm2 wants to merge 4 commits intocodeforpdx:mainfrom
yangm2:add-docker-flow

Conversation

@yangm2
Copy link
Contributor

@yangm2 yangm2 commented Feb 11, 2026

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Infrastructure
  • Maintenance

Description

This PR adds docker and apple/container containers to the development flow. This also opens up the way to publish docker containers on ghcr.io and deploying those containers to staging and production.

Note: Building the backend-dev image took >15 minutes(!) on my Macbook Pro w/ M2 CPU. And the resulting image (TYPE=dev) is ~350MB. We may want to look into streamlining the dev dependencies.

Disclosure: The code was initially generated with Claude coding agent 🤖.

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

With the frontend container, I was able to run and pass the regression tests. Without the container, the frontend tests fail on my Macbook Pro (presumably due to a misconfigured npm development environment ... i.e. old version of npm?).

  • with apple/containers

    • frontend
      • run lint
      • run tests
    • backend
      • run make check
      • Known Issue: mounting the google default application credential json results in a directory instead of the json file. This needs to be resolved for local integration testing.
    • local integration testing
  • with docker/compose

    • frontend
      • run lint
      • run tests
    • backend
      • run make check
    • local integration testing

Added/updated tests?

  • Yes
  • No, and this is why: development environment changes
  • I need help with writing tests

Documentation

  • If this PR changes the system architecture, Architecture.md has been updated

[optional] Are there any post deployment tasks we need to perform?

github-actions bot and others added 4 commits February 7, 2026 21:12
Implements issue codeforpdx#253 with complete Docker support:
- Multi-stage Dockerfiles for backend and frontend (dev/deploy types)
- docker-compose.yml for local development with hot reload
- docker-compose.prod.yml for production deployment
- Documentation updates in README, Architecture, and CLAUDE files

Key features:
- Dev images include all tooling, tests, and hot reload
- Deploy images are minimal with production-only dependencies
- Support for Mac, Linux, and Windows development environments

Note: GitHub Actions workflow file (.github/workflows/docker-build.yml)
needs to be added separately by a maintainer with workflow permissions.

Co-authored-by: yangm2 <yangm2@users.noreply.github.com>
@yangm2 yangm2 self-assigned this Feb 11, 2026
@yangm2 yangm2 added enhancement New feature or request Not ready infrastructure Pull requests related to infrastructure and underlying workflows labels Feb 11, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Feb 15, 2026

Claude finished @yangm2's task in 2m 20s —— View job


Code Review Complete ✅

Tasks

  • Review backend Dockerfile
  • Review frontend Dockerfile
  • Review docker-compose configurations
  • Review documentation updates
  • Check security considerations
  • Review .dockerignore files
  • Assess performance concerns
  • Provide comprehensive feedback

Summary

This PR provides a solid foundation for Docker-based development and deployment. The implementation follows many best practices, but there are opportunities for improvement in security, performance, and production readiness.

🎉 Strengths

  1. Well-structured multi-stage builds - Separate dev/deploy stages with appropriate tooling
  2. Comprehensive documentation - Excellent additions to CLAUDE.md, README.md, and Architecture.md
  3. Proper .dockerignore files - Prevents unnecessary files in build context
  4. Security-conscious - Uses read-only mounts, non-root considerations, security headers
  5. Good separation of concerns - Dev and prod compose files appropriately configured

🔒 Security Concerns (Priority)

  1. Backend Dockerfile:18 - Piping curl to shell for uv installation is a security risk
  2. docker-compose.prod.yml:11 - FLASK_SECRET_KEY should use Docker secrets instead of env vars
  3. .env.docker.example:16 - Unsafe default could accidentally reach production
  4. frontend/Dockerfile:54 - Missing Content Security Policy headers
  5. docker-compose.yml:16-17 - Credentials mounting issue needs resolution (documented in PR)

⚡ Performance Issues

  1. Backend build time - 15+ minutes is excessive. Key optimizations:

    • Use --frozen flag on uv sync (lines 30, 53)
    • Consider multi-stage build to exclude build-essential from final image
    • Could reduce 350MB image size significantly
  2. Frontend Dockerfile:61 - Missing nginx optimizations (gzip, caching)

  3. docker-compose.yml:28 - LangSmith tracing enabled by default could generate unexpected API costs

🏗️ Production Readiness

  1. Missing health checks - Both Dockerfiles should include HEALTHCHECK instructions
  2. Gunicorn configuration - Good but missing graceful shutdown timeout
  3. Nginx logging - Could be improved for better observability
  4. docker-compose.prod.yml - Assumes external reverse proxy (should document)

📋 Additional Observations

Code Quality: ✅ Clean, well-commented Dockerfiles with clear stage separation

Testing: ⚠️ PR notes QA incomplete for docker/compose workflow. The incomplete QA items should be addressed before merge:

  • docker/compose frontend lint/tests
  • docker/compose backend checks
  • Local integration testing

Documentation: ✅ Excellent additions. Consider adding:

  • Troubleshooting section for credentials mounting
  • Service health verification steps
  • Performance comparison notes
  • Multi-arch build details

🎯 Recommendations

Must address before merge:

  1. Fix uv installation security issue (backend/Dockerfile:18)
  2. Add --frozen flags to uv sync commands
  3. Complete QA testing checklist
  4. Resolve credentials mounting issue

Should address:

  1. Implement Docker secrets for sensitive data
  2. Add health checks to both services
  3. Add CSP headers to frontend
  4. Optimize build time and image size

Nice to have:

  1. Add troubleshooting documentation
  2. Implement multi-stage optimization for backend
  3. Add nginx performance tuning
  4. Document reverse proxy requirements

Overall Assessment

This is a valuable addition that enables containerized development and deployment. The implementation is solid but needs security and performance improvements before production use. The ~15 minute build time and 350MB image size are concerning and should be investigated.

Recommendation: Address security issues and complete QA testing, then merge. Performance optimizations can be follow-up work.


@codeforpdx codeforpdx deleted a comment from github-actions bot Feb 15, 2026
&& rm -rf /var/lib/apt/lists/*

# Install uv
RUN curl -LsSf https://astral.sh/uv/install.sh | sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Avoid piping curl to shell

The current installation method curl -LsSf https://astral.sh/uv/install.sh | sh is a security antipattern. If the remote server is compromised or there's a MITM attack, arbitrary code could be executed during build.

Recommendation: Use the official Docker image or install from a verified package:

# Option 1: Copy from official uv image
COPY --from=ghcr.io/astral-sh/uv:latest /uv /usr/local/bin/uv

# Option 2: Install via pip (if available)
RUN pip install uv==0.5.21  # Pin specific version

FROM base AS dev

# Install all dependencies including dev group
RUN uv sync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance: Missing --frozen flag

Running uv sync without --frozen can lead to lock file changes during build, which increases build time and creates non-reproducible builds.

Recommendation:

RUN uv sync --frozen

This ensures dependencies are installed exactly as specified in uv.lock without attempting to resolve or update.

FROM base AS deploy

# Install only production dependencies
RUN uv sync --no-dev
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance: Missing --frozen flag (deploy stage)

Same issue as dev stage - should use --frozen to prevent lock file resolution during build.

RUN uv sync --no-dev --frozen

Comment on lines +67 to +75
CMD ["uv", "run", "gunicorn", \
"--timeout", "300", \
"--capture-output", \
"--access-logfile", "-", \
"--error-logfile", "-", \
"--log-level", "info", \
"-w", "4", \
"-b", "0.0.0.0:5000", \
"tenantfirstaid.app:app"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Production Readiness: Missing health check endpoint

The Gunicorn configuration looks solid, but consider adding health check configuration and signal handling:

# Add healthcheck
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:5000/health || exit 1

# Improve graceful shutdown
CMD ["uv", "run", "gunicorn", \
     "--timeout", "300", \
     "--graceful-timeout", "30", \
     "--keep-alive", "5", \
     "--capture-output", \
     "--access-logfile", "-", \
     "--error-logfile", "-", \
     "--log-level", "info", \
     "-w", "4", \
     "-b", "0.0.0.0:5000", \
     "tenantfirstaid.app:app"]

This requires adding a /health endpoint to the Flask app.

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
curl \
build-essential \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance: Consider multi-stage optimization

The current approach installs build-essential in the base stage but only needs it during dependency installation. Consider a multi-stage approach to reduce final image size:

# Build stage with compilers
FROM python:3.12-slim AS builder
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*
# ... install dependencies here

# Runtime stage without build tools
FROM python:3.12-slim AS base
RUN apt-get update && apt-get install -y --no-install-recommends \
    curl \
    && rm -rf /var/lib/apt/lists/*
COPY --from=builder /root/.local /root/.local
# ... copy installed packages

This could significantly reduce the 350MB image size mentioned in the PR.

try_files $uri /index.html;
}

# Security headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Add CSP and additional hardening headers

Good security headers present, but consider adding Content Security Policy and additional hardening:

# Content Security Policy
add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self'; connect-src 'self' http://localhost:5000;" always;

# Additional security headers
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
add_header Permissions-Policy "geolocation=(), microphone=(), camera=()" always;

Note: You may need to adjust CSP based on actual frontend requirements (external fonts, CDN assets, etc.).

}
EOF

EXPOSE 80
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Production Readiness: Add healthcheck and proper logging

Consider adding nginx healthcheck and improving logging configuration:

HEALTHCHECK --interval=30s --timeout=3s \
  CMD wget --quiet --tries=1 --spider http://localhost/health || exit 1

# Configure nginx for better logging
RUN cat > /etc/nginx/nginx.conf <<'EOF'
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    
    access_log /var/log/nginx/access.log main;
    sendfile on;
    keepalive_timeout 65;
    gzip on;
    
    include /etc/nginx/conf.d/*.conf;
}
EOF

Comment on lines +16 to +17
# Mount Google credentials from host
- ${GOOGLE_APPLICATION_CREDENTIALS_HOST}:/app/secrets/google-creds.json:ro
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Credentials mounting issue

You noted in the PR description that mounting Google credentials results in a directory instead of a file. This is a known Docker/macOS issue.

Root Cause: When the host path doesn't exist or is incorrectly specified, Docker creates it as a directory.

Solutions:

  1. Verify the path exists before running docker-compose:

    test -f "${GOOGLE_APPLICATION_CREDENTIALS_HOST}" || echo "Credentials file not found!"
  2. Use a more robust mounting approach:

    volumes:
      # Mount entire .config/gcloud directory
      - ~/.config/gcloud:/root/.config/gcloud:ro
    environment:
      - GOOGLE_APPLICATION_CREDENTIALS=/root/.config/gcloud/application_default_credentials.json
  3. For production, use Docker secrets instead of bind mounts (already done in docker-compose.prod.yml).

- "5000:5000"
volumes:
# Mount source for hot reload
- ./backend/tenantfirstaid:/app/tenantfirstaid:ro
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Development Experience: Read-only volumes limit functionality

The :ro (read-only) flag on source volumes prevents some development workflows:

  • Can't run formatters/linters that modify files in place
  • Can't generate coverage reports
  • Can't create new test files from within container

Recommendation: Remove :ro for development volumes or make it configurable:

volumes:
  - ./backend/tenantfirstaid:/app/tenantfirstaid  # writable for dev tools
  - ./backend/tests:/app/tests
  - ./backend/scripts:/app/scripts

For security-sensitive mounts (credentials), keep :ro.

- VERTEX_AI_DATASTORE=${VERTEX_AI_DATASTORE}
- MODEL_NAME=${MODEL_NAME:-gemini-2.5-pro}
- LOG_LEVEL=${LOG_LEVEL:-DEBUG}
- LANGSMITH_API_KEY=${LANGSMITH_API_KEY}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configuration: Verify LangSmith defaults

LangSmith tracing is enabled by default in the dev environment. This could:

  1. Generate significant API usage/costs if developers aren't aware
  2. Send data to external service during local dev
  3. Cause issues if LANGSMITH_API_KEY is not set

Recommendation:

- LANGSMITH_TRACING=${LANGSMITH_TRACING:-false}  # Opt-in instead of opt-out
- LANGCHAIN_TRACING_V2=${LANGCHAIN_TRACING_V2:-false}

Update .env.docker.example to document when developers should enable this.

Comment on lines +6 to +7
ports:
- "127.0.0.1:5000:5000"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Production Configuration: Bind to localhost only

Good security practice binding to 127.0.0.1 (prevents direct external access), but this assumes:

  1. A reverse proxy (nginx, traefik) will handle external traffic
  2. The proxy runs on the same host or has network access

Recommendation: Document this requirement in Architecture.md and consider adding the reverse proxy to the compose file:

services:
  nginx-proxy:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - frontend
      - backend

environment:
- ENV=prod
- FLASK_ENV=production
- FLASK_SECRET_KEY=${FLASK_SECRET_KEY}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: FLASK_SECRET_KEY in environment

Passing FLASK_SECRET_KEY via environment variable is less secure than using Docker secrets for production.

Recommendation:

secrets:
  flask_secret:
    external: true

services:
  backend:
    secrets:
      - flask_secret
    environment:
      - FLASK_SECRET_KEY_FILE=/run/secrets/flask_secret

Then update Flask app to read from file:

secret_key_file = os.getenv('FLASK_SECRET_KEY_FILE')
if secret_key_file:
    with open(secret_key_file) as f:
        app.config['SECRET_KEY'] = f.read().strip()

LOG_LEVEL=DEBUG

# Flask configuration
FLASK_SECRET_KEY=dev-secret-key-unsafe
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security: Unsafe default for production

The example includes FLASK_SECRET_KEY=dev-secret-key-unsafe which could accidentally be used in production.

Recommendation:

# Flask configuration
# WARNING: Generate a secure random key for production!
# python -c 'import secrets; print(secrets.token_hex(32))'
FLASK_SECRET_KEY=dev-secret-key-CHANGE-THIS-IN-PRODUCTION

Consider adding a startup check in the Flask app that errors if the secret key matches known insecure defaults.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request infrastructure Pull requests related to infrastructure and underlying workflows Not ready

Projects

None yet

Development

Successfully merging this pull request may close these issues.

improve local testing and deployments with docker containers

1 participant