This document outlines security considerations, threat model, and best practices for deploying pyproc in production environments.
- Python Worker Processes - Execution environment for user code
- Unix Domain Sockets - Communication channel
- Go Application - Process manager and API gateway
- User Data - Data processed by Python functions
| Threat | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Arbitrary code execution in Python | Critical | Medium | Input validation, sandboxing |
| Socket hijacking | High | Low | File permissions, access control |
| Resource exhaustion | Medium | Medium | Resource limits, monitoring |
| Information disclosure | High | Low | Proper error handling, logging |
| Denial of Service | Medium | Medium | Rate limiting, backpressure |
┌─────────────────────────────────────────┐
│         Go Process (Supervisor)         │
│                                         │
│  - Manages worker lifecycle             │
│  - Enforces resource limits             │
│  - Implements access control            │
└─────────────────────────────────────────┘
                     │
             Process Boundary
                     │
┌─────────────────────────────────────────┐
│        Python Workers (Isolated)        │
│                                         │
│  - Separate process space               │
│  - Limited privileges                   │
│  - No direct system access              │
└─────────────────────────────────────────┘
Unix Domain Sockets provide:
- No network exposure - Local-only communication
- Filesystem permissions - OS-level access control
- Process authentication - UID/GID verification
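The UID/GID verification listed above can be done on Linux with the `SO_PEERCRED` socket option, which reports the credentials of the process at the other end of a Unix domain socket. The following is a minimal sketch, not pyproc's actual implementation: `peer_credentials` and `is_trusted_peer` are hypothetical helper names, and the code assumes a Linux kernel.

```python
import os
import socket
import struct

# Layout of Linux's `struct ucred`: pid (int), uid and gid (unsigned int).
_UCRED = struct.Struct("iII")


def peer_credentials(conn: socket.socket) -> tuple[int, int, int]:
    """Return (pid, uid, gid) of the process on the other end of a Unix socket."""
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, _UCRED.size)
    return _UCRED.unpack(raw)


def is_trusted_peer(conn: socket.socket, allowed_uid: int) -> bool:
    """Accept only peers running under the expected UID (hypothetical policy)."""
    _pid, uid, _gid = peer_credentials(conn)
    return uid == allowed_uid


if __name__ == "__main__":
    # Demonstration with a connected socket pair inside one process.
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        print(is_trusted_peer(left, os.geteuid()))
    finally:
        left.close()
        right.close()
```

A worker could call `is_trusted_peer` right after `accept()` and close the connection when it returns `False`, so that filesystem permissions are not the only line of defense.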
# Create dedicated user
useradd -r -s /bin/false pyproc
# Set ownership
chown pyproc:pyproc /var/run/pyproc
# Run workers as unprivileged user
su -s /bin/bash pyproc -c "python worker.py"

// Restrictive socket permissions
cfg := pyproc.SocketConfig{
    Dir:         "/var/run/pyproc",
    Permissions: 0600, // Owner read/write only
}

Always validate input in Python workers:
@expose
def process_data(req):
    # Validate input types
    if not isinstance(req.get("data"), list):
        raise ValueError("Invalid input: data must be a list")
    # Validate input size
    if len(req["data"]) > MAX_INPUT_SIZE:
        raise ValueError("Input too large")
    # Sanitize input values
    data = [sanitize(item) for item in req["data"]]
    return process_safe(data)

# Kubernetes
resources:
  limits:
    memory: "1Gi"

# Docker
docker run --memory="1g" --memory-swap="1g" myapp

# Kubernetes
resources:
  limits:
    cpu: "1000m"

# Docker
docker run --cpus="1.0" myapp

import resource
# Set limits in Python worker
resource.setrlimit(resource.RLIMIT_NOFILE, (1024, 1024))
resource.setrlimit(resource.RLIMIT_NPROC, (100, 100))

# requirements.txt - Pin versions
numpy==1.21.0
pandas==1.3.0
scikit-learn==0.24.2
# Regular updates
pip install --upgrade pip
pip install -r requirements.txt

Never expose internal details in errors:
@expose
def secure_function(req):
    try:
        # Process request
        return process(req)
    except InternalError as e:
        # Log full error internally
        logger.error(f"Internal error: {e}")
        # Return generic error to client
        raise ValueError("Processing failed")

FROM python:3.11-slim
RUN useradd -r pyproc
# Drop all capabilities from the interpreter binary (needs root, so run before USER)
RUN setcap -r /usr/local/bin/python3.11
USER pyproc

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": ["read", "write", "open", "close"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}

# AppArmor profile
profile pyproc_worker {
  # Allow reading worker script
  /app/worker.py r,
  # Allow socket access
  /var/run/pyproc/* rw,
  # Deny network access
  deny network,
  # Deny raw socket
  deny capability net_raw,
}
Monitor these security-relevant metrics:
- Failed authentication attempts
- Unusual resource consumption
- Error rate spikes
- Process crashes
- Socket connection failures
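As one illustration of the metrics above, error rate spikes can be detected with a fixed-size sliding window over recent call outcomes. This is a minimal sketch under stated assumptions: `ErrorRateMonitor` is a hypothetical class that is not part of pyproc, and the window size and threshold are placeholder values to tune per deployment.

```python
from collections import deque


class ErrorRateMonitor:
    """Track the outcome of the most recent calls and flag error-rate spikes."""

    def __init__(self, window: int = 100, threshold: float = 0.2):
        self.outcomes = deque(maxlen=window)  # True means the call failed
        self.threshold = threshold

    def record(self, failed: bool) -> None:
        """Store one call outcome; the oldest outcome drops out automatically."""
        self.outcomes.append(failed)

    def error_rate(self) -> float:
        """Fraction of failed calls in the current window."""
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def is_spiking(self) -> bool:
        # Require a full window so a single early failure does not trigger an alert.
        return (
            len(self.outcomes) == self.outcomes.maxlen
            and self.error_rate() > self.threshold
        )
```

The supervisor or a worker wrapper would call `record()` after each request and raise an alert when `is_spiking()` becomes true.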
// Log security events
logger.Info("worker_started",
    "worker_id", workerID,
    "user", os.Getuid(),
    "socket", socketPath,
)
logger.Warn("invalid_request",
    "method", req.Method,
    "error", err,
    "source", conn.RemoteAddr(),
)

- Monitor logs for anomalies
- Set up alerts for security events
- Track resource usage patterns
- Review audit logs regularly
- Isolate - Stop affected workers
- Investigate - Analyze logs and memory dumps
- Remediate - Patch vulnerabilities
- Recover - Restart with fixes
- Review - Post-mortem analysis
- Input validation in all exposed functions
- Error messages don't leak sensitive info
- Dependencies are pinned and verified
- Code reviewed for security issues
- Static analysis tools run (bandit, gosec)
- Running as non-root user
- Socket permissions set to 0600
- Resource limits configured
- Sandboxing enabled (containers/seccomp)
- Monitoring and alerting configured
- Regular dependency updates
- Security patches applied promptly
- Audit logs reviewed
- Incident response plan tested
- Backup and recovery procedures
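Two of the deployment items above (running as a non-root user, socket permissions set to 0600) are easy to verify automatically at startup or in a health check. The sketch below is an assumption-laden example rather than a pyproc feature: `check_deployment` is a hypothetical helper, and the socket path passed to it is whatever your configuration uses.

```python
import os
import stat


def check_deployment(socket_path: str, expected_mode: int = 0o600) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Checklist item: running as non-root user.
    if os.geteuid() == 0:
        problems.append("process is running as root")

    # Checklist item: socket permissions set to 0600.
    try:
        info = os.stat(socket_path)
    except FileNotFoundError:
        problems.append(f"{socket_path} does not exist")
        return problems

    if not stat.S_ISSOCK(info.st_mode):
        problems.append(f"{socket_path} is not a Unix domain socket")
    mode = stat.S_IMODE(info.st_mode)
    if mode != expected_mode:
        problems.append(
            f"{socket_path} has mode {mode:04o}, expected {expected_mode:04o}"
        )
    if info.st_uid != os.geteuid():
        problems.append(f"{socket_path} is not owned by the current user")
    return problems
```

Logging the returned list and refusing to start when it is non-empty turns the checklist into an enforced precondition.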
Risk: Executing arbitrary Python code
Mitigation:
# NEVER do this
exec(req["code"])  # DANGEROUS!

# Instead, dispatch only to predefined functions
# (predict and process are the worker's own functions)
ALLOWED_FUNCTIONS = {"predict": predict, "process": process}
if req["method"] in ALLOWED_FUNCTIONS:
    result = ALLOWED_FUNCTIONS[req["method"]](req["body"])

Risk: Accessing files outside intended directory
Mitigation:
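import os

def safe_path(base, user_path):
    # Resolve symlinks and ".." to get canonical absolute paths
    real_base = os.path.realpath(base)
    real_path = os.path.realpath(os.path.join(real_base, user_path))
    # Ensure it's within base directory (a plain startswith check would
    # also accept siblings such as "/data-evil" for base "/data")
    if os.path.commonpath([real_base, real_path]) != real_base:
        raise ValueError("Invalid path")
    return real_path

Risk: DoS through resource consumption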
Mitigation:
import signal
import resource
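# Set timeout (with no SIGALRM handler installed, the signal terminates the worker)
signal.alarm(30)  # 30 second timeout
# Limit memory (soft limit 1 GB; -1 is RLIM_INFINITY, i.e. no hard limit)
resource.setrlimit(resource.RLIMIT_AS, (1024*1024*1024, -1))  # 1GB

- Implement encryption at rest for sensitive data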
- Use TLS for any network communication
- Follow data retention policies
- Implement audit trails
- Implement RBAC for multi-tenant scenarios
- Log all access attempts
- Regular access reviews
- Principle of least privilege
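For multi-tenant scenarios, the RBAC, access logging, and least-privilege points above can be combined in a small authorization gate placed in front of method dispatch. This is a minimal sketch: the role names, the `ROLE_PERMISSIONS` table, and the `authorize` function are all hypothetical and not part of pyproc.

```python
import logging

logger = logging.getLogger("pyproc.access")

# Hypothetical role table: each role lists only the methods it needs
# (principle of least privilege).
ROLE_PERMISSIONS = {
    "analyst": frozenset({"predict"}),
    "operator": frozenset({"predict", "process"}),
}


def authorize(tenant: str, role: str, method: str) -> bool:
    """Return True if `role` may call `method`; log every attempt either way."""
    allowed = method in ROLE_PERMISSIONS.get(role, frozenset())
    logger.log(
        logging.INFO if allowed else logging.WARNING,
        "access_attempt tenant=%s role=%s method=%s allowed=%s",
        tenant, role, method, allowed,
    )
    return allowed
```

Unknown roles fall through to an empty permission set, so the default is to deny, and denied attempts are logged at warning level so they stand out during access reviews.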
Stay informed about security updates:
- Watch the pyproc repository
- Subscribe to security advisories
- Regular dependency scanning
- Vulnerability assessments
If you discover a security vulnerability:
- Do NOT open a public issue
- Email security details to: security@example.com
- Include:
  - Description of the vulnerability
  - Steps to reproduce
  - Potential impact
  - Suggested fix (if any)
We aim to respond within 48 hours and provide a fix within 7 days for critical issues.