Port collision detection and fallback mechanism needed
Problem
Users with existing services running on ports 80/443 (Apache, Nginx, etc.) experience silent failures when running obol stack up. The k3d load balancer fails to bind to these ports, but the error is not surfaced clearly to users.
Evidence: User reported seeing Apache2 default page at http://obol.stack instead of the expected Obol frontend, indicating Apache was already bound to port 80.
Current Port Configuration
internal/embed/k3d-config.yaml:12-24:
    ports:
      - port: 80:80      # Direct binding (conflicts with Apache/Nginx)
        nodeFilters:
          - loadbalancer
      - port: 8080:80    # Alternative binding (works)
        nodeFilters:
          - loadbalancer
      - port: 443:443    # Direct binding (potential conflicts)
        nodeFilters:
          - loadbalancer
      - port: 8443:443   # Alternative binding (works)
        nodeFilters:
          - loadbalancer

Current behavior:
- Binds to BOTH 80:80 and 8080:80
- If port 80 is taken, k3d may fail silently or partially
- No pre-flight check to detect port conflicts
- No clear error message guiding users
Proposed Solutions
Option 1: Pre-flight Port Check (Recommended)
Add a port availability check to obol stack up that runs before the cluster is created:
Check logic:
- Test if ports 80 and 443 are available
- If occupied, provide clear guidance:
    [!] Port 80 is already in use by another service

    Please stop the conflicting service or access obol-stack via:
      http://obol.stack:8080
      https://obol.stack:8443

    To identify the service using port 80:
      sudo lsof -i :80
      # or
      sudo netstat -tulpn | grep :80
Implementation:
- Add checkPortAvailability() function in internal/stack/stack.go
- Use net.Listen() to test port binding (cross-platform)
- Warn users with actionable steps before cluster creation
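
A minimal sketch of what the check could look like, assuming a probe-based design so the logic can be tested without touching real ports. Only the name checkPortAvailability() and the use of net.Listen() come from this issue; the portConflict type, the probeTCP helper, the errPortUnavailable value, and the function signature are hypothetical. Note that on some systems a bind failure on ports below 1024 can also mean missing privileges rather than a conflict, so the sketch keeps the underlying error for the message.

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// errPortUnavailable marks any failure to bind a port (hypothetical name).
var errPortUnavailable = errors.New("port unavailable")

// portConflict pairs a port with the reason it could not be bound.
type portConflict struct {
	Port int
	Err  error
}

// probeTCP tries to listen on the port on all interfaces and releases it
// immediately. Any bind error is wrapped in errPortUnavailable.
func probeTCP(port int) error {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return fmt.Errorf("%w: %v", errPortUnavailable, err)
	}
	return ln.Close()
}

// checkPortAvailability runs the probe for each port and returns the ports
// that failed, in the order they were given.
func checkPortAvailability(ports []int, probe func(int) error) []portConflict {
	var conflicts []portConflict
	for _, p := range ports {
		if err := probe(p); err != nil {
			conflicts = append(conflicts, portConflict{Port: p, Err: err})
		}
	}
	return conflicts
}

func main() {
	for _, c := range checkPortAvailability([]int{80, 443}, probeTCP) {
		fmt.Printf("[!] Port %d cannot be bound: %v\n", c.Port, c.Err)
	}
}
```

The check is inherently racy: a port that is free during the probe can still be taken before k3d binds it, so this is a best-effort warning and not a guarantee.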
Option 2: Port Fallback Mechanism
Make port binding configurable with smart defaults:
Config options:
    # In k3d-config.yaml or via environment variables
    OBOL_HTTP_PORT=80     # Default, fallback to 8080 if occupied
    OBOL_HTTPS_PORT=443   # Default, fallback to 8443 if occupied

Behavior:
- Try to bind to port 80
- If fails, automatically use 8080 instead
- Log clear message about which ports are active
- Update /etc/hosts guidance accordingly
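
The fallback behavior could be sketched roughly as follows. The OBOL_HTTP_PORT variable and the 80 to 8080 fallback come from this issue; the helper names canBind, resolvePort, and portFromEnv are hypothetical, and the sketch assumes the fallback port itself is free, which a real implementation would also need to verify.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
)

// canBind reports whether a TCP listener can be opened on the port right now.
func canBind(port int) bool {
	ln, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// resolvePort returns the preferred port when it is bindable, otherwise the
// fallback. The second result is true when the fallback was chosen.
func resolvePort(preferred, fallback int, bindable func(int) bool) (int, bool) {
	if bindable(preferred) {
		return preferred, false
	}
	return fallback, true
}

// portFromEnv reads a port number from the environment and returns def when
// the variable is unset, not a number, or outside the valid port range.
func portFromEnv(name string, def int) int {
	v, err := strconv.Atoi(os.Getenv(name))
	if err != nil || v < 1 || v > 65535 {
		return def
	}
	return v
}

func main() {
	httpPort, fellBack := resolvePort(portFromEnv("OBOL_HTTP_PORT", 80), 8080, canBind)
	if fellBack {
		fmt.Printf("Preferred HTTP port is unavailable, using %d instead\n", httpPort)
	}
	fmt.Printf("HTTP endpoint: http://obol.stack:%d\n", httpPort)
}
```

Passing the bindable check as a function keeps the selection logic independent of the host's actual port state, which makes it straightforward to unit test.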
Option 3: Remove Direct Port Bindings (Breaking Change)
Only use non-privileged ports by default:
    ports:
      - port: 8080:80    # HTTP only via 8080
        nodeFilters:
          - loadbalancer
      - port: 8443:443   # HTTPS only via 8443
        nodeFilters:
          - loadbalancer

Advantages:
- No conflicts with system services
- No sudo required for port binding on some systems
- Simpler mental model
Disadvantages:
- Breaking change - users must use http://obol.stack:8080
- Less convenient than standard port 80
Recommended Approach
Combination of Option 1 + Option 3:
1. Remove direct port 80/443 bindings (breaking change for v1.0)
   - Only bind 8080:80 and 8443:443
   - Update documentation to use http://obol.stack:8080
   - Simpler, more reliable
2. Add pre-flight port check for 8080/8443
   - Detect conflicts before cluster creation
   - Provide clear error messages with lsof/netstat commands
   - Guide users to stop conflicting services
3. Update all documentation and examples
   - README.md: Use :8080 and :8443 in examples
   - CLAUDE.md: Update architecture docs
   - Default application ingresses: Update host examples
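
For the error messages in step 2, one possible shape is a small helper that renders the guidance for whichever port is occupied. This is only a sketch: conflictGuidance is a hypothetical name, the wording is adapted from the message proposed under Option 1, and the netstat flags shown are the Linux ones from this issue (they differ on macOS).

```go
package main

import (
	"fmt"
	"strings"
)

// conflictGuidance returns the lines of the warning shown when a host port
// is already occupied, including commands to find the owning process.
func conflictGuidance(port int) []string {
	return []string{
		fmt.Sprintf("[!] Port %d is already in use by another service", port),
		"Stop the conflicting service, then re-run: obol stack up",
		fmt.Sprintf("To identify the service using port %d:", port),
		fmt.Sprintf("  sudo lsof -i :%d", port),
		"  # or",
		fmt.Sprintf("  sudo netstat -tulpn | grep :%d", port),
	}
}

func main() {
	// Demo only: print the guidance for both proposed default ports.
	for _, port := range []int{8080, 8443} {
		fmt.Println(strings.Join(conflictGuidance(port), "\n"))
		fmt.Println()
	}
}
```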
Alternative: Document Current Behavior
If we keep current port configuration:
1. Add troubleshooting section to README.md:

       ## Troubleshooting

       ### Port Conflicts

       If you see unexpected content at http://obol.stack (e.g., Apache default page):

       1. Check what's using port 80:
          sudo lsof -i :80
       2. Stop the conflicting service:
          sudo systemctl stop apache2    # Ubuntu/Debian
          sudo brew services stop httpd  # macOS with Homebrew
       3. Or access via alternative port:
          http://obol.stack:8080
          https://obol.stack:8443

2. Add port check to obolup.sh post-install guidance
Files Requiring Changes
- internal/embed/k3d-config.yaml - Port configuration
- internal/stack/stack.go - Add checkPortAvailability() function
- obolup.sh - Add port conflict guidance
- README.md - Update URLs and add troubleshooting
- CLAUDE.md - Update architecture documentation
- Default application values - Update ingress host examples
Related Issues
This is related to the general UX principle of "fail fast with clear errors" rather than silent failures.