Purpose: Redeploy YieldVault backend API service
RTO Target: 30 minutes
RPO Target: N/A (stateless application)
Last Updated: April 29, 2026
Last Tested:
Use this runbook when:
- Backend service is unresponsive
- Application errors or crashes
- Security patch deployment
- Configuration changes required
- Rollback to previous version needed
- Performance issues requiring restart
Required access:
- SSH access to application servers
- Git repository access
- Docker registry access (if using containers)
- Environment variable access
- Load balancer/proxy access
Required tools:
- git client
- node and npm (v20+)
- docker (if using containers)
- systemctl or process manager (PM2)
- curl for testing
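The tool list above can be checked before starting rather than discovered mid-deployment. The following is a minimal sketch, not part of the original runbook; `check_tools` is a hypothetical helper name, and the list of tools passed to it should match how the service is actually run.

```bash
# Report every required command that is not on PATH.
# Returns 0 when all are present, 1 otherwise.
# check_tools is a hypothetical helper, not an existing utility.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (adjust the list to your setup):
# check_tools git node npm curl || exit 1
```

This only confirms that the commands exist; it does not check versions such as the node v20+ requirement.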
Required information:
- Target version/commit hash
- Environment (production/staging)
- Incident ticket number
- Rollback plan
Pre-deployment checklist:
- Verify backup exists (database, config)
- Notify team via Slack/PagerDuty
- Create incident ticket
- Check current version and document
- Review recent changes in git log
- Verify health of dependencies (database, RPC)
- Prepare rollback plan
- Schedule maintenance window (if planned)
# Check if service is running
systemctl status yieldvault-backend
# Or if using PM2
pm2 status yieldvault-backend
# Or if using Docker
docker ps | grep yieldvault-backend
Expected Output:
● yieldvault-backend.service - YieldVault Backend API
Loaded: loaded (/etc/systemd/system/yieldvault-backend.service)
Active: active (running) since Mon 2026-04-29 10:00:00 UTC
# Check health endpoint
curl -s http://localhost:3000/health | jq .
# Check readiness
curl -s http://localhost:3000/ready | jq .
# Check current version
curl -s http://localhost:3000/api/v1/version | jq .
Document Current State:
Current Version: v1.2.3
Commit Hash: abc123def456
Uptime: 5 days
Health Status: healthy
Active Connections: 45
# Check CPU and memory
top -b -n 1 | grep node
# Check disk space
df -h /var/log /app
# Check open files
lsof -p $(pgrep -f yieldvault-backend) | wc -l

# Navigate to application directory
cd /app/yieldvault-backend
# Document current commit
git rev-parse HEAD > /var/backups/yieldvault/last_deployed_commit.txt
echo "$(date -Iseconds)" >> /var/backups/yieldvault/last_deployed_commit.txt
# Backup current environment file
cp .env .env.backup.$(date +%Y%m%d_%H%M%S)
# Backup current node_modules (optional, for quick rollback)
tar -czf /var/backups/yieldvault/node_modules_$(date +%Y%m%d_%H%M%S).tar.gz node_modules/

# Create maintenance flag file
touch /app/yieldvault-backend/MAINTENANCE_MODE
# Or update load balancer to show maintenance page
# (depends on your infrastructure)

# If using systemd
sudo systemctl stop yieldvault-backend
# If using PM2
pm2 stop yieldvault-backend
# If using Docker
docker stop yieldvault-backend
# Wait for graceful shutdown (max 30 seconds)
sleep 5

# Check process is not running
ps aux | grep yieldvault-backend | grep -v grep
# Should return nothing
# Verify port is free
netstat -tuln | grep :3000
# Should return nothing
If service won't stop:
# Force kill (last resort)
pkill -9 -f yieldvault-backend
# Or for Docker
docker kill yieldvault-backend

# Navigate to application directory
cd /app/yieldvault-backend
# Fetch latest changes
git fetch origin
# Checkout target version
# For latest: git checkout origin/main
# For specific version: git checkout v1.3.0
# For specific commit: git checkout abc123def456
git checkout origin/main
# Verify correct version
git log -1 --oneline
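# Install dependencies (the TypeScript build typically needs devDependencies)
npm ci
# Build TypeScript
npm run build
# Drop devDependencies once the build has succeeded
npm prune --omit=dev
# Verify build succeeded
ls -la dist/
Expected Output: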
HEAD is now at abc123d feat: add new feature
dist/
├── index.js
├── auth.js
├── database.js
└── ...
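The "Verify correct version" step above relies on reading the `git log` output by eye. A minimal sketch of an explicit comparison follows; `commit_matches` and `TARGET_COMMIT` are hypothetical names introduced here for illustration, and the check assumes the target is given as a full or abbreviated commit hash rather than a tag or branch name.

```bash
# Succeeds when the actual commit hash starts with the expected
# (possibly abbreviated) hash. Both arguments are plain strings.
# commit_matches is a hypothetical helper, not part of git.
commit_matches() {
  expected=$1
  actual=$2
  # An empty expected value would match anything, so reject it.
  [ -n "$expected" ] || return 1
  case "$actual" in
    "$expected"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example, run inside /app/yieldvault-backend:
# commit_matches "$TARGET_COMMIT" "$(git rev-parse HEAD)" \
#   || echo "Unexpected commit checked out" >&2
```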
# Pull latest image
docker pull ghcr.io/yieldvault/backend:latest
# Or specific version
docker pull ghcr.io/yieldvault/backend:v1.3.0
# Verify image
docker images | grep yieldvault/backend
# Remove old container
docker rm yieldvault-backend
# Start new container
docker run -d \
--name yieldvault-backend \
--restart unless-stopped \
-p 3000:3000 \
--env-file /app/yieldvault-backend/.env \
-v /var/log/yieldvault:/var/log/yieldvault \
ghcr.io/yieldvault/backend:latest

# Download artifact from CI/CD
aws s3 cp s3://yieldvault-artifacts/backend/v1.3.0.tar.gz /tmp/
# Extract
cd /app
tar -xzf /tmp/v1.3.0.tar.gz
# Install dependencies (if not included)
cd /app/yieldvault-backend
npm ci --production

# Check .env file exists
ls -la /app/yieldvault-backend/.env
# Verify critical variables are set
grep -E "STELLAR_RPC_URL|VAULT_CONTRACT_ID|DATABASE_URL" .env
# Validate environment
node -e "
require('dotenv').config();
const required = ['STELLAR_RPC_URL', 'VAULT_CONTRACT_ID', 'DATABASE_URL'];
const missing = required.filter(k => !process.env[k]);
if (missing.length > 0) {
console.error('Missing:', missing);
process.exit(1);
}
console.log('✓ All required variables present');
"

# Check migration status
npx prisma migrate status
# Run pending migrations
npx prisma migrate deploy
# Verify schema
npx prisma validate
Expected Output:
Database schema is up to date!
✓ Schema is valid
# If using systemd
sudo systemctl start yieldvault-backend
# If using PM2
pm2 start yieldvault-backend
pm2 save
# If using Docker (already started in Step 4)
docker logs -f yieldvault-backend   # press Ctrl+C to stop following

# Wait for service to be ready (-f makes curl fail on HTTP error responses)
for i in {1..30}; do
  if curl -sf http://localhost:3000/health > /dev/null; then
    echo "✓ Service is up!"
    break
  fi
  echo "Waiting for service... ($i/30)"
  sleep 2
done

# Check health endpoint
curl -s http://localhost:3000/health | jq .
# Expected response
{
"status": "healthy",
"timestamp": "2026-04-29T15:30:00.000Z",
"uptime": 5.2,
"environment": "production",
"checks": {
"api": "up",
"database": "up",
"stellarRpc": "up"
}
}

# Check readiness endpoint
curl -s http://localhost:3000/ready | jq .
# Expected response
{
"ready": true,
"timestamp": "2026-04-29T15:30:00.000Z",
"dependencies": {
"database": true,
"stellarRpc": true
}
}

# Check deployed version
curl -s http://localhost:3000/api/v1/version | jq .
# Or check git commit
cd /app/yieldvault-backend
git rev-parse HEAD
# Compare with expected version

# Check application logs
tail -50 /var/log/yieldvault/backend.log
# Check for errors
grep -i error /var/log/yieldvault/backend.log | tail -20
# Check for warnings
grep -i warn /var/log/yieldvault/backend.log | tail -20
# If using systemd
journalctl -u yieldvault-backend -n 50
# If using PM2
pm2 logs yieldvault-backend --lines 50
# If using Docker
docker logs yieldvault-backend --tail 50
Look for:
- ✓ "Server started on port 3000"
- ✓ "Database connected"
- ✓ "Stellar RPC connected"
- ✗ Any error messages
- ✗ Any stack traces
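The "Look for" list above can also be checked mechanically. This is a sketch under assumptions: `check_startup_log` is a hypothetical helper, the three message strings are taken verbatim from the list above, and any line containing "error" in any letter case is treated as a failure.

```bash
# Check a log file for the expected startup messages and for error lines.
# Returns 0 only if all three messages are present and no line
# contains "error" (case-insensitive).
# check_startup_log is a hypothetical helper.
check_startup_log() {
  logfile=$1
  status=0
  for msg in "Server started on port 3000" "Database connected" "Stellar RPC connected"; do
    if grep -qF "$msg" "$logfile"; then
      echo "found:   $msg"
    else
      echo "missing: $msg"
      status=1
    fi
  done
  if grep -qi "error" "$logfile"; then
    echo "error lines present, review the log"
    status=1
  fi
  return "$status"
}

# Example: check only the most recent lines so that errors from
# before the restart are not counted.
# tail -200 /var/log/yieldvault/backend.log > /tmp/recent.log
# check_startup_log /tmp/recent.log
```

The sketch does not detect stack traces that lack the word "error"; those still need a manual look.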
# Test vault summary
curl -s http://localhost:3000/api/v1/vault/summary | jq .
# Test vault metrics
curl -s http://localhost:3000/api/v1/vault/metrics | jq .
# Test health check
curl -s http://localhost:3000/health | jq .
# Test rate limiting (should work)
for i in {1..5}; do
curl -s http://localhost:3000/api/v1/vault/summary > /dev/null
echo "Request $i completed"
done

# Test read operation (URL quoted so the shell does not treat ? as a glob)
curl -s 'http://localhost:3000/api/v1/transactions?limit=1' | jq .
# Test write operation (if safe)
curl -X POST http://localhost:3000/admin/test-write \
-H "Authorization: ApiKey test-key" \
-H "Content-Type: application/json"

# Test Stellar RPC connectivity
curl -s http://localhost:3000/api/v1/vault/total-assets | jq .
# Test email service (if applicable)
# curl -X POST http://localhost:3000/admin/test-email ...

# Remove maintenance flag
rm -f /app/yieldvault-backend/MAINTENANCE_MODE
# Or update load balancer to enable traffic
# (depends on your infrastructure)

# Test from external IP (if different from localhost)
curl -s https://api.yieldvault.finance/health | jq .
# Test through load balancer
curl -s https://api.yieldvault.finance/api/v1/vault/summary | jq .

# Watch logs for errors
tail -f /var/log/yieldvault/backend.log | grep -i error
# Monitor resource usage
watch -n 5 'ps aux | grep yieldvault-backend'
# Check error rate
# (depends on your monitoring setup)

# Document deployment
cat >> /var/log/yieldvault/deployments.log <<EOF
Deployment: $(date -Iseconds)
Version: $(git rev-parse HEAD)
Deployed By: $(whoami)
Incident: INCIDENT-123
Status: Success
EOF

# Send notification via Slack
curl -X POST "$SLACK_WEBHOOK_URL" \
-H 'Content-Type: application/json' \
-d '{
"text": "✅ Backend deployment completed successfully",
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "*Backend Deployment Complete*\n• Version: v1.3.0\n• Downtime: 5 minutes\n• Status: All systems operational"
}
}
]
}'
If deployment fails or causes issues:
# Stop current service
sudo systemctl stop yieldvault-backend
# Restore previous version
cd /app/yieldvault-backend
git checkout "$(head -1 /var/backups/yieldvault/last_deployed_commit.txt)"
# Restore environment from the most recent backup
cp "$(ls -t .env.backup.* | head -1)" .env
# Rebuild (if needed)
npm run build
# Start service
sudo systemctl start yieldvault-backend
# Verify
curl http://localhost:3000/health

# Stop current container
docker stop yieldvault-backend
docker rm yieldvault-backend
# Start previous version
docker run -d \
--name yieldvault-backend \
--restart unless-stopped \
-p 3000:3000 \
--env-file /app/yieldvault-backend/.env \
ghcr.io/yieldvault/backend:v1.2.3

Symptoms: Service fails to start or crashes immediately
Solutions:
- Check logs: journalctl -u yieldvault-backend -n 100
- Verify environment variables: node -e "require('dotenv').config(); console.log(process.env)"
- Check port availability: netstat -tuln | grep :3000
- Verify dependencies: npm ls
- Check file permissions: ls -la /app/yieldvault-backend
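Printing all of `process.env` during an incident also prints secrets such as `DATABASE_URL` into the terminal and any session recording. As a hedged alternative, the sketch below lists only variable names and whether each has a value. `list_env_keys` is a hypothetical helper, and it assumes a simple dotenv file with one `KEY=value` per line (no `export` prefixes, no multi-line values).

```bash
# Print each variable name in a dotenv-style file followed by
# "set" or "empty". Values are never printed.
# list_env_keys is a hypothetical helper.
list_env_keys() {
  envfile=$1
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      ''|'#'*) continue ;;   # skip blank lines and comments
      *=*) ;;                # KEY=value line, handled below
      *) continue ;;         # anything else is ignored
    esac
    key=${line%%=*}
    value=${line#*=}
    if [ -n "$value" ]; then
      echo "$key: set"
    else
      echo "$key: empty"
    fi
  done < "$envfile"
}

# Example:
# list_env_keys /app/yieldvault-backend/.env
```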
Symptoms: "Cannot connect to database" errors
Solutions:
- Verify DATABASE_URL: echo $DATABASE_URL
- Test connection: psql $DATABASE_URL -c "SELECT 1"
- Check database is running: systemctl status postgresql
- Verify network connectivity: ping database-host
- Check firewall rules
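The connectivity step uses the placeholder `database-host`. One way to fill it in is to derive the host from `DATABASE_URL` itself. The sketch below assumes a URL of the form `scheme://user:password@host:port/dbname` with any special characters in the credentials percent-encoded. `db_host_from_url` is a hypothetical helper and does not handle bracketed IPv6 addresses.

```bash
# Extract the host name from a database connection URL.
# db_host_from_url is a hypothetical helper.
db_host_from_url() {
  rest=${1#*://}               # drop the scheme
  authority=${rest%%/*}        # keep everything before the path
  authority=${authority%%\?*}  # drop a query string if there is no path
  hostport=${authority##*@}    # drop user:password@ if present
  printf '%s\n' "${hostport%%:*}"   # drop :port if present
}

# Example:
# ping -c 3 "$(db_host_from_url "$DATABASE_URL")"
```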
Symptoms: Service uses excessive memory
Solutions:
- Check for memory leaks: node --inspect
- Restart service: systemctl restart yieldvault-backend
- Increase memory limit (if needed)
- Review recent code changes
- Enable heap snapshots for analysis
Symptoms: API endpoints respond slowly
Solutions:
- Check database performance: EXPLAIN ANALYZE SELECT ...
- Check Stellar RPC latency: curl -w "@curl-format.txt" $STELLAR_RPC_URL
- Review application logs for slow queries
- Check resource usage: top, iostat
- Enable performance profiling
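The latency check refers to a `curl-format.txt` file that this runbook does not define. The sketch below writes one possible version of it, using curl's standard `--write-out` timing variables; the choice of fields and the helper name `write_curl_format` are assumptions, not the team's actual file.

```bash
# Write a curl --write-out format file with per-phase timings.
# The target path is an argument so nothing is overwritten by accident.
# write_curl_format is a hypothetical helper.
write_curl_format() {
  cat > "$1" <<'EOF'
time_namelookup:    %{time_namelookup}s\n
time_connect:       %{time_connect}s\n
time_appconnect:    %{time_appconnect}s\n
time_starttransfer: %{time_starttransfer}s\n
time_total:         %{time_total}s\n
EOF
}

# Example:
# write_curl_format curl-format.txt
# curl -s -o /dev/null -w "@curl-format.txt" "$STELLAR_RPC_URL"
```

A large gap between `time_connect` and `time_starttransfer` points at the remote service rather than the network.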
After deployment, verify:
- Service is running
- Health check passes
- Readiness check passes
- Correct version deployed
- No errors in logs
- Database connectivity works
- Stellar RPC connectivity works
- Critical endpoints respond
- Rate limiting works
- External access works
- Monitoring shows healthy state
- Stakeholders notified
- Deployment documented
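Several items in this checklist, such as the health and readiness checks, may need a few attempts right after a restart. The wait loop used earlier in this runbook prints progress but does not signal failure if the service never comes up. A minimal sketch of a retry helper that does return a failure status follows; `retry` is a hypothetical function name, and the attempt count and delay in the example are illustrative values.

```bash
# Run a command until it succeeds or the attempt limit is reached.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
# retry is a hypothetical helper.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n/$attempts failed" >&2
    # Sleep only between attempts, not after the last one.
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Example: fail the deployment script if the service is not healthy
# within roughly one minute.
# retry 30 2 curl -sf -o /dev/null http://localhost:3000/health \
#   || { echo "Service did not become healthy" >&2; exit 1; }
```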
Document these metrics for each deployment:
| Metric | Target | Actual | Status |
|---|---|---|---|
| Stop Service | 2 min | ___ min | ___ |
| Deploy Code | 10 min | ___ min | ___ |
| Start Service | 2 min | ___ min | ___ |
| Verification | 5 min | ___ min | ___ |
| Total RTO | 30 min | ___ min | ___ |
| Errors During Deploy | 0 | ___ | ___ |
| Rollback Required | No | ___ | ___ |
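The "Actual" column is easier to fill in if each phase is timed as it runs. The sketch below shows one way to do that with whole-second resolution; `time_step` and `within_target` are hypothetical helpers, and the 30-minute figure in the example is the RTO target from the table.

```bash
# Run a command and print how long it took, in whole seconds.
# The command's exit status is passed through.
# time_step is a hypothetical helper.
time_step() {
  label=$1
  shift
  start=$(date +%s)
  rc=0
  "$@" || rc=$?
  end=$(date +%s)
  echo "$label: $((end - start))s"
  return "$rc"
}

# Succeeds when an elapsed time in seconds is within a target in minutes.
# within_target is a hypothetical helper.
within_target() {
  elapsed_seconds=$1
  target_minutes=$2
  [ "$elapsed_seconds" -le $((target_minutes * 60)) ]
}

# Example:
# time_step "Stop Service" sudo systemctl stop yieldvault-backend
# within_target 1500 30 && echo "Total RTO met"
```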
For zero-downtime deployments:
# Deploy to green environment
ssh green-server
cd /app/yieldvault-backend
git pull
npm ci
npm run build
# Start green service on different port
PORT=3001 npm start
# Verify green is healthy
curl http://localhost:3001/health

# Update load balancer to point to green
# (depends on your infrastructure)
# Verify traffic is flowing to green
curl https://api.yieldvault.finance/health
# Monitor for issues
# If issues, switch back to blue

# Stop blue service
ssh blue-server
sudo systemctl stop yieldvault-backend
# Green becomes new blue for next deployment

| Role | Name | Phone | |
|---|---|---|---|
| DevOps Lead | TBD | TBD | TBD |
| Backend Lead | TBD | TBD | TBD |
| On-Call Engineer | TBD | TBD | TBD |
| Team Lead | TBD | TBD | TBD |
PagerDuty: [Escalation Policy Link]
Slack Channel: #yieldvault-incidents
Last Updated: April 29, 2026
Next Review: July 29, 2026
Tested: