Problem
Sporadic 500 Internal Server Error responses for the /statement route in production, despite the route working correctly most of the time.
Investigation Findings
Local Reproduction Attempt
- Environment: Main branch, en-only mode (PARAGLIDE_LOCALES=en)
- Test method: 30+ concurrent requests to the /statement route via netlify serve
- Result: All requests returned 200 status codes
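For anyone repeating this test, a minimal sketch of a concurrent probe follows. It is not the script used in the investigation; the helper names `tallyStatuses` and `probe` are hypothetical, and the URL in the usage comment assumes the local server listens on port 8888, which should be checked against what netlify serve prints.

```typescript
// Hypothetical load probe: fires `count` requests at once and tallies status codes.
type FetchLike = (url: string) => Promise<{ status: number }>;

/** Count how often each status code occurred (0 stands for a network-level failure). */
function tallyStatuses(statuses: number[]): Map<number, number> {
  const tally = new Map<number, number>();
  for (const status of statuses) {
    tally.set(status, (tally.get(status) ?? 0) + 1);
  }
  return tally;
}

/** Issue `count` concurrent GETs; a rejected request is recorded as status 0. */
async function probe(url: string, count: number, fetchImpl: FetchLike): Promise<Map<number, number>> {
  const requests = Array.from({ length: count }, () =>
    fetchImpl(url).then(
      (res) => res.status,
      () => 0
    )
  );
  return tallyStatuses(await Promise.all(requests));
}

// Usage (assumed URL, not run automatically):
// probe('http://localhost:8888/statement', 30, fetch).then((t) => console.log([...t.entries()]));
```

Passing the fetch implementation as a parameter keeps the tally logic testable without a running server.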
Root Cause Analysis
The /statement route has graceful error handling that masks most API failures:
- API Endpoint Behavior (/api/signatories):
try {
  // Fetch from Airtable API
  const records = await fetchAllPages(fetch, url)
  // Process and return real data
} catch (e) {
  console.error('Error fetching signatories:', e)
  // Always returns HTTP 200 with fallback data
  return json({
    signatories: fallbackSignatories,
    totalCount: 0
  })
}
- Page Loader (+page.ts):
const response = await fetch('api/signatories');
// No error handling - assumes API always returns 200
const { signatories, totalCount } = await response.json();
Key Insight
Why we see 200s despite errors: The API endpoint catches Airtable failures and returns HTTP 200 with placeholder data ("Error", "This should be" signatories) instead of throwing 500 errors.
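The masking effect can be shown with a simplified stand-in for the endpoint. This is a sketch, not the real handler: `handleSignatories`, the `Signatory` shape, and the `looksLikeFallback` heuristic are all assumptions made for illustration. Only the try/catch shape and the `totalCount: 0` fallback come from the snippet above.

```typescript
// Simplified stand-in for the /api/signatories handler. Names are illustrative.
type Signatory = { name: string };
type Payload = { signatories: Signatory[]; totalCount: number };
type Result = { status: number; body: Payload };

const fallbackSignatories: Signatory[] = [{ name: 'Error' }];

async function handleSignatories(fetchRecords: () => Promise<Signatory[]>): Promise<Result> {
  try {
    const signatories = await fetchRecords();
    return { status: 200, body: { signatories, totalCount: signatories.length } };
  } catch (e) {
    console.error('Error fetching signatories:', e);
    // Same status code as the success path, so callers and load tests still see a 200.
    return { status: 200, body: { signatories: fallbackSignatories, totalCount: 0 } };
  }
}

/** Heuristic (assumption): a payload that lists entries but reports a zero total is fallback data. */
function looksLikeFallback(body: Payload): boolean {
  return body.totalCount === 0 && body.signatories.length > 0;
}
```

Under these assumptions, a status-code check alone cannot tell real data from fallback data; the response body has to be inspected as well.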
Performance Observations
- Without Airtable API key: Sub-second responses (fallback data)
- With Airtable API key: 1-4 second responses (real API calls)
- Implication: Airtable API latency could contribute to timeout-related 500s
Likely Causes of Production 500s
Since local testing only exercised the graceful-degradation path and did not reproduce any 500s, the production failures likely stem from one or more of the following:
- Infrastructure-level failures:
  - Edge function timeouts before reaching API endpoint
  - Netlify platform issues
  - Memory/resource exhaustion
- Race conditions under high load:
  - Concurrent requests overwhelming edge function
  - Airtable API rate limiting causing different error paths
- Unhandled JavaScript errors:
  - Runtime errors that bypass the try/catch in API endpoint
  - Edge function bundling/execution issues
- Network/timeout issues:
  - Airtable API taking >4 seconds (production timeout threshold)
  - DNS resolution failures
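As a generic illustration of how a runtime error can bypass a try/catch in async code, consider a promise that is returned without being awaited. This is not a claim about the actual endpoint, which does await its fetch; the functions below are hypothetical and exist only to show the mechanism.

```typescript
// Hypothetical upstream call that always fails.
async function failing(): Promise<string> {
  throw new Error('upstream failed');
}

// Awaited inside the try block: the rejection is caught and the fallback is returned.
async function caught(): Promise<string> {
  try {
    return await failing();
  } catch (_e) {
    return 'fallback';
  }
}

// Returned without await: the promise settles after the try block has exited,
// so the catch never runs and the caller receives the rejection.
async function escaped(): Promise<string> {
  try {
    return failing();
  } catch (_e) {
    return 'fallback';
  }
}
```

Errors thrown before the try block is entered, for example during module initialisation, would escape in the same way.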
Next Steps (Claude's suggestions - not yet endorsed)
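- Add error handling to page loader:
try {
  const response = await fetch('api/signatories');
  if (!response.ok) throw new Error(`API error: ${response.status}`);
  const { signatories, totalCount } = await response.json();
  return { signatories, totalCount };
} catch (e) {
  // Handle API failures gracefully, e.g. render the page with an empty list
  console.error('Failed to load signatories:', e);
  return { signatories: [], totalCount: 0 };
}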
- Monitor edge function logs in production for specific error patterns
- Consider caching strategy for signatory data to reduce Airtable API dependency
- Add timeout handling for slow Airtable API responses
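The caching and timeout suggestions could be combined along the following lines. This is a rough sketch under stated assumptions, not an endorsed design: `withTimeout` and `createCachedLoader` are hypothetical helpers, the TTL and timeout values would need tuning, and an in-memory cache may not persist between invocations on an edge or serverless platform.

```typescript
/** Reject if `work` has not settled within `ms` milliseconds. Does not cancel the underlying work. */
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

/**
 * Wrap a loader with a TTL cache. A fresh cached value is returned immediately;
 * if a refresh fails or times out, the stale value is served when one exists.
 */
function createCachedLoader<T>(
  load: () => Promise<T>,
  ttlMs: number,
  timeoutMs: number,
  now: () => number = Date.now
): () => Promise<T> {
  let cached: { value: T; fetchedAt: number } | undefined;
  return async function get(): Promise<T> {
    if (cached && now() - cached.fetchedAt < ttlMs) return cached.value;
    try {
      const value = await withTimeout(load(), timeoutMs);
      cached = { value, fetchedAt: now() };
      return value;
    } catch (error) {
      if (cached) return cached.value; // serve stale data rather than fail
      throw error;
    }
  };
}

// Usage (hypothetical loader name and values):
// const getSignatories = createCachedLoader(() => loadFromAirtable(), 60_000, 3_000);
```

Because `withTimeout` only stops waiting, a real implementation would probably also pass an AbortController signal to fetch so the slow request is actually cancelled.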
Environment Details
- Branch tested: main
- Local tool: netlify serve
- Configuration: English-only mode
- Airtable access: Read-only API key configured