Harden telemetry advice fail-open behavior #10

Open
ConanScott wants to merge 3 commits into Axway-API-Management-Plus:main from ConanScott:harden/fail-open-telemetry

Conversation

@ConanScott
Contributor

Summary

  • remove eager/static OpenTelemetry initialization from the woven HTTP server/client helpers
  • add guarded lazy telemetry initialization so setup failures disable telemetry instead of breaking APIM request handling
  • keep the original APIM join point as the source of truth: if telemetry setup or recording fails, the advice still calls pjp.proceed() and leaves its result or exception unchanged
  • log a controlled OpenTelemetry failure/disable message rather than throwing telemetry failures into gateway traffic
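
For reviewers, the pattern described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SpanRecorder`, `GuardedTelemetry`, and `FailOpenAdvice` are hypothetical names, a `Callable` stands in for `pjp.proceed()`, and `System.err` stands in for the gateway's logging.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

/** Hypothetical stand-in for whatever records a span or metric. */
interface SpanRecorder {
    void record(String operation, long durationNanos, Throwable error);
}

/** Guarded lazy holder: tries setup at most once and disables telemetry on any failure. */
final class GuardedTelemetry {
    private final Supplier<SpanRecorder> factory;
    private volatile boolean attempted;
    private volatile SpanRecorder recorder; // stays null when telemetry is disabled

    GuardedTelemetry(Supplier<SpanRecorder> factory) {
        this.factory = factory;
    }

    /** Returns the recorder, or null if setup failed. Never throws. */
    SpanRecorder get() {
        if (!attempted) {
            synchronized (this) {
                if (!attempted) {
                    try {
                        recorder = factory.get();
                    } catch (Throwable t) {
                        // Throwable, not Exception: a missing jar surfaces as NoClassDefFoundError.
                        System.err.println("OpenTelemetry disabled: " + t);
                    } finally {
                        attempted = true; // never retry a failed setup on the request path
                    }
                }
            }
        }
        return recorder;
    }

    boolean isEnabled() {
        return get() != null;
    }
}

/** Around-advice shape: the original call runs exactly once and its outcome is never altered. */
final class FailOpenAdvice {
    private final GuardedTelemetry telemetry;

    FailOpenAdvice(GuardedTelemetry telemetry) {
        this.telemetry = telemetry;
    }

    <T> T around(String operation, Callable<T> proceed) throws Exception {
        long start = System.nanoTime();
        Throwable failure = null;
        try {
            return proceed.call(); // stands in for pjp.proceed()
        } catch (Exception e) {
            failure = e;
            throw e; // the application's own exception is rethrown unchanged
        } finally {
            safeRecord(operation, System.nanoTime() - start, failure);
        }
    }

    /** Recording is best-effort: any telemetry failure is logged and swallowed. */
    private void safeRecord(String operation, long durationNanos, Throwable error) {
        try {
            SpanRecorder r = telemetry.get();
            if (r != null) {
                r.record(operation, durationNanos, error);
            }
        } catch (Throwable t) {
            System.err.println("OpenTelemetry recording failed: " + t);
        }
    }
}
```

In this sketch a factory that throws leaves `get()` returning null, so `around` degrades to a plain pass-through of the original call.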

Validation

Tested on an APIM EC2 host after building the assembled install set with:

./gradlew clean assembleInstall -Papim_folder=/opt/Axway/apigateway/system

With the complete runtime jar set copied into apigateway/ext/lib, Gateway started, ANM login worked, traffic flowed normally, expected OpenTelemetry logging was present, and no errors were logged.

Then opentelemetry-instrumentation-api-2.18.1.jar was moved aside and APIM was restarted. Gateway still started, ANM login worked, and traffic flowed without errors. Telemetry was not recorded in that broken-dependency case, which is the intended fail-open behavior.
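
The broken-dependency case matters because a missing jar typically surfaces as a `ClassNotFoundException` or a `LinkageError` such as `NoClassDefFoundError`, which a `catch (Exception e)` block alone would not intercept. A minimal probe under that assumption is sketched below; `TelemetryClasspathProbe` is a hypothetical helper and is not part of this PR, and the default class name is only assumed to ship in the jar that was moved aside.

```java
/** Hypothetical helper: reports whether a class can be loaded without initializing it. */
final class TelemetryClasspathProbe {

    static boolean isLoadable(String className) {
        try {
            // initialize=false avoids running static initializers during the probe
            Class.forName(className, false, TelemetryClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError, which is an Error, not an Exception
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: this class lives in opentelemetry-instrumentation-api
        String probe = args.length > 0
                ? args[0]
                : "io.opentelemetry.instrumentation.api.instrumenter.Instrumenter";
        System.out.println(probe + " loadable: " + isLoadable(probe));
    }
}
```

A check like this could gate telemetry setup, although catching `Throwable` around the setup itself, as this PR describes, covers the same failure without a separate probe.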

Notes

This is a customer-safety hardening change: telemetry should be best-effort and must not turn dependency/setup failures into API Gateway request failures.

@ConanScott ConanScott marked this pull request as ready for review May 19, 2026 04:53