Add propertiesSupplier support to EmfMetricLoggingPublisher#6735
Add propertiesSupplier support to EmfMetricLoggingPublisher#6735humanzz wants to merge 7 commits intoaws:masterfrom
Conversation
Enable `EmfMetricLoggingPublisher` to accept an optional `Supplier<Map<String, String>>` for enriching EMF records with custom properties at publish time - Add `propertiesSupplier` field and builder method to `EmfMetricLoggingPublisher.Builder` - Add `propertiesSupplier` field, accessor, and builder setter to `EmfMetricConfiguration`, defaulting to empty map when null - Add `resolveProperties()` to `MetricEmfConverter` which invokes the supplier from config once per convert call, handling null returns and exceptions gracefully - Add `writeCustomProperties()` to `MetricEmfConverter` which writes properties first in the EMF JSON so `_aws`, dimensions, and metrics overwrite any key collisions Closes aws#6595
| * or {@code null} to disable custom properties | ||
| * @return this builder | ||
| */ | ||
| public Builder propertiesSupplier(Supplier<Map<String, String>> propertiesSupplier) { |
There was a problem hiding this comment.
Hmm what is the advantage of taking a Supplier as opposed to a Map directly in this case?
There was a problem hiding this comment.
Because the properties are not necessarily static and known at publisher initialization time. An example is if I want to include a request id to correlate to other log lines/metrics within the same overall request
There was a problem hiding this comment.
Yeah that makes sense but it seems like it would be hard in many scenarios to correctly determine that data, especially in a multi tenant or highly concurrent environment given that it doesn't have take any input.
There was a problem hiding this comment.
I'm taking inspiration but simplifying what's happening in aws-embedded-metrics-java for injecting ambient context into EMF output.
In that library, the Environment interface defines a configureContext(MetricsContext) method that each environment implementation uses to enrich EMF records with runtime properties. For example, LambdaEnvironment.configureContext() injects executionEnvironment, functionVersion, logStreamId, and traceId by reading environment variables — notably, it also takes no per-request input, it just reads from ambient state. These properties end up as top-level keys in the EMF JSON.
The Supplier<Map<String, String>> is the same concept, just simplified. Instead of requiring users to implement a full Environment interface, we provide a lightweight Supplier callback that does the same thing — provide ambient context at publish time. The supplier implementer would have the responsibility to define their own strategy for dealing with multi-tenancy or concurrency, if required — though in Lambda environments, where this publisher makes the most sense, each invocation runs in its own execution context so ambient state is naturally request-scoped.
There was a problem hiding this comment.
Also, another thing I wanna call out — I went with Map<String, String> rather than Map<String, Object>. The EMF spec allows any valid JSON type for root node members, and the aws-embedded-metrics-java library uses Map<String, Object> for its putProperty. I chose String to keep the implementation simple — the primary use case is contextual metadata like request IDs and trace IDs, and String covers that well. Using Object would require type-dispatching in the serializer since JsonWriter has separate overloads for String, long, double, boolean, etc., plus handling for unsupported types. If needed, we can always widen to Object later without breaking backward compatibility (narrowing would be a breaking change). Happy to change if you'd prefer Object from the start.
There was a problem hiding this comment.
Thanks for the detailed reply!
- I did take at the interface you liked previously, and I agree on the
Map<String, String>overMap<String, Object>as it does greatly simplify the interface while addressing what we foresee as the majority use case. And as you said, it's not a one-way door. - I agree that in a Lambda environment, the
Supplieris more than sufficient; I was imagining other scenarios where the interface might benefit from information in order to compute the correct map values to enter. One thing I could think of is if a user may want to emit some high cardinality information like ServiceEndpoint as a property, which would be difficult to plumb through with this interface. I think even in this case though this isn't a one way door since we could overload it with aFunctionor something similar in the future. Let me take this discussion back to the rest of the team to get their input and get back to you.
| int customIndex = emfLog.indexOf("\"OperationName\":\"should-be-overwritten\""); | ||
| int realIndex = emfLog.indexOf("\"OperationName\":\"GetObject\""); | ||
| assertThat(customIndex).isGreaterThanOrEqualTo(0); | ||
| assertThat(realIndex).isGreaterThan(customIndex); |
There was a problem hiding this comment.
It seems like we rely on explicit ordering within the EMF object to ensure that the correct dimension is picked up by CloudWatch, which seems brittle. Can we instead address this upstream so that the writer doesn't write custom properties that have names that collide?
There was a problem hiding this comment.
You're absolutely right, relying on write order was brittle and actually produced duplicate keys in the JSON output. That was a miss on my end. I'll push a fix that filters colliding keys upstream in writeCustomProperties before writing, so reserved keys (_aws, dimension names, metric names) are simply skipped. The collision tests will also be updated accordingly.
Closes #6595
Enable
EmfMetricLoggingPublisherto accept an optionalSupplier<Map<String, String>>for enriching EMF records with custom properties at publish time.Motivation and Context
Users of
EmfMetricLoggingPublisherhave no way to add custom metadata (e.g. Lambda request IDs, trace IDs) to EMF records. The aws-embedded-metrics-java library supports this viaputProperty(), but the SDK's built-in publisher does not. This change adds apropertiesSupplierto the builder, evaluated on eachpublish()call, allowing dynamic enrichment of EMF output with searchable properties in CloudWatch Logs Insights. As a user of Lambda Powertools, powertools add default properties to logs/metrics, to allow correlating them to Lambda Invocation's request id. The main property I want to use this is request id (Lambda invocation), so I can correlate the metric records to specific invocations similar to what one gets for free from Powertools.Modifications
propertiesSupplierfield and builder method toEmfMetricLoggingPublisher.BuilderpropertiesSupplierfield, accessor, and builder setter toEmfMetricConfiguration, defaulting to empty map when nullresolveProperties()toMetricEmfConverterwhich invokes the supplier from config once per convert call, handling null returns and exceptions gracefullywriteCustomProperties()toMetricEmfConverterwhich writes custom properties to the EMF JSON, silently skipping any keys that collide with_aws, dimension names, or metric namesTesting
Added 7 tests to
MetricEmfConverterTest:_awsis silently skipped —_awsmetadata object is preservedAdded 3 tests to
EmfMetricLoggingPublisherTest:All 26 tests pass.
mvn clean install -pl :emf-metric-logging-publishersucceeds (compile, checkstyle, tests, javadoc, API compatibility).Screenshots (if appropriate)
N/A
Types of changes
Checklist
mvn installsucceedsscripts/new-changescript and following the instructions. Commit the new file created by the script in.changes/next-releasewith your changes.License