Describe the bug
DatabaseAccount.getConsistencyPolicy() (and the three parallel lazy getters getReplicationPolicy(), getSystemReplicationPolicy(), getQueryEngineConfiguration()) caches its result in a non-volatile, non-final field with no synchronisation. The cached object itself extends JsonSerializable, whose propertyBag field is also non-volatile and non-final. The combination unsafely publishes a partially-constructed ConsistencyPolicy to other threads: a concurrent reader can observe consistencyPolicy != null while consistencyPolicy.propertyBag == null. The next call to getDefaultConsistencyLevel() then NPEs at JsonSerializable.getWithMapping dereferencing this.propertyBag.get(propertyName).
The race re-creates every time GlobalEndpointManager replaces latestDatabaseAccount with a freshly-constructed instance during its background refresh, since the four lazy fields are null again on the new instance.
Concretely in 4.80.0:
|
public ConsistencyPolicy getConsistencyPolicy() { |
|
if (this.consistencyPolicy == null) { |
|
this.consistencyPolicy = super.getObject(Constants.Properties.USER_CONSISTENCY_POLICY, |
|
ConsistencyPolicy.class); |
|
|
|
if (this.consistencyPolicy == null) { |
|
this.consistencyPolicy = new ConsistencyPolicy(); |
|
} |
|
} |
|
return this.consistencyPolicy; |
|
} |
|
transient ObjectNode propertyBag = null; |
|
public JsonSerializable(ObjectNode objectNode) { |
|
this.propertyBag = objectNode; |
|
} |
|
private <T> T getWithMapping(String propertyName, Function<JsonNode, T> mapper) { |
|
JsonNode node = this.propertyBag.get(propertyName); |
|
if (node == null || node.isNull()) { |
|
return null; |
|
} |
|
|
|
return mapper.apply(node); |
|
} |
|
private ConsistencyPolicy consistencyPolicy; |
IIUC, without final on propertyBag or volatile on consistencyPolicy, the JMM doesn't establish a happens-before boundary between the constructor write to propertyBag and a reader that observes the published consistencyPolicy reference. A reader can see the new consistencyPolicy while the constructor write to propertyBag hasn't been propagated, producing the NPE. I think this is covered in the JSR 133 FAQ.
The same broad class of unsafe lazy initialisation was flagged in #9107 and resolved by dropping the cache entirely. IIUC, that specific fix doesn't translate here though... each DatabaseAccount instance's lazy fields are derived from its own propertyBag.
Exception or Stack Trace
java.lang.NullPointerException: Cannot invoke "com.fasterxml.jackson.databind.node.ObjectNode.get(String)" because "this.propertyBag" is null
at com.azure.cosmos.implementation.JsonSerializable.getWithMapping(JsonSerializable.java:417)
at com.azure.cosmos.implementation.JsonSerializable.getString(JsonSerializable.java:356)
at com.azure.cosmos.implementation.ConsistencyPolicy.getDefaultConsistencyLevel(ConsistencyPolicy.java:57)
at com.azure.cosmos.implementation.directconnectivity.GatewayServiceConfigurationReader.getDefaultConsistencyLevel(GatewayServiceConfigurationReader.java:40)
at com.azure.cosmos.implementation.directconnectivity.RequestHelper.getReadConsistencyStrategyToUse(RequestHelper.java:43)
at com.azure.cosmos.implementation.directconnectivity.ConsistencyReader.deduceReadMode(ConsistencyReader.java:344)
at com.azure.cosmos.implementation.directconnectivity.ConsistencyReader.readAsync(ConsistencyReader.java:207)
at com.azure.cosmos.implementation.directconnectivity.ReplicatedResourceClient.invokeAsync(ReplicatedResourceClient.java:160)
at com.azure.cosmos.implementation.directconnectivity.ReplicatedResourceClient.getStoreResponseMono(ReplicatedResourceClient.java:136)
... (reactor scaffolding) ...
at com.azure.cosmos.implementation.query.Paginator.lambda$getPaginatedQueryResultAsObservable$1(Paginator.java:134)
... (reactor scaffolding) ...
at reactor.core.publisher.BlockingIterable.iterator(BlockingIterable.java:86)
at <caller code consuming queryChangeFeed pages>
To Reproduce
The bug is non-deterministic and requires concurrent reads coinciding with GlobalEndpointManager refreshing latestDatabaseAccount. It manifests reliably in production after enough hours of traffic. I haven't been able to land a tight JCStress harness for it in the time I had: the race window between refreshLocationPrivateAsync publishing a new DatabaseAccount and the first concurrent getConsistencyPolicy() call against that instance is small. The reasoning in Describe the bug above should be enough to verify by inspection.
The pattern that matches the observed cases:
- Run any non-trivial concurrent Cosmos workload (multiple in-flight reads/queries).
- Wait for
GlobalEndpointManager to refresh latestDatabaseAccount (default ~5 min, plus on failover triggers).
- With low probability, a concurrent reader hits the NPE above on its first contact with the new
DatabaseAccount.
Code Snippet
To be clear, I haven't tried this but in principle....
CosmosClient client = new CosmosClientBuilder()
.endpoint(endpoint)
.credential(tokenCredential)
.buildClient();
// On N virtual threads, every few seconds per thread:
for (int i = 0; i < N; i++) {
Thread.ofVirtual().start(() -> {
// Any operation routing through ReadConsistencyStrategy resolution.
container.queryChangeFeed(
CosmosChangeFeedRequestOptions.createForProcessingFromNow(FeedRange.forFullRange()),
JsonNode.class
).iterableByPage().forEach(page -> { /* consume */ });
});
}
Expected behavior
Concurrent reads of getConsistencyPolicy() (and its three siblings) should never observe a partially-constructed ConsistencyPolicy.
Some possible fixes include:
- Make
propertyBag final in JsonSerializable (it's only assigned from constructors). This fixes a source of raciness for every subclass at once.
- Make
consistencyPolicy, replicationPolicy, systemReplicationPolicy, queryEngineConfiguration volatile in DatabaseAccount. Maybe worth combining with (1).
- Eagerly initialise these four fields in
DatabaseAccount(ObjectNode) instead of lazily, and make them final. The values are derived from the constructor argument, and final-field semantics remove the race altogether. Unsure though if this is ok (e.g. is it lazy for performance ? or perhaps to not was cycles initialising stuff that is not used?).
- Return new values of these each time.
Gut feeling is (1) feels right.
Screenshots
N/A
Setup (please complete the following information):
- OS: Linux 6.12 (Amazon Linux 2023, aarch64 / AWS Graviton)
- IDE: IntelliJ
- Library/Libraries:
com.azure:azure-cosmos:4.80.0, com.azure:azure-identity:1.18.3
- Java version: JRE 25.0.3
- App Server/Environment: Quarkus 3.35 on Kubernetes
- Frameworks: Quarkus
Exact user-agent strings from the failing spans:
azsdk-java-cosmos/4.80.0 Linux/6.12.68-92.122.amzn2023.aarch64 JRE/25.0.3
azsdk-java-cosmos/4.80.0 Linux/6.12.73-95.123.amzn2023.aarch64 JRE/25.0.3
Additional context
A few signals from 180 days of production traces:
- 7 NPEs across the fleet over 180 days. We run heavily on aarch64, and all 7 happened on aarch64 hosts. The sample is too small and our hardware mix too skewed to assert x86 is unaffected, but a JMM happens-before bug like this would naturally surface more easily on weaker memory architectures.
- All 7 land in change-feed paging. That's where the operation count is by far the largest, not because change feed has a unique code path.
- Spread across 4 different builds of our service over those 180 days, so not a regression in any specific release.
A startup-time eager getConsistencyPolicy() workaround doesn't help, because GlobalEndpointManager constructs a fresh DatabaseAccount on every background refresh and the four lazy fields are null again on the new instance. The only client-side mitigation I can think of is reflectively pre-warming the fields after every refresh, which is worse than the bug 😬.
Information Checklist
Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report
Describe the bug
DatabaseAccount.getConsistencyPolicy()(and the three parallel lazy gettersgetReplicationPolicy(),getSystemReplicationPolicy(),getQueryEngineConfiguration()) caches its result in a non-volatile, non-finalfield with no synchronisation. The cached object itself extendsJsonSerializable, whosepropertyBagfield is also non-volatileand non-final. The combination unsafely publishes a partially-constructedConsistencyPolicyto other threads: a concurrent reader can observeconsistencyPolicy != nullwhileconsistencyPolicy.propertyBag == null. The next call togetDefaultConsistencyLevel()then NPEs atJsonSerializable.getWithMappingdereferencingthis.propertyBag.get(propertyName).The race re-creates every time
GlobalEndpointManagerreplaceslatestDatabaseAccountwith a freshly-constructed instance during its background refresh, since the four lazy fields are null again on the new instance.Concretely in 4.80.0:
azure-sdk-for-java/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/DatabaseAccount.java
Lines 141 to 151 in 260591e
azure-sdk-for-java/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/JsonSerializable.java
Line 68 in 260591e
azure-sdk-for-java/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/JsonSerializable.java
Lines 101 to 103 in 260591e
azure-sdk-for-java/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/JsonSerializable.java
Lines 416 to 423 in 260591e
azure-sdk-for-java/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/DatabaseAccount.java
Line 19 in 260591e
IIUC, without
finalonpropertyBagorvolatileonconsistencyPolicy, the JMM doesn't establish a happens-before boundary between the constructor write topropertyBagand a reader that observes the publishedconsistencyPolicyreference. A reader can see the newconsistencyPolicywhile the constructor write topropertyBaghasn't been propagated, producing the NPE. I think this is covered in the JSR 133 FAQ.The same broad class of unsafe lazy initialisation was flagged in #9107 and resolved by dropping the cache entirely. IIUC, that specific fix doesn't translate here though... each
DatabaseAccountinstance's lazy fields are derived from its ownpropertyBag.Exception or Stack Trace
To Reproduce
The bug is non-deterministic and requires concurrent reads coinciding with
GlobalEndpointManagerrefreshinglatestDatabaseAccount. It manifests reliably in production after enough hours of traffic. I haven't been able to land a tight JCStress harness for it in the time I had: the race window betweenrefreshLocationPrivateAsyncpublishing a newDatabaseAccountand the first concurrentgetConsistencyPolicy()call against that instance is small. The reasoning in Describe the bug above should be enough to verify by inspection.The pattern that matches the observed cases:
GlobalEndpointManagerto refreshlatestDatabaseAccount(default ~5 min, plus on failover triggers).DatabaseAccount.Code Snippet
To be clear, I haven't tried this but in principle....
Expected behavior
Concurrent reads of
getConsistencyPolicy()(and its three siblings) should never observe a partially-constructedConsistencyPolicy.Some possible fixes include:
propertyBagfinalinJsonSerializable(it's only assigned from constructors). This fixes a source of raciness for every subclass at once.consistencyPolicy,replicationPolicy,systemReplicationPolicy,queryEngineConfigurationvolatileinDatabaseAccount. Maybe worth combining with (1).DatabaseAccount(ObjectNode)instead of lazily, and make themfinal. The values are derived from the constructor argument, and final-field semantics remove the race altogether. Unsure though if this is ok (e.g. is it lazy for performance ? or perhaps to not was cycles initialising stuff that is not used?).Gut feeling is (1) feels right.
Screenshots
N/A
Setup (please complete the following information):
com.azure:azure-cosmos:4.80.0,com.azure:azure-identity:1.18.3Exact user-agent strings from the failing spans:
azsdk-java-cosmos/4.80.0 Linux/6.12.68-92.122.amzn2023.aarch64 JRE/25.0.3azsdk-java-cosmos/4.80.0 Linux/6.12.73-95.123.amzn2023.aarch64 JRE/25.0.3Additional context
A few signals from 180 days of production traces:
A startup-time eager
getConsistencyPolicy()workaround doesn't help, becauseGlobalEndpointManagerconstructs a freshDatabaseAccounton every background refresh and the four lazy fields are null again on the new instance. The only client-side mitigation I can think of is reflectively pre-warming the fields after every refresh, which is worse than the bug 😬.Information Checklist
Kindly make sure that you have added all the following information above and checkoff the required fields otherwise we will treat the issuer as an incomplete report