chore: Handle Fallback From FDv2 to FDv1#339
Conversation
Add an fdv1Fallback boolean to FDv2SourceResult and FDv2PayloadResponse so that FDv2 data sources can signal when the server's x-ld-fd-fallback header indicates a fallback to FDv1. Existing factory methods retain their signatures and default to false; new overloads accept the flag explicitly. Made-with: Cursor
Extend ModeDefinition and ResolvedModeDefinition with a separate list of FDv1 fallback synchronizer factories. These are kept distinct from the regular synchronizer list so that the orchestration layer can manage their blocked/unblocked state independently. The existing two-argument constructor delegates to the new three-argument form with an empty fallback list, preserving backward compatibility. Made-with: Cursor
Read the x-ld-fd-fallback response header in the polling requestor and streaming synchronizer, and propagate the fallback signal through every FDv2SourceResult produced by these components. The header is checked case-insensitively and carried on both success and error paths so the orchestration layer can act on it regardless of how the response was classified. Made-with: Cursor
Implement a standalone FDv1 polling synchronizer that fetches flag data from the legacy /sdk/evalx endpoints and produces FDv2SourceResult values. This allows the FDv2 orchestration layer to fall back to FDv1 polling without depending on the existing PollingDataSource / HttpFeatureFlagFetcher stack, which is tightly coupled to the FDv1 DataSourceUpdateSink. Uses LDUtil.isHttpErrorRecoverable() for FDv2-style error classification (INTERRUPTED vs TERMINAL_ERROR). Made-with: Cursor
Connect the fallback signal to the orchestration layer end-to-end. The FDv2DataSource constructor now accepts an optional list of FDv1 fallback synchronizer factories, wrapping them as blocked entries appended after the regular synchronizers. When a synchronizer result carries fdv1Fallback=true, the run loop triggers SourceManager's fdv1Fallback() method, which blocks all FDv2 synchronizers, unblocks the FDv1 slot, and resets the scan index under the active source lock. The builder configures an FDv1PollingSynchronizer in the default mode table for STREAMING, POLLING, and BACKGROUND modes (not OFFLINE or ONE_SHOT, which have no synchronizers to fall back from). Made-with: Cursor
...droid-client-sdk/src/main/java/com/launchdarkly/sdk/android/subsystems/FDv2SourceResult.java
Show resolved
Hide resolved
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/HeaderConstants.java
Show resolved
Hide resolved
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/ModeDefinition.java
Show resolved
Hide resolved
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Show resolved
Hide resolved
| logger.info("Server signaled FDv1 fallback; switching to FDv1 polling synchronizer."); | ||
| sourceManager.fdv1Fallback(); | ||
| running = false; | ||
| } |
There was a problem hiding this comment.
Notes from code review:
- Will running ever be false here (at the beginning of the if statement)?
There was a problem hiding this comment.
I've added a comment to this block. There are a couple situations where running is false: during SHUTDOWN, and as a result of TERMINAL_ERROR.
| } | ||
| } | ||
|
|
||
| private FDv2SourceResult doPoll() { |
There was a problem hiding this comment.
Code review notes:
- Can we safely refactor the production FDv1 code to reuse this entire doPoll() function?
- Similarly, can we reuse the existing code on lines 87-101?
There was a problem hiding this comment.
I haven't addressed this yet
There was a problem hiding this comment.
In my latest push, I've refactored this new FDv1PollingSynchronizer by injecting a FeatureFetcher object. That's the same helper class that the existing PollingDataSource uses 👍
Remove single-argument overloads from FDv2SourceResult, FDv2PayloadResponse, and ModeDefinition, keeping only the versions that include the fdv1Fallback parameter. Update all call sites across production and test code. Made-with: Cursor
…nstead of list Aligns with java-core and js-core implementations, which both model FDv1 fallback as a single optional synchronizer rather than a list. Made-with: Cursor
|
@tanderson-ld In the very last commit (titled "persist fdv1 fallback state ..."), I addressed your concern about staying in "FDv1 fallback mode" when the platform state changes. Interestingly, this type of protection isn't present in the js-core implementation. I'll bring this up w/ Ryan tomorrow to see if he's aware of that gap, or if he chose not to make those changes for some reason. |
...rkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/DefaultFDv2Requestor.java
Show resolved
Hide resolved
...kly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSourceBuilder.java
Outdated
Show resolved
Hide resolved
Check for x-ld-fd-fallback during the initializer phase, not just during synchronization. When an initializer response carries the fallback signal, skip remaining initializers and transition directly to the FDv1 synchronizer. Aligns with js-core's implementation. Made-with: Cursor
b162d60 to
dddc21f
Compare
…onses The notModified() factory previously hardcoded fdv1Fallback to false, silently discarding the x-ld-fd-fallback header on 304 responses. Now the computed value is forwarded, aligning with the CSFDV2 spec and js-core behavior. Made-with: Cursor
Mirror how PollingDataSource delegates HTTP concerns to a shared FeatureFetcher rather than managing its own OkHttp client and URI construction. The synchronizer now calls fetcher.fetch() for the raw JSON and handles only the FDv2 adaptation (ChangeSet conversion, error classification, scheduling). Made-with: Cursor
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Outdated
Show resolved
Hide resolved
…mpty selector The non-empty selector early return in runInitializers() bypassed the fdv1Fallback check, so a polling initializer that returned a fully current payload with x-ld-fd-fallback: true would silently proceed to FDv2 synchronizers instead of switching to the FDv1 fallback. Made-with: Cursor
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Show resolved
Hide resolved
…hronizer loop The fdv1Fallback check required the synchronizer to still be running, but TERMINAL_ERROR set running=false first, silently discarding the fallback signal. Removed the running guard so a terminal error response carrying x-ld-fd-fallback: true correctly transitions to the FDv1 synchronizer. Also added a test for the equivalent initializer path. Made-with: Cursor
…chronizer The synchronizer always owns its fetcher in production, and the test mock's close() is already a no-op, so the flag was unnecessary. Made-with: Cursor
tanderson-ld
left a comment
There was a problem hiding this comment.
Still reviewing, but posting current comments at my end of day. Will resume tomorrow.
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Outdated
Show resolved
Hide resolved
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Outdated
Show resolved
Hide resolved
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Show resolved
Hide resolved
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Outdated
Show resolved
Hide resolved
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Outdated
Show resolved
Hide resolved
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Show resolved
Hide resolved
...y-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv1PollingSynchronizer.java
Outdated
Show resolved
Hide resolved
… FDv1PollingSynchronizer Process all results and errors directly inside the FeatureFetcher callback so the future carries a fully-formed FDv2SourceResult. This eliminates the need to unwrap ExecutionException to inspect application-level error types, keeping error classification at the callback layer where it belongs. Made-with: Cursor
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Show resolved
Hide resolved
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Show resolved
Hide resolved
...kly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSourceBuilder.java
Outdated
Show resolved
Hide resolved
launchdarkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSource.java
Outdated
Show resolved
Hide resolved
| initialDelayMillis, | ||
| pollIntervalMillis, | ||
| TimeUnit.MILLISECONDS); | ||
| } |
There was a problem hiding this comment.
Shared single-thread executor risks deadlock with blocking poll
Medium Severity
FDv1PollingSynchronizer schedules its recurring poll task on inputs.getSharedExecutor() — the same 4-thread pool used by FDv2DataSource's main loop, condition timers, and other synchronizers. The doPoll() method blocks the executor thread via resultFuture.get() while waiting for the async HTTP callback. Combined with the main data source loop also occupying a thread on the same pool, this could lead to thread starvation or deadlock under load, especially during the transition period when both FDv2 and FDv1 components are active.
There was a problem hiding this comment.
I think this problem is theoretical only. In practice, when the fallback synchronizer is in use, all the other initializers and synchronizers will be closed. In other words, only one synchronizer runs at a time.
Regarding the statement "... especially during the transition period when both FDv2 and FDv1 components are active": The old synchronizer is fully closed before the new one is even built. There's no window where both are running simultaneously. I'm not sure what "transition period" is referring to.
There was a problem hiding this comment.
Maybe see if you can figure out why Cursor bugbot thinks and other synchronizers is possible at the same time.
There was a problem hiding this comment.
I asked Cursor to analyze this statement a few different ways. In each case, it seems like Cursor generally knew how the shared executor worked because of the comments in that class's definition. But when it made this comment on my PR, it hadn't taken one step further to understand exactly how many threads were being used, and how the threads are being juggled.
After writing this, it seems like adding a little bit more to the comments may help in the future.
Move the FDv1 fallback check to a single location before the result type switch statement. This ensures the fallback signal is honored regardless of selector state or result type, eliminating duplicated logic and a potential path where tryCompleteStart could be skipped. Made-with: Cursor
| * | ||
| * @return a FDv1 polling synchronizer builder | ||
| */ | ||
| public static FDv1PollingSynchronizerBuilder fdv1PollingSynchronizer() { |
There was a problem hiding this comment.
We don't want this to be public since fdv1 fallback is an LD internal concern. I don't think you necessarily have to put this in this class, but it should not be public if so.
There was a problem hiding this comment.
In that last push, I've moved makeDefaultModeTable() from DataSystemBuilder to DataSystemComponents. Then I updated the code so FDv1PollingSynchronizerBuilderImpl now implements DataSourceBuilder directly, instead of extending this FDv1PollingSynchronizerBuilder (which I deleted).
| initialDelayMillis, | ||
| pollIntervalMillis, | ||
| TimeUnit.MILLISECONDS); | ||
| } |
There was a problem hiding this comment.
Maybe see if you can figure out why Cursor bugbot thinks and other synchronizers is possible at the same time.
| if (result.isFdv1Fallback() | ||
| && sourceManager.hasFDv1Fallback() | ||
| && !sourceManager.isCurrentSynchronizerFDv1Fallback()) { | ||
| logger.info("Server signaled FDv1 fallback; switching to FDv1 polling synchronizer."); |
There was a problem hiding this comment.
Make private const for this message and commonize with other log location.
…default mode table into DataSystemComponents Move makeDefaultModeTable() from DataSystemBuilder into DataSystemComponents so the FDv1 fallback synchronizer can be fully package-private. Remove the public FDv1PollingSynchronizerBuilder and fdv1PollingSynchronizer() factory method to prevent external customization of the fallback path. Made-with: Cursor
...rkly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/DataSystemComponents.java
Show resolved
Hide resolved
…vives custom mode overrides The FDv1 fallback synchronizer now uses the background poll interval (60 min) when in background mode instead of always using the foreground rate (5 min). Customer mode overrides preserve the default FDv1 fallback when the custom mode has initializers or synchronizers. Consolidate DEFAULT_POLL_INTERVAL_MILLIS into LDConfig as a single source of truth. Made-with: Cursor
...id-client-sdk/src/main/java/com/launchdarkly/sdk/android/integrations/DataSystemBuilder.java
Show resolved
Hide resolved
...id-client-sdk/src/main/java/com/launchdarkly/sdk/android/integrations/DataSystemBuilder.java
Outdated
Show resolved
Hide resolved
|
Looks good to me, just that one bugbot null check needed and benchtesting results. |
Made-with: Cursor
...kly-android-client-sdk/src/main/java/com/launchdarkly/sdk/android/FDv2DataSourceBuilder.java
Show resolved
Hide resolved
...id-client-sdk/src/main/java/com/launchdarkly/sdk/android/integrations/DataSystemBuilder.java
Show resolved
Hide resolved
| // TODO: Arrays.asList(cacheInitializer) — add once implemented | ||
| Collections.<DataSourceBuilder<Initializer>>emptyList(), | ||
| Collections.singletonList(pollingSynchronizer), | ||
| fdv1FallbackPollingSynchronizerForeground |
There was a problem hiding this comment.
Shared FDv1 fallback builder instance across STREAMING and POLLING modes
Low Severity
The same fdv1FallbackPollingSynchronizerForeground builder instance is shared between the STREAMING and POLLING mode definitions. Since FDv1PollingSynchronizerBuilderImpl is mutable (it has a settable pollIntervalMillis), if any code path ever mutates this shared builder after table construction, it would affect both modes. While currently safe because the builder is not further modified, sharing a mutable builder across multiple ModeDefinition entries is fragile.
Reviewed by Cursor Bugbot for commit 3afd09a. Configure here.
There was a problem hiding this comment.
This is also by design. I think the alternative is to create two separate FDv1PollingSycnhronizer builders (and i believe only their polling durations would be different).
| resultQueue.put(FDv2SourceResult.status( | ||
| FDv2SourceResult.Status.interrupted( | ||
| new LDFailure("Failed to parse SSE event", e, | ||
| LDFailure.FailureType.INVALID_RESPONSE_BODY)))); |
There was a problem hiding this comment.
Streaming ping handler loses fallback signal on poll failure
Low Severity
In handleMessage, the early return for ping events bypasses the new isFdv1Fallback(event.getHeaders()) check. The ping handler delegates to FDv2PollingBase.doPoll(), which reads the fallback header from the poll response. However, if the poll request fails with a transport-level exception (not an HTTP error), doPoll() hardcodes fdv1Fallback = false in its InterruptedException/ExecutionException catch blocks, losing the fallback signal that was present in the SSE connection's response headers carried by event.getHeaders(). Eventually self-corrects via subsequent events or connection errors, but creates a detection gap.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit c6beb5c. Configure here.
There was a problem hiding this comment.
I don't know if the actual "ping" response will contain the fallback header. But when the SDK receives a "ping" message, it immediately polls to get data (per the spec), and the response from the poll is inspected for a fallback header. I think that's good enough.
Made-with: Cursor
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 5 total unresolved issues (including 3 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 16b9f8f. Configure here.
| LDUtil.makeHttpProperties(inputs.getHttp()), | ||
| inputs.getCacheDir(), | ||
| inputs.getBaseLogger() | ||
| ); |
There was a problem hiding this comment.
FDv1 fallback polling uses wrong endpoint path
High Severity
The FDv1PollingSynchronizerBuilderImpl creates an HttpFeatureFlagFetcher using inputs.getServiceEndpoints().getPollingBaseUri() as the polling URI. However, getPollingBaseUri() returns the FDv2 polling base URI. The FDv1 fallback synchronizer needs to reach the legacy FDv1 /sdk/evalx endpoints, but HttpFeatureFlagFetcher appends FDv1 paths (StandardEndpoints.POLLING_REQUEST_GET_BASE_PATH) to whatever base URI it receives. If the FDv2 polling base URI differs from the FDv1 polling base URI (which is the case when custom service endpoints are configured separately for FDv2), the requests will be sent to the wrong host or path.
Reviewed by Cursor Bugbot for commit 16b9f8f. Configure here.
There was a problem hiding this comment.
Bugbot is confused between the base URI and the full path. The pollingBaseUri is really just the base url, and it's the same between FDv1 and FDv2. Then the correct path is added to the baseUri when the fetch() function is called.
| pollIntervalMillis, | ||
| TimeUnit.MILLISECONDS); | ||
| } | ||
| } |
There was a problem hiding this comment.
FDv1 synchronizer starts polling immediately in constructor
Medium Severity
FDv1PollingSynchronizer starts its scheduled polling task in the constructor with initialDelayMillis=0. The factory in FDv1PollingSynchronizerBuilderImpl.build() passes 0 for initialDelayMillis. Since SourceManager.getNextAvailableSynchronizerAndSetActive() calls factory.build() which constructs this synchronizer, polling begins immediately upon construction — potentially before the caller has called next() to consume results. If the first poll completes very quickly, the result goes into resultQueue which is fine, but if a terminal error occurs before next() is ever called, shutdownFuture is set and the scheduled task cancelled. Subsequent next() calls will return the terminal result, which is correct. However, this eager-start pattern means the synchronizer fires off network requests before the FDv2DataSource run loop is ready to consume them, which could cause unexpected behavior if the executor thread pool is constrained.
Reviewed by Cursor Bugbot for commit 16b9f8f. Configure here.
There was a problem hiding this comment.
This was our design choice. The FDv2 polling synchronizer behaves the same way.
|
@tanderson-ld The rest of the open Bugbot issues are either not actual issues, or they're comments about things that we did by design. This looks good to me 👍 |


The Goal
When the LaunchDarkly backend determines that a client should not be using FDv2, it signals this by including an
x-ld-fd-fallback: trueheader in HTTP responses. This PR teaches the Android SDK to detect that header and seamlessly switch from FDv2 data sources to an FDv1 polling fallback, so the SDK continues to receive flag data without interruption.JIRA ticket: https://launchdarkly.atlassian.net/browse/SDK-1829
The Approach
The implementation follows a layered approach, working from the bottom of the stack upward:
Data model — The result types that flow between components gained an
fdv1Fallbackboolean, so any component that receives a server response can carry the fallback signal forward without losing it. This mirrors the js-core approach, where the fallback signal is carried on the data source result type rather than handled as a side channel.Header detection — The polling requestor and streaming synchronizer both read the
x-ld-fd-fallbackheader and attach it to every result they produce — success, error, or otherwise. This means the signal is never silently dropped. Both js-core and java-core detect the header at the transport layer; the Android implementation follows the same pattern.Fallback synchronizer — A new standalone FDv1 polling synchronizer fetches flag data from the legacy
/sdk/evalxendpoints and produces results in the FDv2 format. It's intentionally decoupled from the existingPollingDataSource/HttpFeatureFlagFetcherstack because those are tightly coupled to FDv1'sDataSourceUpdateSink. This matches js-core's approach of implementing a dedicated FDv1 polling data source rather than reusing the existing one.Orchestration — The FDv2 data source's run loop checks for the fallback signal after processing each result. When detected, it blocks all FDv2 synchronizers, unblocks the FDv1 slot, and lets the normal synchronizer selection logic pick it up. No special "fallback mode" — it reuses the existing synchronizer cycling mechanism. The
SourceManager.fdv1Fallback()implementation follows js-core's pattern of resetting the synchronizer index to -1 so the next scan starts from the beginning of the list.Fundamental Changes
FDv2SourceResultandFDv2PayloadResponsenow carry a fallback indicator alongside their existing data. All factory methods retain backward-compatible defaults.ModeDefinitionsupports a separate list of FDv1 fallback synchronizer factories, kept distinct from regular synchronizers so they can be independently blocked/unblocked.The FDv2DataSource constructor accepts an optional list of FDv1 fallback synchronizers. It handles wrapping them as blocked entries internally, so callers just pass plain factory lists.
SourceManager.fdv1Fallback()is now a real implementation (previously a stub). It synchronizes on the active source lock to prevent races with concurrent synchronizer selection.The default mode table configures an FDv1 polling fallback for STREAMING, POLLING, and BACKGROUND modes. OFFLINE and ONE_SHOT don't have active synchronizers, so they have no fallback slot.
Some Things That Aren't In Scope
Requirements
Related issues
Provide links to any issues in this repository or elsewhere relating to this pull request.
Describe the solution you've provided
Provide a clear and concise description of what you expect to happen.
Describe alternatives you've considered
Provide a clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context about the pull request here.
Note
Medium Risk
Introduces new cross-cutting fallback behavior in the flag update pipeline (header propagation, mode tables, and synchronizer orchestration), which could affect initialization/state transitions and update delivery if mishandled.
Overview
Adds support for automatic FDv2→FDv1 fallback when the backend returns
x-ld-fd-fallback: true, switching the active synchronizer to an FDv1 polling implementation.This threads an
fdv1Fallbacksignal throughFDv2PayloadResponse/FDv2SourceResult, teaches both polling (DefaultFDv2Requestor/FDv2PollingBase) and streaming (FDv2StreamingSynchronizer) paths to detect the header, and updatesFDv2DataSource/SourceManagerto honor it by blocking FDv2 synchronizers and activating a dedicatedFDv1PollingSynchronizer.It also extends
ModeDefinition/ResolvedModeDefinitionand the default mode table (DataSystemComponents.makeDefaultModeTable/DataSystemBuilder) to include an optional FDv1 fallback slot for STREAMING/POLLING/BACKGROUND, plumbs cache dir viaDataSourceBuildInputs, moves poll interval defaults toLDConfig, and adds unit/contract-test coverage for header detection and fallback switching.Reviewed by Cursor Bugbot for commit 16b9f8f. Bugbot is set up for automated code reviews on this repo. Configure here.