Skip to content

refactor: implement fdv2 polling initializer / synchronizer#519

Open
beekld wants to merge 27 commits intomainfrom
beeklimt/SDK-2097
Open

refactor: implement fdv2 polling initializer / synchronizer#519
beekld wants to merge 27 commits intomainfrom
beeklimt/SDK-2097

Conversation

@beekld
Copy link
Copy Markdown
Contributor

@beekld beekld commented Apr 3, 2026

This implements the C++ polling initializer and synchronizer.

The bulk of this code was generated by Claude based on the Java version, but I've gone through it line-by-line, and I think it makes sense. But I'm new to both FDv2 and ASIO, so I could be missing something.

Probably the most controversial part is the decision from the previous PR to use std::future and a blocking call. If we decide we need Java-like conditions, or if we need the callers to be non-blocking, we could change these to use asio::deferred instead. But I don't think these changes require that yet.


Note

Medium Risk
Introduces new FDv2 polling control flow and a new protocol state machine, including blocking waits and cross-thread cancellation/timer coordination, which can affect correctness and shutdown behavior under load. Networking change to AsioRequester::Request is low-risk but touches async initiation semantics used broadly.

Overview
Adds a shared FDv2ProtocolHandler state machine that consumes FDv2 wire events (server-intent, put-object, delete-object, payload-transferred, error, goodbye) and emits either a completed FDv2ChangeSet, a typed error, or Goodbye; includes extensive unit tests covering ordering, unknown kinds, and error/reset behavior.

Implements FDv2 polling for the server SDK: a request/response parser (HandleFDv2PollResponse) plus a blocking one-shot FDv2PollingInitializer and a blocking, interval-driven FDv2PollingSynchronizer that runs HTTP+timeout via ASIO parallel_group, supports Close() cancellation, enforces a minimum poll interval, and wires new sources into the build.

Refactors network::AsioRequester::Request to use boost::asio::async_initiate for completion-token based initiation, simplifying handler/lifetime handling while keeping existing InnerRequest behavior.

Reviewed by Cursor Bugbot for commit c8e736f. Bugbot is set up for automated code reviews on this repo. Configure here.

@beekld beekld requested a review from a team as a code owner April 3, 2026 23:01
std::move(**result)}};
}
// Silently skip unknown kinds for forward-compatibility.
return std::nullopt;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ends up with maybe a slightly different layer responsibility, but I think it is likely fine. I think Java returns it, and then it is discarded a layer up, but we aren't acting on it, so it doesn't really make a difference.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I am a bit concerned that we may be losing granularity of data source reporting. Though maybe we don't have that concern at the moment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what exactly you're asking here. The Java version seems to have a distinction between parsing and "translation" that isn't there in C++. This function doesn't return raw JSON, so there's nothing to be returned in this case.

I believe, technically, our spec says that this case should silently ignore the uknown kind, whereas the Java implementation logs a warning. I followed the spec, but I could add a warning here, if you'd prefer.

This is the only branch that returns std::nullopt, so callers could still technically distinguish this case, but I'm not sure what else we would do about it.

FDv2ProtocolHandler::Result FDv2ProtocolHandler::HandleEvent(
std::string_view event_type,
boost::json::value const& data) {
if (event_type == kServerIntent) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to have this handle which event type it is in a more compact way, like a switch, and then handle the message is functions. Just makes it easier to see at a glance what is handled independent of how it is handled.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. This function is way too long. I refactored it.

tl::expected<std::optional<DeleteObject>, JsonError>>(data);
if (!result) {
Reset();
return FDv2Error{std::nullopt, "could not deserialize delete-object"};
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do need differentiation for malformed data errors, as we typically will restart on those errors versus just potentially logging another error type.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored error handling to keep a type and more context. I think that should give us what we need higher up.

promise->set_value(
FDv2SourceResult{FDv2SourceResult::Timeout{}});
}
}));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parallel group timeout cannot enforce deadline on HTTP request

Medium Severity

make_parallel_group with wait_for_one() only invokes the completion handler after all operations complete, not just the first. Since AsioRequester::Request wraps the handler in a std::function (via InnerRequest), the ASIO cancellation slot is type-erased and the HTTP request cannot be cancelled by the parallel group. When the timeout timer fires first, the promise is not set until the HTTP request also finishes on its own. This means the timeout parameter in Next() and Close() cannot promptly unblock a caller when an HTTP request is in flight, violating the IFDv2Synchronizer contract.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 01cc26b. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omg, ASIO, why are you like this? 😭

Let me consider this.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bizarre design choice by ASIO, poorly documented, and really hard to fix.

AFAICT, this is true. wait_for_one means that when one of the tasks completes, the others are sent a cancellation signal, but the callback isn't actually called until they all stop. But it's not trivial to handle cancellation for the http request, because of the way we use Foxy http. I think we would have to add a Cancel method to FoxyClient and then hook up all the plumbing to make it get called on cancellation.

And I don't see any other reasonable way to race futures in C++17. I guess I could create a wrapper around a std::promise that enforces that only the first caller gets to complete it? Because I can't block on multiple futures without occupying more threads. std::future::then is only available in experimental concurrency extensions.

So tomorrow, I will try two things:

  1. See if Foxy supports timeouts directly, and just use that instead of treating it as a concurrency issue, or
  2. Implement a std::promise wrapper that lets two tasks race to complete it.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit c8e736f. Configure here.

auto const& intent = **result;
if (intent.payloads.empty()) {
return std::monostate{};
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty payloads server-intent doesn't reset accumulated state

Medium Severity

When intent.payloads is empty, HandleServerIntent returns std::monostate{} without clearing changes_ or resetting state_. If the handler is mid-transfer (kFull or kPartial), a server-intent with empty payloads leaves stale state intact. Subsequent put/delete events continue accumulating into the old transfer, and payload-transferred produces a changeset mixing old and new data. Other unexpected intent branches (e.g., kNone/kUnknown) correctly reset state_ and clear changes_, making this path inconsistent. The class doc says it is shared between polling and streaming synchronizers.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit c8e736f. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants