Skip to content

Conversation

@acud
Copy link
Contributor

@acud acud commented Nov 21, 2025

Checklist

  • still some code cosmetics needed
  • add return header of feed version to bzz and feeds tests, and assert outcomes in tests
  • fixed openapi spec

Description

Races both feed versions for backwards-compatibility in the bzz endpoint.
Fixes #5280

There's a couple of issues that took quite a bit of time to trace and so for posterity I will describe them here:

  • Since the new implementation does not have a version bit, a user that has an old feed and wants to migrate it can in theory sign it and upload it. This means we have two different chunks with the same address potentially stored on Swarm. This is undefined behavior and therefore users are discouraged from doing that, as it can create ambiguities as to which version gets retrieved. Since both chunks can theoretically exist in the chunk space, one request can yield one, and another request - the other.
  • When racing the two feed versions (v1 and v2), there's a problem which was demonstrated by a test that exists in the feeds endpoint (but not on the BZZ endpoint). The problem is the following:
    • Feed v1 has payload size 48 or 80 bytes, and it should resolve to a content addressed chunk
    • Feed v2 wraps the chunk, can have arbitrary size
    • Feed v2 can also just point to the content addressed chunk in the same way that v1 does
    • These assumptions were made when thinking the racing mechanism would work, BUT
    • Feed v2 can be 48 or 80 bytes that are data verbatim (point nowhere) - in which case trying to resolve it points to junk, and the request to resolve returns chunk not found
    • This means that for v2 feeds, racing the content may or may not work (as the content might already have been found as it's been wrapped)
    • This greatly implicates the business logic of trying to sort out what is "correct", but essentially the meaning is, that when there's an ambiguity between v1 and v2, we can never assume a "not found", because the content of the chunk can already be served, we just cannot say for sure that it is a problem with fetching the chunk or not, since we cannot disambiguate whether it is data or a chunk address with certainty.
    • The other implication is that with this feature of transparently racing the feed versions, I had to disable the test that tests for a feed with a non existing address to resolve to (TestFeed_Get/legacy_payload_with_non_existing_wrapped_chunk), since the outcome of the test depends on the order of resolution of the goroutines that race the chunks. If v2 gets checked first and only then v1, then the test passes (because v1 returns an error), but if v1 gets checked first and only then v2, the test fails because we can never return an error for v2, as mentioned above. I suggest to disable the test but leaving it there with a comment/reference to this PR.

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

Screenshots (if appropriate):

@acud acud force-pushed the feed-auto-fallback branch from e4c7458 to 4a65577 Compare November 24, 2025 14:53
@bcsorvasi bcsorvasi added this to the v2.7.0 milestone Dec 1, 2025
@acud acud marked this pull request as ready for review December 1, 2025 16:40
@gacevicljubisa
Copy link
Member

gacevicljubisa commented Dec 2, 2025

@acud just that you are aware, feel free to revert this #5163, which is merged on master, but not released yet. Your PR would be cleaner in that sense (because changes in your PR are on top of this). Of course, if you see this fit. This is the commit itself.

@gacevicljubisa
Copy link
Member

This looks great. However, the racing mechanism introduces some trade-offs for feed v2 payloads that happen to be 48 or 80 bytes.

Proposal: Add a dedicated v2 feed endpoint that always uses v2 resolution without any automatic racing:

  • /feeds/{owner}/{topic} - Keep current behavior (auto-resolution with racing for backwards compatibility)
  • /feeds/v2/{owner}/{topic} - New endpoint (always v2, no racing, deterministic behavior)

@acud
Copy link
Contributor Author

acud commented Dec 2, 2025

Thanks @gacevicljubisa.
With regards to v2 payloads that happen to be 48/80 bytes long - yes there's that trade-off. The thing is that with the aforementioned mechanism, v2 should, at least in theory return earlier than the v1 resolution always. It does, however, incur the cost of actually trying to request the v1 non resolvable content always (and that's part of the cost of this approach, but actually it comes from having non-versioned data-structures rather than this solution). As I wrote in the ticket pertaining to this PR - this could indeed be a problem in applications that are feed-lookup-intensive.
With regards to having versioned paths in the API, I don't have a particular objection, but:

  1. If we add /v2/ path, we should also add /v1/ for symmetry
  2. Do we need to update satellite tooling/libraries? I.e. who's going to use it and how?

I think that maybe having a header to instruct the API how to do the lookup might be a bit nicer (and easier to break in the future - API paths are more difficult). But also we could maybe wait to see how people will use this (and thus request the feature which is most desirable).

@gacevicljubisa
Copy link
Member

@acud I agree with you.

Just additional remark feeds already have both paths available:

  • /feeds/{owner}/{topic}
  • /v1/feeds/{owner}/{topic} (via rootPath prefix)

Maybe to check with Cafe137?

@acud
Copy link
Contributor Author

acud commented Dec 2, 2025

/v1/feeds/{owner}/{topic} (via rootPath prefix)

This is a prefix that applies to the entire API so I think it's out of context. For the version to version the feeds endpoint we need it to be hierarchically further "down" the path.

@acud acud force-pushed the feed-auto-fallback branch from a125395 to 909173c Compare December 2, 2025 22:47
@Cafe137
Copy link

Cafe137 commented Dec 3, 2025

  • It is a good idea to have an option to explicitly instruct Bee on which feed resolution strategy to use
  • This development is not a hard requirement for this PR -- it could be done in a separate, follow-up PR
  • I would recommend NOT using a header parameter for this. Header parameters are hard to work with in a browser, and this endpoint can be opened directly in the browser, so a header parameter would impair the accessibility of this option
  • I agree with @acud that the /v1/ API prefix is already reserved for a different purpose, so to have symmetrical endpoints we'd need something like /feeds/v1/{owner}/{topic} and /feeds/v2/{owner}/{topic} (also suggested by @gacevicljubisa in the first related message); /v1/feeds/v1/{owner}/{topic} and /v1/feeds/v2/{owner}/{topic} for the versioned API. Since it looks a bit awkward (at least to me), maybe a query parameter would be cleaner instead (similar to nugaon's prior PR)
  • These are my opinions only, happy to debate

defer cancel()
if !both {
if resolving == "v2" {
return result.ch, resolving, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice we can also have here result.err != nil , right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the problem here, the answer unfortunately is "no". Have a look at the PR description - this issue is documented thoroughly. The problem is the v2 can be either a manifest address OR actual data. This means that you can't "lookup what v2 references" because it may unmarshal as junk in case it is 48/80 bytes long data verbatim which is allowed.

case result := <-other:
if !result.v1 {
// resolving v2
return result.ch, resolving, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, in practice we can also have here result.err != nil , right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

pkg/api/bzz.go Outdated
// should fetch both feed versions. if the length isn't v1
// then we should only try to fetch v2.
var (
v1, v2 chan getWrappedResult
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For future references I think the naming here will be confusing in the future as v1 is also a bool in getWrappedResult struct and used above. Would be nice a small naming refactoring.

@acud
Copy link
Contributor Author

acud commented Dec 3, 2025

Thanks @Cafe137.
I guess that supporting specific versions in the feeds endpoint can be done. I think that it's probably better done in a later PR since it would add more complexity to the API and I rather keep this simple. Maybe we can start with this and I'll add a separate issue dealing with this issue separately. My hunch tells me though, that we might be able to get away with just the auto-resolver, but at least a bit of daily usage could suggest otherwise.

// if it returns an error we send that over the channel, otherwise
// we send the wc chunk back to the caller so that the feed can be
// dereferenced.
_, err = getter.Get(innerCtx, wc.Address())
Copy link
Contributor

@martinconic martinconic Dec 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this redundant, as in GetWrappedChunk we have a getter.Get(...) that returns nil, err if err, or wc, nil ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately no. GetWrappedChunk only calls Get on the legacy payload, not on a v2 type chunk. Also here, notice that we send the error on the channel, but also whether it is a v1 or v2 payload. Because v2 cannot technically error, we must short-circuit the error and return the payload anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Automatic Legacy Feed Fallback

6 participants