Skip to content

fix(server): align cache event actions in legacy EventHub path#3017

Open
dpol1 wants to merge 3 commits intoapache:masterfrom
dpol1:fix/3012-eventhub-cache-actions
Open

fix(server): align cache event actions in legacy EventHub path#3017
dpol1 wants to merge 3 commits intoapache:masterfrom
dpol1:fix/3012-eventhub-cache-actions

Conversation

@dpol1
Copy link
Copy Markdown
Contributor

@dpol1 dpol1 commented May 8, 2026

Purpose of the PR

Cache producers in the legacy EventHub path were emitting past-tense action
names (invalided / cleared) while local cache listeners only matched the
present-tense actions (invalid / clear). On single-node deployments, or
whenever the cache RPC bridge in StandardHugeGraph.AbstractCacheNotifier was
not active, those local cache invalidation events were silently dropped: no
error, no warning, just stale cache.

The fix aligns producers, listeners, and the bridge on the present-tense
constants, removes the obsolete past-tense ones, and wires notifyExcept(...)
so the bridge does not loop back into itself once both ends speak the same
action name.

Main Changes

  • Drop Cache.ACTION_INVALIDED / Cache.ACTION_CLEARED. All producers
    (CachedGraphTransaction, CachedSchemaTransaction) now emit
    Cache.ACTION_INVALID / Cache.ACTION_CLEAR, matching what local listeners
    and RaftContext already expect.
  • Update StandardHugeGraph.AbstractCacheNotifier to listen on the
    present-tense actions and to forward through EventHub.notifyExcept(...) so
    the bridge does not re-receive its own re-emit.
  • Add EventHub.notifyExcept(event, ignoredListener, args...) so a caller can
    notify everyone on the hub except the listener that originated the event.
    The signature is documented as a public API on EventHub, notify(...)
    is unchanged and now delegates internally.
  • Use notifyExcept(...) from the cache transactions and from
    AbstractCacheNotifier, with the registered cache listener as the ignored
    one, so producers do not feed their own listener and the RPC re-emit does
    not loop the bridge.
  • Track the registered cache listener per graph in
    GraphTransaction.graphCacheEventListeners and in a matching map inside
    CachedSchemaTransaction, with owner-aware teardown on close. This is the
    minimum needed for notifyExcept(...) to exclude the listener that is
    actually registered on the hub: with the old per-instance lambda +
    containsListener(...) / boolean guard pattern, only the first transaction
    per graph registered a listener, while later transactions could call
    notifyExcept(...) with a lambda that was never registered. In that case the
    real registered listener was not excluded and could still process the
    transaction's own local emit.

CachedSchemaTransactionV2 is intentionally not touched here. That class was
just adjusted in #3011 for HStore/V2 schema cache propagation through
MetaManager. The remaining containsListener(...) guard there is a related
listener-lifecycle concern, not the action-name mismatch fixed in this PR; it
should be handled in a follow-up that audits the remaining V2 path and, if
needed, moves cache listener ownership to a ref-counted holder keyed by graph
identity and EventHub.

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
    • EventHubTest#testNotifyExcept covers the new notifyExcept(...)
      semantics, including ANY_EVENT listeners and the ignored-listener
      exclusion.
    • CachedSchemaTransactionTest#testLegacySchemaChangeEmitsActionInvalid
      asserts the schema producer now emits ACTION_INVALID end-to-end.
    • CachedSchemaTransactionTest#testCacheListenerSurvivesOwnerClose
      asserts that closing the owner transaction does not drop the hub
      listener while a second transaction is still alive, and that cache
      invalidation events are still delivered to the surviving transaction.
    • CachedSchemaTransactionTest#testLastCloseRemovesSchemaCacheListener
      asserts that the hub listener is unregistered and the registry entry
      removed only when the last transaction for that graph closes, and that
      a subsequent graph open registers a fresh holder.
    • CachedGraphTransactionTest#testCacheListenerSurvivesOwnerClose
      same owner-first-close guarantee for the graph vertex/edge cache path.
    • CachedGraphTransactionTest#testLastCloseRemovesGraphCacheListener
      same last-close cleanup guarantee for the graph cache path.
    • CachedSchemaTransactionTest#testCacheNotifierLocalEventCallsProxyOnce
      and testCacheNotifierRpcInvalidDoesNotLoopToProxy cover the bridge
      no-loop behavior.
    • CachedGraphTransactionTest#testClearCacheEmitsActionClear asserts
      graph cache clears now emit ACTION_CLEAR.
    • CachedGraphTransactionTest#testVertexMutationEmitsActionInvalid
      asserts graph vertex mutations now emit ACTION_INVALID.
    • CachedGraphTransactionTest#testClosingNonOwnerKeepsGraphCacheListenerRegistered
      asserts that closing a graph transaction which reused an existing
      per-graph listener does not unregister the shared graph cache listener.

Does this PR potentially affect the following parts?

  • Dependencies
  • Modify configurations
  • The public API
  • Other affects
  • Nope

EventHub.notifyExcept(...) is a new public method; existing notify(...)
behavior is unchanged.

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

- Align legacy cache invalidation producers and listeners on ACTION_INVALID and
  ACTION_CLEAR, removing the obsolete ACTION_INVALIDED/ACTION_CLEARED constants.
- Add EventHub.notifyExcept(...) so cache transactions and the cache notifier
  bridge can avoid re-processing their own local listener while still delivering
  events to other listeners.
- Track registered graph/schema cache listeners per graph so notifyExcept(...)
  uses the listener instance actually registered on the EventHub, including
  multi-transaction cases where later transactions reuse the first listener.
- Update cache notifier forwarding to prevent local RPC bridge loops after action
  names are unified.
- Add regression coverage for notifyExcept semantics, graph/schema action names,
  listener teardown/re-registration, and notifier no-loop behavior.
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working labels May 8, 2026
@dpol1 dpol1 changed the title fix(server): align legacy cache event actions fix(server): align cache event actions in legacy EventHub path May 8, 2026
@dpol1
Copy link
Copy Markdown
Contributor Author

dpol1 commented May 8, 2026

@imbajin I took the liberty to quickly fix a flaky test which failed a workflow twice - if you think is out of the scope with this PR I' ll revert it - Actually, I just noticed the first commit is all greens

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a legacy cache invalidation bug where cache event producers emitted past-tense action names (invalided/cleared) that didn’t match the actions local listeners (and some other paths) were expecting (invalid/clear), causing silent drops and stale caches in certain deployments. It also introduces EventHub.notifyExcept(...) to prevent cache-notification bridges from re-consuming their own re-emitted events.

Changes:

  • Remove legacy Cache.ACTION_INVALIDED / Cache.ACTION_CLEARED and align all cache event producers/listeners to Cache.ACTION_INVALID / Cache.ACTION_CLEAR.
  • Add EventHub.notifyExcept(event, ignoredListener, args...) and route notify(...) through the same internal implementation.
  • Track per-graph cache listeners in static registries and use notifyExcept(...) to avoid self-delivery/bridge loops; add/adjust unit tests for the new semantics.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/cache/Cache.java Removes past-tense cache action constants to enforce a single action vocabulary.
hugegraph-commons/hugegraph-common/src/main/java/org/apache/hugegraph/event/EventHub.java Adds notifyExcept(...) and implements ignored-listener filtering in the notifier loop.
hugegraph-commons/hugegraph-common/src/test/java/org/apache/hugegraph/unit/event/EventHubTest.java Adds coverage for notifyExcept(...) (including ANY_EVENT listeners).
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/StandardHugeGraph.java Updates legacy notifier bridge to present-tense actions and uses notifyExcept(...) to avoid loops.
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/tx/GraphTransaction.java Replaces the boolean guard with a per-graph cache listener registry map.
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/cache/CachedSchemaTransaction.java Emits present-tense actions and uses per-graph listener registry + notifyExcept(...) on emits.
hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/cache/CachedGraphTransaction.java Emits present-tense actions and uses per-graph listener registry + notifyExcept(...) on emits.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CachedSchemaTransactionTest.java Adds tests validating action name alignment, listener registry lifecycle, and bridge no-loop behavior.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CachedGraphTransactionTest.java Adds tests validating action name alignment and “non-owner close” listener behavior.
hugegraph-server/hugegraph-test/src/main/java/org/apache/hugegraph/unit/cache/CacheManagerTest.java Speeds up expiration tick test via reflective rescheduling and cancels the ad-hoc task.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix looks correct and the new tests are solid. Two concerns worth tracking: (1) storeEventListenStatus has the same ownership bug this PR fixes for the cache listener, acknowledged in the test's finally block — worth a follow-up; (2) the owner-first-close scenario for the cache listener deserves an explicit test and/or a ref-counted ownership model.

@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels May 9, 2026
@dpol1 dpol1 requested a review from imbajin May 9, 2026 10:36
- Replace the per-transaction cache listener owner flag with a per-graph
  CacheListenerHolder shared by graph and schema cache transactions.

- The holder keeps the EventHub listener registered while any transaction for the
  graph is alive, and unregisters/removes it only when the last transaction
  releases it. The registry update, ref-count decrement, and hub unlisten now run
  inside ConcurrentMap.compute() to avoid owner-closes-first invalidation gaps.

  Also add graph/schema regression coverage for owner-first close and last-close
  cleanup, including graph close/reopen handling for stale EventHub holders.
@dpol1 dpol1 force-pushed the fix/3012-eventhub-cache-actions branch from b77fb66 to 80bd29c Compare May 9, 2026 10:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] EventHub action name mismatch silently drops cache invalidation events in legacy path

3 participants