[RFC] Move from centralized discovery to entity-driven discovery #615

puddly · 2026-01-06T19:51:56Z

As part of a medium-term goal of getting rid of cluster handlers and eventually rewriting the entity classes themselves to request attribute reporting config, bindings, and attribute values, I think a first stepping stone would be to move away from implicit entity registration decorators and have entity objects decide for themselves which cluster handlers they want (or whether they apply to an endpoint at all).

This is a first draft and has not been runtime tested, but I have most device entity tests passing. The few that do not may actually be existing ZHA bugs that this PR accidentally fixes.

Basically, instead of this:

@STRICT_MATCH(
    cluster_handler_names=CLUSTER_HANDLER_ON_OFF,
    aux_cluster_handlers={CLUSTER_HANDLER_COLOR, CLUSTER_HANDLER_LEVEL},
)
class Light(BaseClusterHandlerLight, PlatformEntity):
    ...

We do this:

@register_entity
class Light(BaseClusterHandlerLight, PlatformEntity):
    ...
    @classmethod
    def match_cluster_handlers(cls, endpoint: Endpoint) -> ClusterHandlerMatch | None:
        """Match cluster handlers for this entity."""
        # Only create an on/off light if the device type is correct
        if (
            endpoint.zigpy_endpoint.profile_id,
            endpoint.zigpy_endpoint.device_type,
        ) not in {
            # ZHA
            (zha.PROFILE_ID, zha.DeviceType.COLOR_DIMMABLE_LIGHT),
            ...
        }:
            return None

        return ClusterHandlerMatch(
            cluster_handlers=frozenset({CLUSTER_HANDLER_ON_OFF}),
            optional_cluster_handlers=frozenset(
                {CLUSTER_HANDLER_COLOR, CLUSTER_HANDLER_LEVEL}
            ),
            legacy_discovery_unique_id=f"{endpoint.device.ieee}-{endpoint.id}",
        )

ClusterHandlerMatch has no further complexity. It lists which cluster handlers are required, which are optional, and what unique ID format the entity will use. There are no stop groups or anything else; entity classes are expected to be explicitly coordinated so that they are mutually exclusive.
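To make the shape of this concrete, here is a minimal sketch of how such a match could be resolved against an endpoint. The three field names follow the example above; the `resolve_match` helper and the plain-string handler names are assumptions for illustration, not code from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterHandlerMatch:
    """Stand-in with the three fields described above."""

    cluster_handlers: frozenset[str]
    optional_cluster_handlers: frozenset[str] = frozenset()
    legacy_discovery_unique_id: Optional[str] = None


def resolve_match(
    match: ClusterHandlerMatch, available: frozenset[str]
) -> Optional[frozenset[str]]:
    """Hypothetical helper: return the handlers an entity would use, or None."""
    # Every required handler must exist on the endpoint
    if not match.cluster_handlers <= available:
        return None

    # Optional handlers are picked up only if the endpoint has them
    return match.cluster_handlers | (match.optional_cluster_handlers & available)


# Handler names and the unique ID below are placeholder strings
match = ClusterHandlerMatch(
    cluster_handlers=frozenset({"on_off"}),
    optional_cluster_handlers=frozenset({"color", "level"}),
    legacy_discovery_unique_id="00:11:22:33:44:55:66:77-1",
)

used = resolve_match(match, frozenset({"on_off", "level", "basic"}))
missing = resolve_match(match, frozenset({"level"}))
```

Here `used` contains the required handler plus the one optional handler that is present, while `missing` is `None` because the required handler is absent.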

This has a few benefits:

It moves unique ID calculation logic into the entity itself and makes it 100% explicit, allowing us to more easily migrate.
Allows us to fine-tune cluster handler matching per-entity without adding more matching rules or even relying on a weighting system at all.
Will allow us to move cluster handler ZCL attribute reporting and binding config into the ZCL entity objects themselves, allowing ZHA to granularly merge binding/reporting config when setting up a device.

codecov · 2026-01-21T23:35:31Z

Codecov Report

❌ Patch coverage is 99.26380% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.13%. Comparing base (c6ac89e) to head (ae26b14).

Files with missing lines	Patch %	Lines
zha/application/discovery.py	96.94%	4 Missing ⚠️
zha/application/platforms/cover/__init__.py	93.54%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##              dev     #615      +/-   ##
==========================================
+ Coverage   97.02%   97.13%   +0.11%     
==========================================
  Files          63       62       -1     
  Lines       10573    10671      +98     
==========================================
+ Hits        10258    10365     +107     
+ Misses        315      306       -9

TheJulianJES · 2026-01-22T01:28:07Z

tests/test_button.py

-    # Suppress normal endpoint probing, as this will claim the Opple cluster handler
-    # already due to it being in the "CLUSTER_HANDLER_ONLY_CLUSTERS" registry.
-    # We want to test the handler also gets claimed via quirks v2 attributes init.
-    with patch("zha.application.discovery.EndpointProbe.discover_entities"):
-        zha_device = await join_zigpy_device(zha_gateway, zigpy_device)


Even before this PR, the test would always have passed if this patch was removed. Without it, the test no longer verifies that the quirks v2 entities claim the OppleRemoteClusterHandler: the default ZHA entities always claim it, which is why discovery for those was patched out.

I'll need to have a proper look at this later, but I think these tests essentially test nothing (correctly) now.

The entire unit test could actually be removed, I think. This PR removes the concept of claiming cluster handlers: we claim them to tell ZHA to "use" them, nothing more.

Ok, I'll still need to take a look at this. Previously, claiming them determined setting up attribute reporting and if they're bound to the coordinator or not.

TheJulianJES · 2026-01-22T02:05:37Z

zha/application/platforms/light/__init__.py

+        # No collision with `HueLight`
+        if endpoint.device.manufacturer in {"Philips", "Signify Netherlands B.V."}:
+            return None
+
+        # Or with `MinTransitionLight`
+        if endpoint.device.manufacturer in DEFAULT_MIN_TRANSITION_MANUFACTURERS:
+            return None
+
+        # Or with `ForceOnLight`
+        if endpoint.device.manufacturer in {
+            "Jasco",
+            "Jasco Products",
+            "Quotra-Vision",
+            "eWeLight",
+            "eWeLink",
+        }:
+            return None


This doesn't scale well: we have to repeat ourselves every time we add another entity that is more specific than the base one and is intended to replace it.

The discovery system we had before with the weighting and stop_on_match_group was a bit awkward, but I think it's closer to the new discovery schemas used in Matter (and slowly in Z-Wave).

For reference, see Matter's discovery here and an example of a specific binary sensor replacement here. (The Matter key is used as a unique ID suffix.)
By default, a single Matter entity claims an attribute, adding that to discovered_attributes.
Unless allow_multi=True is set, a second entity will not be discovered for that attribute.

As far as I can see, the Hue sensor replacement only works because it comes first in the list, before the generic/fallback one, and because the default is allow_multi=False. I don't think that's a perfect solution, and I'd have flipped the default, as that is easier to spot in tests. Our weighting system, which sorted the discovered entities by specificity, was actually a nice solution for that IMO.

Maybe we should talk about this a bit. What do you think of introducing something similar to allow_multi but in the opposite direction, e.g. exclusive_claim? This would only be set on the more specific entities (e.g. vendor-specific Hue sensor replacing generic one for the Matter example, or, in our case, ForceOnLight replacing the generic one, for example).

If we need to match in a much more specific way, we can simply do it manually, like done here, but in most cases, we should be able to set exclusive_claim=True, only for the more specific entities.
The more specific entities would automatically get higher priority compared to ones that do not have exclusive_claim set, so these entities would be discovered first. For the exclusive_claim=True entities, the primary attribute (or a key, same for both entities) would be added to a list/set of "claimed attributes/keys".
Later, all normal entities with exclusive_claim=False check whether their attribute/key is in the claimed attributes/keys. If not, they're created normally. If it is, they skip creation.

This would be a much simpler system than our previous weighting system. From a quick look, we should be able to keep all the benefits you mentioned in the PR description while also simplifying and cleaning this up a lot, IMO.
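A rough sketch of that two-pass idea, to make the ordering explicit. Everything here is hypothetical: the `Candidate` type, the `discover` function, and the string keys are illustration only and do not exist in ZHA or Matter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical discovery candidate: an entity name plus the key it claims."""

    name: str
    key: str
    exclusive_claim: bool = False


def discover(candidates: list[Candidate]) -> list[str]:
    """Create exclusive entities first, then generic ones for unclaimed keys."""
    claimed: set[str] = set()
    created: list[str] = []

    # sorted() is stable, so candidates keep their relative order within each group
    ordered = sorted(candidates, key=lambda c: not c.exclusive_claim)

    for candidate in ordered:
        if candidate.exclusive_claim:
            # Assumption: two exclusive entities with the same key are both created
            claimed.add(candidate.key)
        elif candidate.key in claimed:
            # A more specific entity already owns this key
            continue
        created.append(candidate.name)

    return created


created = discover(
    [
        Candidate("Light", key="on_off"),
        Candidate("ForceOnLight", key="on_off", exclusive_claim=True),
        Candidate("Battery", key="power"),
    ]
)
```

With these inputs the generic `Light` is skipped because `ForceOnLight` claimed the same key, and `Battery` is created as usual.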

I completely agree! The PR in its current state is still very much a WIP (I haven't actually run it) and the entity-based discovery needs to be completely rewritten; I was curious to see how far I can get with making all of the matching logic painfully verbose and entirely isolated to individual entities.

In this branch, 99% of all entities are simply created based on the presence of a cluster with no further filtering. The 1% of exceptions are now very explicit:

The OnOff server cluster can be turned into a Light or a Switch based on the endpoint device type.

The OnOff server cluster can also be turned into a Shade if Level and Shade clusters also exist and the device type is correct.

The Fan server cluster is not created if a Thermostat entity is currently claiming it, to prevent creating a duplicate entity for fan functionality that's currently part of the complex thermostat entity.
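As an illustration of the first exception, a device-type check could look roughly like the sketch below. The function name is hypothetical, the set of light device types is deliberately incomplete, and treating every other device type as a switch is an assumption rather than what this branch does.

```python
# Zigbee Home Automation profile ID
ZHA_PROFILE_ID = 0x0104

# (profile_id, device_type) pairs treated as lights; an incomplete, illustrative set
LIGHT_DEVICE_TYPES = frozenset(
    {
        (ZHA_PROFILE_ID, 0x0100),  # On/Off Light
        (ZHA_PROFILE_ID, 0x0101),  # Dimmable Light
        (ZHA_PROFILE_ID, 0x0102),  # Color Dimmable Light
    }
)


def platform_for_on_off(profile_id: int, device_type: int) -> str:
    """Hypothetical helper: pick a platform for an OnOff server cluster."""
    if (profile_id, device_type) in LIGHT_DEVICE_TYPES:
        return "light"

    # Assumption: anything else with an OnOff server cluster becomes a switch
    return "switch"


color_light = platform_for_on_off(ZHA_PROFILE_ID, 0x0102)
wall_switch = platform_for_on_off(ZHA_PROFILE_ID, 0x0000)
```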

I think that's actually all of them. If we migrate all of the device-specific quirks out of the ZHA codebase and into quirks, the ZCL-only functionality becomes super straightforward. Something like:

from zha.platforms.climate import BaseThermostat


class MyWeirdThermostat(BaseThermostat):
    def __init__(self, device) -> None:
        self._device = device

    async def async_turn_on(self) -> None:
        await self._device.send_packet(...)


(
    QuirkBuilder()
    .applies_to("manuf", "model")
    # Prevent ZCL entity auto-discovery
    .prevent_default_entity_creation(endpoint_id=1)
    # And substitute our own
    .exposes_entity(MyWeirdThermostat)
    .add_to_registry()
)

What do you think of introducing something similar to allow_multi but in the opposite direction, e.g. exclusive_claim?

Having an entity prevent future entities from doing something is an API that I'd like to avoid. Once you move the device-specific "quirks" out of core ZHA, we may not even need claiming/exclusivity as part of the entity matching API. We could eventually use static discovery schemas:

class EntityDiscovery:
    # Mandatory clusters
    server_clusters: frozenset[type[Cluster]]
    client_clusters: frozenset[type[Cluster]]

    # Missing clusters
    not_server_clusters: frozenset[type[Cluster]]
    not_client_clusters: frozenset[type[Cluster]]

    # A set of (profile, device_type)s
    device_types: frozenset[tuple[int, int]]

    # Some way to specify attributes to peek at??

    # Optional clusters will be pulled from the endpoint during initialization
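For what it's worth, evaluating a schema like that is just a handful of set operations. The sketch below is an assumption-heavy illustration: it uses integer cluster IDs instead of `type[Cluster]` to stay self-contained, `EndpointInfo` and `schema_matches` are hypothetical names, and treating an empty `device_types` as "any device type" is my own choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDiscovery:
    """Static schema; cluster IDs stand in for cluster classes."""

    server_clusters: frozenset[int] = frozenset()
    client_clusters: frozenset[int] = frozenset()
    not_server_clusters: frozenset[int] = frozenset()
    not_client_clusters: frozenset[int] = frozenset()
    device_types: frozenset[tuple[int, int]] = frozenset()


@dataclass(frozen=True)
class EndpointInfo:
    """Hypothetical snapshot of what an endpoint exposes."""

    profile_id: int
    device_type: int
    server_clusters: frozenset[int]
    client_clusters: frozenset[int]


def schema_matches(schema: EntityDiscovery, endpoint: EndpointInfo) -> bool:
    """Return True if the endpoint satisfies every constraint in the schema."""
    # Mandatory clusters must all be present
    if not schema.server_clusters <= endpoint.server_clusters:
        return False
    if not schema.client_clusters <= endpoint.client_clusters:
        return False

    # Excluded clusters must all be absent
    if schema.not_server_clusters & endpoint.server_clusters:
        return False
    if schema.not_client_clusters & endpoint.client_clusters:
        return False

    # Assumption: an empty device_types set means "any device type"
    if schema.device_types:
        return (endpoint.profile_id, endpoint.device_type) in schema.device_types

    return True


ON_OFF, LEVEL, BASIC = 0x0006, 0x0008, 0x0000

light_schema = EntityDiscovery(
    server_clusters=frozenset({ON_OFF}),
    device_types=frozenset({(0x0104, 0x0102)}),  # ZHA color dimmable light
)

light_endpoint = EndpointInfo(0x0104, 0x0102, frozenset({BASIC, ON_OFF, LEVEL}), frozenset())
switch_endpoint = EndpointInfo(0x0104, 0x0000, frozenset({BASIC, ON_OFF}), frozenset())

light_ok = schema_matches(light_schema, light_endpoint)
switch_ok = schema_matches(light_schema, switch_endpoint)
```

Because the schema is plain data, mutual exclusivity between entity classes could in principle be checked statically in a test rather than at discovery time.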

I've added a feature group weighting system nearly identical to the above. It slightly increases discovery complexity but removes the need for all filtering logic in match_cluster_handlers().

At this point, all but a few def match_cluster_handlers functions are entirely static. The only dynamic logic is to determine the unique_id format based on the endpoint profile and device type, which I think we can get rid of.

puddly · 2026-01-23T03:22:18Z

I'm successfully running this branch locally, as a final test. No entity issues to report.

puddly added 30 commits January 5, 2026 14:41

Move all discovery functions to module level

ee979d0

WIP

3c4a198

WIP

ce7eaa3

WIP

3a89a2a

WIP: unique ID compatibility

c531a87

WIP: Delete registries

afe7824

WIP: more unique_id fixes

2697397

WIP: cleanup

7b3bde1

WIP: Fix OnOff output cluster exceptions

7b20a97

WIP: Minimize entity diff

52c7734

WIP: more OnOff unique ID fixes

4f23c9c

WIP: More light fixes

81b1a51

WIP: More fan fixes

58b9545

Fix device tracker tests

b740670

Remove unnecessary registry tests

3cfa730

Remove broken ThermostatHVACAction entity

8b70a71

Revert "Remove broken ThermostatHVACAction entity"

58bb951

This reverts commit 8b70a71.

Ignore Keen vent when generating switch entities

4ebfe76

Fix Keen vent unique ID

76b0780

Explicit value_attribute`

80f134d

Fixes Add a TODO WIP

Revert "Explicit value_attribute`"

426658b

This reverts commit 80f134d.

Fixes

4fbdae2

Regenerate diagnostics

aae7073

Avoid private imports from zigpy

eac071f

Drop weights

eed902c

Re-add model/manufacturer checks

224defd

Drop unnecessary if model/manufacturer checks

141951c

Fixes

bacc170

Fixes

4185fba

FIXME: Use a separate cluster handler for IAS ACE client clusters

056ad85

puddly and others added 13 commits January 21, 2026 00:41

Merge branch 'dev' into puddly/discovery-cleanup

e192c1b

Drop unnecessary server cluster handler

2d82c83

WIP: fix existing tests

7cac678

Re-implement device overrides

cbefc44

Fix failing alarm control panel test

f0c6eea

Remove unnecessary endpoint_discover_entities

3aad5ce

Fix IAS ACE cluster handler type

6b5c9d9

Exclude old coordinators from entity creation

292ecb9

Regenerate diagnostics to account for changed IAS ACE cluster handler…

cfe0986

… type

Merge matching logic into discover_entities_for_endpoint

efd1ffa

Implement group discovery and get all tests passing

45edbbb

Key ENTITY_REGISTRY lookups by cluster_id for performance

38b3e04

Apply pre-commit auto fixes

478a593

puddly added 4 commits January 21, 2026 18:41

Pre-commit...

5f96ebb

Simplify group discovery

74f92a6

Make match_cluster_handlers abstract

f475cc2

Revert "Make match_cluster_handlers abstract"

6930de6

This reverts commit f475cc2.

TheJulianJES reviewed Jan 22, 2026

puddly added 10 commits January 22, 2026 12:39

Add another TODO

5c6383c

Implement simpler entity filters via feature weight system

3406ec7

Account for platform_override

b27e6dc

Aggregate matches across all clusters

efac057

Migrate climate/fan to new system

57838b3

Remove dynamic logic from device tracker platform

981a4fe

Drop platform_override from match_cluster_handlers args

b60f0a7

Remove Endpoint.unclaimed_cluster_handlers, it is unused

1d45081

Compute legacy_discovery_unique_id within the entity class itself

1622466

Migrate from dynamic match_cluster_handlers to static `_cluster_han…

ae26b14

…dler_match`
