Skip to content

404 Add branching support#476

Open
arthanson wants to merge 79 commits into
featurefrom
404-branching
Open

404 Add branching support#476
arthanson wants to merge 79 commits into
featurefrom
404-branching

Conversation

@arthanson
Copy link
Copy Markdown
Contributor

@arthanson arthanson commented Apr 20, 2026

Fixes: #404

Adds native NetBox Branching support to the netbox-custom-objects plugin. Custom object types, fields, and instances can now be created, modified, and deleted inside a branch and then merged, synced, or reverted like any other model - the previous exempt_models workaround in PLUGINS_CONFIG and the branch-warning UI banner are no longer required.

Note: A separate PR will be added to add in a check framework to check for minimum versions of NetBox and branching if branching is installed. This isn't added here as it would break any tests running until those versions are released.

Key Changes

Schema-aware DDL routing

  • New _get_schema_connection() helper routes schema-editor operations (add/remove/alter column, create/drop tables) to the active branch's PostgreSQL schema instead of always targeting main.

Branch-aware model cache

  • CustomObjectType._model_cache is now keyed by (cot_id, branch_id) so main and each branch get their own generated class. Only main's class is registered in Django's app registry, so content_type.model_class() consistently resolves to a class matching main's schema even when called inside an active-branch context.

Deserialization hooks for branch merge/revert

  • deserialize_object classmethods on CustomObject, CustomObjectType, and CustomObjectTypeField so netbox-branching's ObjectChange.apply() runs the full save lifecycle (model generation, schema editor, signals) instead of bypassing it via save_base(raw=True).

Squash-merge ordering fix

  • ContextVar-keyed deferred CO field data: when a squash merge applies a CO CREATE before the field's CREATE, values are stashed and applied via raw UPDATE after each column is added.
  • Pre/post merge/sync/revert signals reset the ContextVar at every operation boundary so leftover state from a failed prior op cannot leak forward.

M2M change visibility

  • Custom add/remove/clear/set on the M2M descriptor now emit m2m_changed, and the through model is flagged auto_created so Django's serializer includes M2M values in ObjectChange.postchange_data. Without these, a merge or sync replays zero through-table rows for CustomObject M2M fields.

Conflict resolution

  • _schema_alter_field detects divergent renames (same field PK renamed differently in main and branch), looks up the live column name in the target schema, and renames it to converge on the merge target.
  • _rename_objectchange_field_key updates JSON keys in core_objectchange and netbox_branching_changediff so historical audit data stays consistent post-rename.

Polymorphic fields

  • Polymorphic OBJECT/MULTIOBJECT now serialize a POLY_M2M_SIDECAR_KEY sidecar (field name + PK) so squash's dependency graph can order field CREATE before CO CREATE; their content_type references use db_constraint=False (plain PositiveBigIntegerField on the through) to avoid cross-connection django_content_type lock contention; PolymorphicManyToManyManager fires m2m_changed so change-logging captures values, and deserialize_object replays the sidecar via raw INSERT into a per-(cot, branch) through model.

Branching extension hooks

  • PluginConfig.serializer_resolver resolves dynamically-generated CO serializers (which have no importable path under the conventional {app_label}.api.serializers.{Model}Serializer).
  • branching.objectchange_field_migrator (registered via register_objectchange_field_migrator) translates stale field-name keys in stored ObjectChange data so replays survive field renames. Shares CustomObject.resolve_field_aliases with deserialize_object for CREATE replay.
  • branching.supports_branching_resolver (registered via register_branching_resolver) marks dynamically-generated CO M2M through-models as branchable so their queries are routed to the active branch's connection.
  • branching.add_custom_object_dependencies (connected to squash_dependency_graph_built signal) extends squash's dependency graph with CO-specific edges (local M2M target lists, CustomObjectTypeField PKs from _field_objects, and polymorphic POLY_M2M_SIDECAR_KEY entries) so squash applies field CREATEs before CO CREATEs and orders self-referential M2M targets correctly.

Other

  • Datetime filters now make naive values timezone-aware before comparison.
  • cache_timestamp is excluded from ObjectChange serialization.
  • snapshot() is called before saves that bump cache_timestamp so change logging records a correct pre-state.
  • pre_delete handler reconnect moved into a try/finally flow so it's always restored.
  • The previous is_in_branch API/UI gates and branch-bypass warning banners are removed.

Tests

New tests/test_branching.py (~1,500 lines) covering:

  • Simple and all-field-types merge + revert
  • Object create/modify/delete in branch
  • Field rename, unique toggle, non-schema attribute changes
  • Sync (main→branch) with/without concurrent edits
  • Concurrent field renames (conflict resolution)
  • Sequential A→B→C rename chains with CO updates at each step

Merge/revert and rename-chain tests run under both iterative and squash strategies via parameterized base classes (BaseBranchingTests, BaseConcurrentEditMergeTests, SequentialRenameTestCase / SequentialRenameSquashTestCase).

Test infrastructure additions in tests/base.py: stale-model purge, ContentType recreation, and ObjectChange cleanup between TransactionTestCase runs.

@netboxlabs netboxlabs deleted a comment from claude Bot Apr 21, 2026
@netboxlabs netboxlabs deleted a comment from github-actions Bot May 22, 2026
@netboxlabs netboxlabs deleted a comment from github-actions Bot May 22, 2026
@arthanson
Copy link
Copy Markdown
Contributor Author

@claude /review

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 22, 2026

Claude finished @arthanson's task in 3m 42s —— View job


PR Review: 404 Add branching support

Tasks:

  • Gather context and understand PR scope
  • Review core architecture changes (models.py, __init__.py)
  • Review branching module (branching.py)
  • Review field types changes (field_types.py)
  • Review API changes
  • Review test coverage (test_branching.py, base.py)
  • Review template changes
  • Compile and post final review

Overview

This is a large, well-scoped PR that implements native netbox-branching support for dynamically-generated custom object types. The core challenge — making runtime Django models coexist with branching's multi-schema PostgreSQL architecture — is solved with several interlocking mechanisms. The implementation is thoughtful and well-documented. Below is a detailed breakdown.


Architecture Assessment

Schema-aware DDL routing (_get_schema_connection) — Clean and correct. A single helper that returns the branch connection when in a branch context, falling back to the default connection. Consistently used across all DDL paths (create_model, add_field, remove_field, alter_field, FK constraint creation). Good centralisation.

Branch-aware model cache — Keying _model_cache on (cot_id, branch_id) and only registering main's class in apps.all_models is the right call. It preserves the invariant that content_type.model_class() always returns a class whose column set matches the main schema, even when called inside a branch context. _restore_main_through_registration correctly re-stamps the canonical through model after Django's metaclass auto-registers a branch-side one.

_deferred_co_field_data ContextVar for squash ordering — A ContextVar (not a class-level dict) is the right choice here: it isolates concurrent merges in different threads/coroutines. The reset at every merge/sync/revert boundary (pre- and post-signals) is an important defence against left-over state from a failed operation.

deserialize_object hooks — Correct pattern for bypassing save_base(raw=True) during branch merge/revert. The three implementations (CustomObject, CustomObjectType, CustomObjectTypeField) all call the real save() lifecycle, ensuring model creation/column addition/FK constraint wiring runs during replay.


Issues

1. TOCTOU race in get_model() cache check (models.py:1432-1448)

The outer lock is acquired, the cache is checked, and then released before the model is generated and re-cached. A second thread entering get_model() for the same (cot_id, branch_id) after the lock is released will see is_model_cached() = False (since clear_model_cache() was just called on line 1447) and regenerate the model in parallel. This is safe in the sense that both threads produce equivalent classes, but it means two concurrent threads can both go through the expensive generate_model() + apps.register_model() path. Not a correctness bug, but worth documenting or noting that the lock design is "optimistic re-generation is acceptable."

2. Deviation from Django's m2m_changed contract for post_clear (field_types.py)

Both MultiObjectFieldType._M2MDescriptor.clear() and PolymorphicManyToManyManager.clear() fire post_clear with pk_set=existing instead of pk_set=None. The comment acknowledges this and justifies it (netbox-branching needs the cleared PKs). This is reasonable for the use case, but any third-party code or future Django version that receives the m2m_changed signal for post_clear and checks if pk_set is None: … to detect a "full clear" will get unexpected behaviour. Fix this → — at minimum, document the deviation in the class docstring rather than only in inline comments.

3. _LookupError on active_branch.get() is unnecessary (models.py:1052-1054)

try:
    branch = active_branch.get()
except LookupError:
    return None

ContextVar.get() with a default never raises LookupError; only the no-default form raises it. active_branch from netbox-branching always has a default (None). This silent swallow could mask a future upstream change where active_branch stops having a default. Fix this →

4. _taggable_manager_patch_lock serialises all model generation (models.py:1496-1514)

The lock around TM.post_through_setup is correct for correctness (class-level global state mutation must be serialised). However, it serialises all concurrent model generations globally — including unrelated COT generations in different branch contexts. On a busy server restarting with many COTs, or during a squash merge generating many models, this becomes a serialisation bottleneck. Consider whether taggit's class-level state mutation needs a process-wide lock, or if a narrower scope could work.

5. Temporary models in _rename_or_create_m2m_through registered in apps.all_models (models.py:344-351)

old_through_model = generate_model(
    f'_TempOldThrough_{old_through}',
    (models.Model,),
    {...},
)
schema_editor.alter_db_table(old_through_model, old_through, new_through)

generate_model() registers the class in apps.all_models. These _TempOldThrough_* classes are never cleaned up afterwards. Across many renames, this leaks model registrations. Either unregister after use or bypass the app registry for temporary models. Fix this →

6. _apply_poly_m2m_rows catches all ValueError/ContentType.DoesNotExist silently (models.py:116-122)

When a content_type label can't be split or the CT doesn't exist, the row is skipped with a logger.debug and no return value. This means a polymorphic M2M replay that has all rows orphaned (e.g. the branch's CT got stale) silently succeeds — no rows inserted, no error. This could leave a CO in an inconsistent state after merge without any visible signal. At least a logger.warning would help diagnose replay gaps.

7. CustomObjectType.delete() queries self.get_model() after clear_model_cache(all_branches=True) (models.py:1722-1725)

self.clear_model_cache(self.id, all_branches=True)
model = self.get_model()

clear_model_cache(all_branches=True) evicts all entries for this COT, then get_model() immediately regenerates. This is consistent, but the comment could explain why eviction-then-regenerate is needed here (answer: to get a fresh model with the current branch context's column set for the subsequent DDL drop).


Minor Notes

  • _build_rename_map is efficient — one batched ObjectChange query for all unknown attrs, then one cot.fields query for the resolved PKs. Good.
  • filtersets.py datetime fix — the make_aware(parsed) for naive datetime values in the search filter is a genuine correctness fix for timezone-aware databases.
  • _SchemaAwareDeserialized inner classes — defining a new class on every deserialize_object call is a minor inefficiency. Could be factored out as a module-level class. Not blocking.
  • snapshot() before cache_timestamp bump — correctly ensures change logging records a proper pre-state before the auto_now update.
  • serialize_object excludes cache_timestamp — correct; this internal-only field has no business in ObjectChange records.
  • Branch bypass warning templates deleted — correct cleanup; the workaround is now unnecessary.

Test Coverage

test_branching.py (~2,793 lines) is comprehensive:

  • Both merge strategies (iterative + squash) via parameterised base classes
  • All field types in a single merge/revert cycle
  • Field rename, unique toggle, non-schema attribute changes
  • Sync with concurrent edits (conflict resolution)
  • Sequential A→B→C rename chains with CO data at each step
  • Polymorphic M2M replay

One concern: the _provision_branch polling loop uses time.sleep(0.1) with a 30-second timeout. This could be flaky in slow CI workers. The NETBOX_CO_BRANCH_PROVISION_TIMEOUT env var is a good mitigation. The test file also correctly guards all classes with @unittest.skipUnless(HAS_BRANCHING, ...) so the suite stays clean without branching installed.

tests/base.py additions (_purge_stale_generated_models, _drop_dynamic_tables, _reset_netbox_request_context) are well-reasoned: they address the specific interaction between TransactionTestCase database flushes and the plugin's in-process app registry state.


Summary

This is a high-quality, well-architected PR. The problems it solves (DDL schema routing, branch-aware model caching, squash-merge ordering, conflict resolution for field renames) are all genuinely hard, and the solutions are sound. The critical issues above are mostly edge-case correctness concerns or operational hygiene (model registry leaks, silent data loss on orphaned CT rows) rather than fundamental design flaws.

Recommend: Address items 2 (document post_clear deviation in class docstring), 5 (clean up _TempOldThrough_* model registrations), and 6 (upgrade silent debug log to warning for orphaned poly-M2M rows) before merge. Items 1, 3, 4 are lower priority.

@arthanson
Copy link
Copy Markdown
Contributor Author

All claude concerns addressed.

@arthanson arthanson marked this pull request as ready for review May 29, 2026 23:37
@arthanson arthanson requested a review from bctiemann May 29, 2026 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant