@crankynetman crankynetman commented May 15, 2025

This one was a doozie. Expired/deleted entries on one SCRAM instance were never unblocked on other instances, because there is no mechanism for Postgres to tell Django that it needs to re-run anything; only the active Django instance re-runs things. We get around that by calling process_updates and re-processing anything changed recently, but before this change the sync logic only looked at creation time (the `when` field), missing any entries that were removed.

To fix this in a lot less dumb way than before, we now use `django-simple-history` to detect **any** entry changes and then reprocess them to match what the DB says they should be.
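To make the "creation time misses deletions" point concrete, here is a plain-Python sketch of the idea (the names are illustrative, not our actual schema; django-simple-history's real historical rows carry `history_date` and `history_type` fields much like these):

```python
# Sketch: every create/change/delete leaves a history row, so filtering
# history by date catches deletions and expirations that a creation-time
# ("when") filter would miss.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class HistoryRow:
    entry_id: int
    history_date: datetime
    history_type: str  # '+' created, '~' changed, '-' deleted

def changed_entry_ids(history, cutoff):
    """Ids of entries with ANY history activity since cutoff, deletes included."""
    return {h.entry_id for h in history if h.history_date >= cutoff}

now = datetime(2025, 5, 15, 12, 0)
history = [
    HistoryRow(1, now - timedelta(minutes=2), "~"),  # recently changed/expired
    HistoryRow(2, now - timedelta(minutes=3), "-"),  # recently deleted
    HistoryRow(3, now - timedelta(days=1), "+"),     # old creation: ignored
]
ids = changed_entry_ids(history, now - timedelta(minutes=10))
```

A creation-time filter over the same window would only ever see `'+'` rows, so entry 2's deletion would never be picked up.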

There is a lot in this PR, but the main changes are:

  • Split up `process_updates()` into:

    • `get_entries_to_process(cutoff_time)`
      • A reusable function that simply finds all recently modified entries from other instances and returns them so you can Do Stuff® on them.
    • `reprocess_entries(entries)`
      • This actually "Does Stuff" by sending websocket messages to translators. We could eventually move this out and make it generic enough to be used everywhere we Do A Websocket®, but it needs to be a bit more dynamic to support future action types (see note below; not dealing with this now).
    • `_check_for_orphaned_history()`
      • This is a total edge case, but since we never delete entries (our `delete()` override), I wanted to make sure we catch and log any orphaned history objects that don't have a corresponding entry (say someone goes into the admin panel and actually "hard" deletes an entry, but the history rows are still there).
  • Made sure to be efficient with DB calls by using `select_related()`, `.exclude()`, and `values_list()` on most of our django/db queries, so that we rely on SQL to do the filtering (using built-in Django stuff, obviously) instead of iterating on the Python side (which matters in prod, where we have a TON of entries).

  • We used to rely on `msg.msg_type` from `WebSocketMessage` (which is always `translator_add`), but now we set it based on `entry.is_active` when we run the sync from DB to translator. Eventually this needs to be more dynamic to support other action types; not sure how to solve that yet.
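The reshaped flow can be sketched in plain Python (no Django); `Entry`, its field names, and the `translator_remove` action name here are illustrative stand-ins, not the PR's actual models or message types:

```python
# Sketch of the process_updates() split: a reusable "find what changed" half
# and a "Does Stuff" half that derives the message type from DB state instead
# of a stored msg_type (which was always translator_add).
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Entry:
    route: str
    is_active: bool
    last_changed: datetime

def get_entries_to_process(entries, cutoff_time):
    """Find entries modified since cutoff_time (the reusable half)."""
    return [e for e in entries if e.last_changed >= cutoff_time]

def reprocess_entries(entries, send):
    """Notify translators, picking the action from what the DB says now."""
    for e in entries:
        msg_type = "translator_add" if e.is_active else "translator_remove"
        send({"type": msg_type, "route": e.route})

# Usage: one active entry, one expired entry, one untouched old entry.
now = datetime(2025, 5, 15)
entries = [
    Entry("192.0.2.1/32", True, now),
    Entry("198.51.100.7/32", False, now),
    Entry("203.0.113.9/32", True, now - timedelta(hours=2)),
]
sent = []
reprocess_entries(
    get_entries_to_process(entries, now - timedelta(minutes=30)), sent.append
)
```

The split is what lets the deactivation case work at all: the second entry produces a remove message because `is_active` is false, regardless of what message type was originally sent for it.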

There are a few caveats here to keep in mind, but for now I think it's fine:

  • This breaks from the existing behave patterns and adds a pattern where we connect directly to the containers running via compose, ignoring the test database. I tried to separate this out into full integration tests. This means that test data is left behind, so we blow away everything in the DB before and after test runs to reset it.
  • It's possible that a docker health check could run while the behave integration tests are running. This is unlikely, and I made the health check run less frequently to help avoid it, but... it's gross.
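The blow-everything-away reset from the first caveat can be sketched like this, using an in-memory SQLite table purely for illustration (the real tests talk to the compose Postgres instance, and the table name is made up); in behave this kind of reset would hang off the `before_all`/`after_all` hooks in `environment.py`:

```python
# Sketch: wipe leftover integration-test data so each run starts clean.
import sqlite3

def reset_tables(conn, tables):
    """Delete everything from the given tables (TRUNCATE ... CASCADE on
    Postgres; plain DELETE here since SQLite has no TRUNCATE)."""
    cur = conn.cursor()
    for table in tables:
        cur.execute(f"DELETE FROM {table}")
    conn.commit()

# Simulate data left behind by a previous run, then reset before this one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, route TEXT)")
conn.execute("INSERT INTO entry (route) VALUES ('192.0.2.1/32')")
conn.commit()
reset_tables(conn, ["entry"])
leftover = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
```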

For now, I think this gets us further, so it's still good. It also switches us from SQLite to Postgres, which is a win overall.

@github-actions

github-actions bot commented May 15, 2025

| File | Coverage |
| --- | --- |
| All files | 81% |
| config/consumers.py | 78% |
| config/urls.py | 69% |
| config/settings/base.py | 69% |
| config/settings/local.py | 72% |
| scram/route_manager/admin.py | 71% |
| scram/route_manager/models.py | 70% |
| scram/route_manager/views.py | 80% |
| scram/route_manager/api/serializers.py | 92% |
| scram/route_manager/api/views.py | 76% |
| scram/shared/shared_code.py | 56% |
| scram/templates/403.html | 91% |
| scram/templates/404.html | 91% |
| scram/templates/base.html | 99% |

Minimum allowed coverage is 50%

Generated by 🐒 cobertura-action against 054086b


@samoehlert samoehlert left a comment


I think it might be worth documenting this a bit, since it's a lot and possibly confusing. Six months from now, Sam will have to start over learning this. I think just a flow of how it works would be beneficial. Maybe in decisions.md?

@crankynetman

I think it might be worth documenting this a bit, since it's a lot and possibly confusing. Six months from now, Sam will have to start over learning this. I think just a flow of how it works would be beneficial. Maybe in decisions.md?

Good call, I will do this.

@crankynetman crankynetman marked this pull request as ready for review December 18, 2025 02:51

@samoehlert samoehlert left a comment


tests/acceptance/steps/multi_instance.py seems to have some duplicate code compared to tests/integration/steps/multi_instance.py, and I'm not sure whether that's fine or not; I can't decide. I get the difference between acceptance and integration, so leaving them separate makes sense, but also DRY.

@crankynetman crankynetman changed the title from "WIP: fix process updates expirations / cleanup syncing" to "fix(process_updates): Fix expirations, cleanup syncing, and add integration tests" on Dec 19, 2025

@samoehlert samoehlert left a comment


This is a really big win. And such good tests too!

@crankynetman crankynetman merged commit f39c4d8 into main Dec 19, 2025
18 of 23 checks passed
@crankynetman crankynetman deleted the topic/chriscummings/fix-process-updates-expirations branch December 19, 2025 15:59
samoehlert added a commit that referenced this pull request Dec 19, 2025
…ration tests (#157)

# Conflicts:
#	scram/route_manager/tests/acceptance/steps/common.py
#	scram/route_manager/tests/acceptance/steps/ip.py
