Remove Selenium from Enfield #1815

PaulBrack · 2026-01-16T13:12:12Z

Summary by CodeRabbit

Improvements
- Optimized Enfield Council bin collection data retrieval for enhanced reliability and performance.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-16T13:12:24Z

📝 Walkthrough

Walkthrough

The EnfieldCouncil council parser transitions from Selenium-based web scraping with Cloudflare handling and iframe navigation to direct HTTP API requests using curl_cffi. This simplification reduces code complexity by replacing HTML parsing with JSON processing, extracting bin collection data from API responses instead of rendered web pages.

Changes

Cohort / File(s)	Summary
Enfield Council Scraper Refactor `uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py`	Replaced Selenium web driver flow with curl_cffi-based HTTP requests for UPRN data retrieval. Removed postcode/paon handling, WebDriver initialization, Cloudflare/iframe navigation, and DOM parsing. Added JSON response processing with JobName/Description extraction as bin type and ScheduledStart parsing as collection date. Simplified output to `{"bins": [{"type", "collectionDate"}]}` format with validation and sorting by date.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

#1742: Related council parser update replacing Selenium/HTML scraping with UPRN-based API calls and standardized JSON-driven output structure

Poem

🐰 No more Selenium's dance, no WebDriver delays,
Direct API calls now light the bin-finding ways,
From HTML chaos to JSON's clean refrain,
The scraper leaps forward, with effort and gain! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Remove Selenium from Enfield' directly and clearly summarizes the main change: replacing Selenium-based web scraping with direct HTTP requests in the Enfield Council module.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py`:
- Around line 38-42: The code that extracts job_name and scheduled can raise
AttributeError if keys like "JobName", "Description", or "ScheduledStart" exist
but are None or non-dict; update the lookups to use a defensive fallback (e.g.,
replace item.get("JobName", {}) and item.get("Description", {}) and
item.get("ScheduledStart", {}) with (item.get("JobName") or {}),
(item.get("Description") or {}), and (item.get("ScheduledStart") or {})
respectively) so that job_name and scheduled assignments in EnfieldCouncil (the
variables job_name and scheduled) safely call .get("_text") without assuming
dicts.

🧹 Nitpick comments (1)

uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py (1)
58-60: Hardcoded date format may break if date_format changes.

Line 48 uses date_format from the common module for formatting, but line 59 hardcodes "%d/%m/%Y" for parsing during sort. If they diverge, sorting will raise ValueError.
♻️ Proposed fix to use the shared format constant
         bindata["bins"].sort(
-            key=lambda x: datetime.strptime(x["collectionDate"], "%d/%m/%Y")
+            key=lambda x: datetime.strptime(x["collectionDate"], date_format)
         )

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 274152f and 89fe3ce.

📒 Files selected for processing (1)

uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py

🧰 Additional context used

🪛 Ruff (0.14.11)

uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py

4-4: from uk_bin_collection.uk_bin_collection.common import * used; unable to detect undefined names

(F403)

10-10: Unused method argument: page

(ARG002)

12-12: check_uprn may be undefined, or defined from star imports

(F405)

31-31: Avoid specifying long messages outside the exception class

(TRY003)

48-48: date_format may be undefined, or defined from star imports

(F405)

49-49: contains_date may be undefined, or defined from star imports

(F405)

🔇 Additional comments (3)

uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py (3)

1-5: Import and architecture looks good for this codebase.

The transition from Selenium to curl_cffi is a solid simplification. The star import from common is a pre-existing pattern across council implementations, and the unused page parameter is inherited from AbstractGetBinDataClass.

14-25: HTTP request configuration looks appropriate.

Good use of timeout and browser impersonation headers. The uprn is validated by check_uprn before use, and UPRNs are numeric, so direct string interpolation is safe here.

27-33: Good defensive check for Cloudflare blocking.

The HTML detection before JSON parsing prevents cryptic parse errors if the API returns a Cloudflare challenge page instead of data.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-16T13:14:25Z

uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py

+            job_name = (
+                item.get("JobName", {}).get("_text")
+                or item.get("Description", {}).get("_text")
            )
-            find_address_button.click()
+            scheduled = item.get("ScheduledStart", {}).get("_text")


⚠️ Potential issue | 🟠 Major

Potential AttributeError if API returns null or non-dict values.

item.get("JobName", {}) only returns {} when the key is absent. If JobName exists but is null (Python None) or a string, calling .get("_text") on it will raise AttributeError.

🐛 Proposed fix using defensive fallback

for item in data: job_name = ( - item.get("JobName", {}).get("_text") - or item.get("Description", {}).get("_text") + (item.get("JobName") or {}).get("_text") + or (item.get("Description") or {}).get("_text") ) - scheduled = item.get("ScheduledStart", {}).get("_text") + scheduled = (item.get("ScheduledStart") or {}).get("_text")

Using (item.get("X") or {}) ensures falsy values like None also fall back to an empty dict.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

job_name = (

item.get("JobName", {}).get("_text")

or item.get("Description", {}).get("_text")

)

find_address_button.click()

scheduled = item.get("ScheduledStart", {}).get("_text")

job_name = (

(item.get("JobName") or {}).get("_text")

or (item.get("Description") or {}).get("_text")

)

scheduled = (item.get("ScheduledStart") or {}).get("_text")

🤖 Prompt for AI Agents

In `@uk_bin_collection/uk_bin_collection/councils/EnfieldCouncil.py` around lines 38 - 42, The code that extracts job_name and scheduled can raise AttributeError if keys like "JobName", "Description", or "ScheduledStart" exist but are None or non-dict; update the lookups to use a defensive fallback (e.g., replace item.get("JobName", {}) and item.get("Description", {}) and item.get("ScheduledStart", {}) with (item.get("JobName") or {}), (item.get("Description") or {}), and (item.get("ScheduledStart") or {}) respectively) so that job_name and scheduled assignments in EnfieldCouncil (the variables job_name and scheduled) safely call .get("_text") without assuming dicts.

codecov · 2026-01-17T07:32:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (274152f) to head (89fe3ce).

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #1815   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Remove Selenium from Enfield

89fe3ce

coderabbitai bot reviewed Jan 16, 2026

View reviewed changes

PaulBrack marked this pull request as draft January 16, 2026 14:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Remove Selenium from Enfield #1815

Remove Selenium from Enfield #1815

PaulBrack commented Jan 16, 2026 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Jan 16, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Jan 16, 2026

Uh oh!

codecov bot commented Jan 17, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Remove Selenium from Enfield #1815

Are you sure you want to change the base?

Remove Selenium from Enfield #1815

Conversation

PaulBrack commented Jan 16, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 16, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jan 17, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

PaulBrack commented Jan 16, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 16, 2026 •

edited

Loading