Implement automatic cookie refreshing #11
Conversation
This pull request has been marked to automatically sync to its base branch. You can disable this behavior by removing the label.
Pull request overview
This pull request implements automatic cookie refreshing functionality to maintain valid authentication cookies for persistent organization sessions. The implementation adds a background scheduler that periodically refreshes cookies for organizations that have opted into persistent session management.
Changes:
- Added APScheduler dependency for background job scheduling
- Implemented check_or_refresh_cookie() function to validate and refresh authentication cookies
- Added persistent cookie storage with a background refresh mechanism running every 5 minutes
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| css-reports/requirements.txt | Added APScheduler dependency for background task scheduling |
| css-reports/report.py | Added check_or_refresh_cookie() function to validate and refresh authentication cookies by querying the organization admin page |
| css-reports/app.py | Implemented persistent cookie storage, background refresh job, and integration with the fetch_customisation_report endpoint via a new persist parameter |
Comments suppressed due to low confidence (5)
css-reports/report.py:36
- The check_or_refresh_cookie function does not use the ssl_context when making the GET request, while other similar functions in this file (get_msl_context, fetch_report_url_and_cookies, get_product_customisations) all use ssl=ssl_context parameter. This inconsistency could lead to SSL verification issues or security concerns. Consider adding ssl=ssl_context to the session.get() call for consistency.
session.get(f"https://www.guildofstudents.com/organisation/admin/{org_id}") as response,
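A minimal sketch of the suggested change, assuming ssl_context is the module-level ssl.SSLContext already shared by get_msl_context and the other helpers (the context construction and BASE_HEADERS here are illustrative stand-ins):

```python
import ssl

import aiohttp

# Illustrative stand-ins; the real ssl_context and BASE_HEADERS live in report.py.
ssl_context: ssl.SSLContext = ssl.create_default_context()
BASE_HEADERS: dict[str, str] = {"User-Agent": "css-reports"}


async def check_or_refresh_cookie(org_id: str, auth_cookie: str) -> str:
    """Same flow as the PR's version, but passing ssl=ssl_context like the other helpers."""
    async with (
        aiohttp.ClientSession(
            headers=BASE_HEADERS, cookies={".AspNet.SharedCookie": auth_cookie}
        ) as session,
        session.get(
            f"https://www.guildofstudents.com/organisation/admin/{org_id}",
            ssl=ssl_context,  # the one-line addition the comment suggests
        ) as response,
    ):
        await response.text()
        returned = response.cookies.get(".AspNet.SharedCookie")
        return returned.value if returned else auth_cookie
```

aiohttp accepts an ssl.SSLContext directly via the ssl parameter of session.get(), so no other call sites need to change.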
css-reports/app.py:23
- There is no error handling in the refresh_persistent_cookies function. If check_or_refresh_cookie fails for any organization (network error, authentication failure, etc.), the entire refresh job will fail and stop processing remaining organizations. Consider wrapping the loop body in a try-except block to handle errors gracefully and continue refreshing other organizations' cookies even if one fails.
async def refresh_persistent_cookies() -> None:
for org_id, (auth_cookie, _) in persistent_organisations.items():
persistent_organisations[org_id] = (auth_cookie, await check_or_refresh_cookie(org_id, auth_cookie))
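One way to sketch the suggested try-except, keeping the PR's names; check_or_refresh_cookie is stubbed here because the real one makes an HTTP call:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Stand-in for the module-level state in app.py.
persistent_organisations: dict[str, tuple[str, str]] = {}


async def check_or_refresh_cookie(org_id: str, auth_cookie: str) -> str:
    # Placeholder for the real aiohttp-backed helper in report.py.
    if org_id == "broken":
        raise ConnectionError("simulated network failure")
    return auth_cookie + "-refreshed"


async def refresh_persistent_cookies() -> None:
    """Refresh each org's cookie, continuing past per-org failures."""
    for org_id, (auth_cookie, _) in list(persistent_organisations.items()):
        try:
            refreshed = await check_or_refresh_cookie(org_id, auth_cookie)
        except Exception:
            # Log and move on so one failing org cannot stall the whole job.
            logger.exception("Cookie refresh failed for org %s; keeping old value", org_id)
            continue
        persistent_organisations[org_id] = (auth_cookie, refreshed)
```

With this shape a transient failure for one organisation leaves its old entry in place and the remaining organisations still get refreshed.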
css-reports/report.py:44
- The check_or_refresh_cookie function does not handle potential errors from the HTTP request. If the request fails due to network issues, timeouts, or server errors, the function will raise an unhandled exception. Consider adding error handling to gracefully handle failures and potentially return the original cookie if the refresh attempt fails.
async def check_or_refresh_cookie(org_id: str, auth_cookie: str) -> str:
"""Query the org admin page; return the existing cookie if still valid, otherwise the newly issued one."""
async with (
aiohttp.ClientSession(headers=BASE_HEADERS, cookies={".AspNet.SharedCookie": auth_cookie}) as session,
session.get(f"https://www.guildofstudents.com/organisation/admin/{org_id}") as response,
):
await response.text()
returned_asp_cookie: Morsel[str] | None = response.cookies.get(
".AspNet.SharedCookie"
)
return returned_asp_cookie.value if returned_asp_cookie else auth_cookie
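A sketch of that defensive variant, assuming it is acceptable to fall back to the original cookie on any client-side failure (the raise_for_status() check is the reviewer's suggestion, not part of the PR, and BASE_HEADERS is a stand-in):

```python
import asyncio

import aiohttp

BASE_HEADERS: dict[str, str] = {"User-Agent": "css-reports"}  # illustrative stand-in


async def check_or_refresh_cookie(org_id: str, auth_cookie: str) -> str:
    """Return a refreshed cookie, or the original one if the refresh attempt fails."""
    try:
        async with (
            aiohttp.ClientSession(
                headers=BASE_HEADERS, cookies={".AspNet.SharedCookie": auth_cookie}
            ) as session,
            session.get(
                f"https://www.guildofstudents.com/organisation/admin/{org_id}"
            ) as response,
        ):
            response.raise_for_status()  # surface 4xx/5xx as ClientResponseError
            await response.text()
            returned = response.cookies.get(".AspNet.SharedCookie")
            return returned.value if returned else auth_cookie
    except (aiohttp.ClientError, asyncio.TimeoutError):
        # Network error, timeout, or bad status: keep using the old cookie.
        return auth_cookie
```

Whether silently reusing the stale cookie is right depends on the caller; logging the failure before returning would make diagnosis easier.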
css-reports/app.py:79
- The condition 'organisation_id in persistent_organisations.keys()' can be simplified to 'organisation_id in persistent_organisations' for better readability and performance, as the 'in' operator on dictionaries checks keys by default without needing to call .keys().
if organisation_id in persistent_organisations.keys():
css-reports/requirements.txt:8
- The new dependency APScheduler is added without a pinned version in requirements.txt, so builds will always pull the latest release from PyPI. If a future version or its distribution channel is compromised, your deployments could silently run attacker-controlled code with access to application data and secrets. Pin APScheduler to a specific vetted version (and ideally enforce hashes via a lockfile or --require-hashes) so that dependency contents remain stable and auditable.
APScheduler
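A pinned entry might look like this (3.10.4 is a real APScheduler release at the time of writing; substitute whichever version you have vetted):

```
# requirements.txt
APScheduler==3.10.4
```

For stronger guarantees, generate hashes with pip-compile --generate-hashes (from pip-tools) and install with pip install --require-hashes -r requirements.txt.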
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (9)
css-reports/app.py:127
- The BackgroundScheduler is started but never properly shut down. When the Flask application stops, the scheduler thread may not terminate gracefully, potentially leaving pending jobs running or causing the application to hang during shutdown. Consider adding scheduler.shutdown() in an atexit handler or using a context manager pattern to ensure proper cleanup.
scheduler = BackgroundScheduler()
scheduler.add_job(func=lambda: asyncio.run(refresh_persistent_cookies()), trigger="interval", minutes=5)
scheduler.start()
css-reports/app.py:126
- Using asyncio.run() within a lambda in the scheduler can cause issues. Each call to asyncio.run() creates a new event loop, runs the coroutine, and then closes the loop. In a long-running application with frequent scheduled calls, this repeated loop creation/destruction is inefficient and could potentially conflict with Flask's async handling. Consider using a persistent event loop or making the scheduler job a sync function that manages the async calls properly, or using an async scheduler like aiojobs or APScheduler's AsyncIOScheduler.
scheduler.add_job(func=lambda: asyncio.run(refresh_persistent_cookies()), trigger="interval", minutes=5)
css-reports/report.py:37
- The session.get() call doesn't pass the ssl_context parameter that is used elsewhere in the codebase (e.g., in get_msl_context at line 62). This inconsistency could lead to SSL verification issues. The ssl_context should be passed to ensure consistent SSL handling across all HTTP requests.
async with (
aiohttp.ClientSession(headers=BASE_HEADERS, cookies={".AspNet.SharedCookie": auth_cookie}) as session,
session.get(f"https://www.guildofstudents.com/organisation/admin/{org_id}") as response,
):
css-reports/app.py:82
- The dictionary access pattern persistent_organisations.keys() is redundant. In Python, checking "if organisation_id in persistent_organisations:" is the idiomatic and more efficient way to check for key existence. The .keys() method is unnecessary here.
if organisation_id in persistent_organisations.keys():
css-reports/app.py:50
- The new persist query parameter is not documented in the API documentation. The README.md file should be updated to include the persist parameter in the Query Parameters section (around line 60), explaining that when set to a truthy value (true, 1, t, y, yes, on), the authentication cookie will be cached and automatically refreshed for subsequent requests with the same organisation_id.
persist: str | None = request.args.get("persist")
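If the README's Query Parameters section is a table, the new row might read as follows (the wording and column layout are suggestions, not taken from the repo):

```
| persist | optional | When set to a truthy value (true, 1, t, y, yes, on), the authentication cookie is cached and refreshed in the background for subsequent requests with the same organisation_id. |
```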
css-reports/app.py:18
- The persistent_organisations dictionary grows indefinitely without any cleanup mechanism. Each unique organisation_id that uses persist=true will be stored in memory forever, along with its cookies. In a long-running application, this could lead to unbounded memory growth. Consider implementing a mechanism to remove old entries (e.g., based on last access time, TTL, or maximum size) or providing an administrative endpoint to clear persisted organizations.
persistent_organisations: dict[str, tuple[str, str]] = {}
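A small sketch of time-based eviction, assuming the stored tuple grows a last-access timestamp; MAX_AGE_SECONDS and the helper names are invented for illustration:

```python
import time

MAX_AGE_SECONDS = 24 * 60 * 60  # hypothetical TTL

# Value now carries a last-access timestamp: (auth_cookie, refreshed_cookie, last_used).
persistent_organisations: dict[str, tuple[str, str, float]] = {}


def touch(org_id: str, auth_cookie: str, refreshed: str) -> None:
    """Record (or refresh) an entry and stamp it with the current monotonic time."""
    persistent_organisations[org_id] = (auth_cookie, refreshed, time.monotonic())


def evict_stale() -> None:
    """Drop orgs not used within MAX_AGE_SECONDS; call at the start of each refresh run."""
    cutoff = time.monotonic() - MAX_AGE_SECONDS
    for org_id in [k for k, v in persistent_organisations.items() if v[2] < cutoff]:
        del persistent_organisations[org_id]
```

An alternative with the same effect and less bookkeeping is a size-capped LRU structure, or simply an admin endpoint that clears the dict.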
css-reports/report.py:44
- The function doesn't handle HTTP errors from the request. If the request fails (network error, 404, 500, etc.), the function will raise an exception that won't be caught by the caller. Consider adding error handling or at minimum checking the response status with response.raise_for_status().
async def check_or_refresh_cookie(org_id: str, auth_cookie: str) -> str:
"""Query the org admin page; return the existing cookie if still valid, otherwise the newly issued one."""
async with (
aiohttp.ClientSession(headers=BASE_HEADERS, cookies={".AspNet.SharedCookie": auth_cookie}) as session,
session.get(f"https://www.guildofstudents.com/organisation/admin/{org_id}") as response,
):
await response.text()
returned_asp_cookie: Morsel[str] | None = response.cookies.get(
".AspNet.SharedCookie"
)
return returned_asp_cookie.value if returned_asp_cookie else auth_cookie
css-reports/app.py:26
- The refresh_persistent_cookies function doesn't handle exceptions that may be raised by check_or_refresh_cookie. If any organization's cookie refresh fails (network error, invalid response, etc.), the entire refresh job will fail and stop processing remaining organizations. Consider wrapping the await call in a try-except block to log errors and continue processing other organizations.
async def refresh_persistent_cookies() -> None:
for org_id, (_, auth_cookie) in persistent_organisations.items():
persistent_organisations[org_id] = (
auth_cookie,
await check_or_refresh_cookie(org_id, auth_cookie)
)
css-reports/app.py:126
- The scheduler configuration specifies minutes=5 but this will trigger every 5 minutes regardless of how long the refresh operation takes. If refreshing all organizations takes longer than 5 minutes (as the number of persisted organizations grows), multiple refresh jobs could run concurrently, exacerbating the race condition issue. Consider using a misfire_grace_time parameter or checking if the previous job has completed before starting a new one.
scheduler.add_job(func=lambda: asyncio.run(refresh_persistent_cookies()), trigger="interval", minutes=5)