Skip to content

Conversation

@MrCreosote
Copy link
Member

If a caller attempts to run too many tasks on the callback server at once when run standalone, the JobRunner._watch loop will exit and the callback server will hang forever. This duplicates (for now) the job counting logic in the callback server itself so it doesn't submit jobs guaranteed to fail to the _watch loop. Eventually all this should likely be refactored.

Also catches and reports bad module name errors.

If a caller attempts to run too many tasks on the callback server at
once when run standalone, the JobRunner._watch loop will exit and the
callback server will hang forever. This duplicates (for now) the job
counting logic in the callback server itself so it doesn't submit jobs
guaranteed to fail to the _watch loop. Eventually all this should likely
be refactored.

Also catches and reports bad module name errors.
@MrCreosote MrCreosote requested a review from bio-boris August 8, 2025 23:48
@codecov
Copy link

codecov bot commented Aug 8, 2025

Codecov Report

❌ Patch coverage is 76.92308% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.55%. Comparing base (95f9e56) to head (13ca137).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
JobRunner/callback_server.py 70.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #108      +/-   ##
==========================================
+ Coverage   80.43%   80.55%   +0.11%     
==========================================
  Files          13       13              
  Lines        1140     1147       +7     
==========================================
+ Hits          917      924       +7     
  Misses        223      223              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@MrCreosote
Copy link
Member Author

the uncovered stuff is again due to the callback server running in a different process

@MrCreosote
Copy link
Member Author

codacy is complaining about asserts in tests again

@bio-boris bio-boris requested a review from Copilot August 11, 2025 15:17
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a critical issue where the callback server would hang forever when too many jobs were submitted, causing the JobRunner._watch loop to exit. The fix introduces job counting logic in the callback server to prevent submitting jobs that are guaranteed to fail.

Key changes:

  • Adds configurable max_tasks parameter to prevent job submission when at capacity
  • Improves error handling for bad module names and method parsing
  • Enhances test coverage for job limits and error conditions

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
JobRunner/callback_server.py Implements job counting logic and improved error handling to prevent server hangs
JobRunner/config.py Adds max_tasks parameter to Config class constructor
JobRunner/Callback.py Passes max_tasks parameter through Callback class
JobRunner/JobRunner.py Updates callback server initialization to include max_tasks
test/test_callback_server_integration.py Adds comprehensive tests for job limits and error conditions
test/test_callback_server.py Updates test configuration to include new jobcount and maxjobs parameters
pyproject.toml Adds ipython development dependency

# the ip, which doesn't work if you're behind a NAT.
# All of this needs a rethink / refactor
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect(("gmail.com", 80))
Copy link

Copilot AI Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using gmail.com as a target for socket connection is unreliable and creates an external dependency in tests. Consider using a more stable service or implementing a local network discovery method.

Suggested change
s.connect(("gmail.com", 80))
# Use a non-routable IP address to determine local IP without external dependency
s.connect(("10.255.255.255", 1))

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If gmail.com isn't stable nothing is... also this is how it's done in other places in the codebase, it was just copied from there

_check_rpc_token(app, token)
job_id = str(uuid.uuid1())
data["method"] = "%s.%s" % (module, method[1:-7])
job_id = str(uuid.uuid4())
Copy link

Copilot AI Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change from uuid.uuid1() to uuid.uuid4() should be consistent throughout the codebase. However, there's still a uuid.uuid1() call on line 93 in the test file that wasn't updated.

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 93 of the test file is a switch from uuid1 to uuid4

"error": exception_message,
"code": "123",
"message": exception_message,
"name": "CallbackServerError",
Copy link

Copilot AI Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing space before the comma - should be 'CallbackServerError', not 'CallbackServerError', (note the trailing space).

Suggested change
"name": "CallbackServerError",
"name": "CallbackServerError",

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed locally, it'll show up in a future PR


def _restore_env(envvar, old_val):
if old_val is None:
os.environ.pop(envvar)
Copy link

Copilot AI Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using os.environ.pop() without a default value will raise KeyError if the environment variable doesn't exist. This could cause test failures in certain environments.

Suggested change
os.environ.pop(envvar)
os.environ.pop(envvar, None)

Copilot uses AI. Check for mistakes.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case it's guaranteed the env var is present.

@MrCreosote MrCreosote merged commit 31a1753 into main Aug 11, 2025
10 of 12 checks passed
@bio-boris bio-boris mentioned this pull request Aug 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants