Skip to content

Debug Test Executorlib #812

Closed
jan-janssen wants to merge 13 commits intomainfrom
exedebug
Closed

Debug Test Executorlib #812
jan-janssen wants to merge 13 commits intomainfrom
exedebug

Conversation

@jan-janssen
Copy link
Copy Markdown
Member

No description provided.

dependabot Bot and others added 4 commits March 2, 2026 07:32
Bumps [executorlib](https://github.com/pyiron/executorlib) from 1.8.0 to 1.9.0.
- [Release notes](https://github.com/pyiron/executorlib/releases)
- [Commits](pyiron/executorlib@executorlib-1.8.0...executorlib-1.9.0)

---
updated-dependencies:
- dependency-name: executorlib
  dependency-version: 1.9.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Mar 2, 2026

Binder 👈 Launch a binder notebook on branch pyiron/pyiron_workflow/exedebug

@jan-janssen
Copy link
Copy Markdown
Member Author

@liamhuber 1.8.1 works fine - I am going to start updating from here.

@jan-janssen
Copy link
Copy Markdown
Member Author

exception calling callback for <Future at 0x7fc5b5e25820 state=finished returned int>
Traceback (most recent call last):
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/concurrent/futures/_base.py", line 340, in _invoke_callbacks
    callback(self)
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/site-packages/pyiron_workflow/mixin/run.py", line 353, in _finish_run
    raise e
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/site-packages/pyiron_workflow/mixin/run.py", line 347, in _finish_run
    unique_executor.shutdown(wait=False)
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/site-packages/executorlib/executor/base.py", line 162, in shutdown
    self._task_scheduler.shutdown(wait=wait, cancel_futures=cancel_futures)
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/site-packages/executorlib/task_scheduler/base.py", line 205, in shutdown
    self._process.join()
  File "/home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/threading.py", line 1146, in join
    raise RuntimeError("cannot join current thread")
RuntimeError: cannot join current thread

@liamhuber
Copy link
Copy Markdown
Member

Awesome, thanks @jan-janssen! If I'm reading that sed correctly, the patch in pyiron/executorlib#943 makes the tests pass here again

@liamhuber
Copy link
Copy Markdown
Member

Hmm, the coverage failure is a legitimate failure, not just a hiccup uploading.

This line is causing a real problem, but at least the entire stack trace is inside pwf

shell: bash -l {0}
timeout-minutes: 8
run: |
sed -i 's/if isinstance(self._process, Thread):/if wait and isinstance(self._process, Thread):/g' /home/runner/work/pyiron_workflow/pyiron_workflow/cached-miniforge/my-env/lib/python3.12/site-packages/executorlib/task_scheduler/base.py
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liamhuber How can I include this ugly patch in the coverage test?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. I see it and I'm looking around

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is where I mixin and override standard SlurmClusterExecutor behaviour...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps in that class we could introduce an override to shutdown?

def shutdown(...):
    if ...:
        special new behaviour
    else:
        super().shutdown(...)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to try something like that, as monkeypatching CacheOverride with the fix that will exist in future versions of BaseExecutor might resolve the coverage integration tests at the same time as the slurm tests...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stacking a PR to give it a try #813

@liamhuber
Copy link
Copy Markdown
Member

Hmm, the coverage failure is a legitimate failure, not just a hiccup uploading.

This line is causing a real problem, but at least the entire stack trace is inside pwf

Just to quickly note, it's only failing for the dummy test class:

class _CacheTestClusterExecutor(CacheOverride, TestClusterExecutor): ...

@jan-janssen
Copy link
Copy Markdown
Member Author

@liamhuber Just to write down what I understood so far. The add_done_callback() is executed when I cancel the future object, so we see the corresponding error message. The reason I do this is that the future object is no longer updated once the executor is closed, so I would like to stay with the new behavior but I have to think about how we could integrated it with pyiron_workflow.

@liamhuber
Copy link
Copy Markdown
Member

@liamhuber Just to write down what I understood so far. The add_done_callback() is executed when I cancel the future object, so we see the corresponding error message. The reason I do this is that the future object is no longer updated once the executor is closed, so I would like to stay with the new behavior but I have to think about how we could integrated it with pyiron_workflow.

That is fair. Is there any way of filtering the done callbacks to only trigger on certain types of "done-ness", i.e. in pyiron_workflow we attach the future in a way that it won't trigger if the future is done because of a force-cancel?

@liamhuber
Copy link
Copy Markdown
Member

Is there any way of filtering the done callbacks to only trigger on certain types of "done-ness", i.e. in pyiron_workflow we attach the future in a way that it won't trigger if the future is done because of a force-cancel?

No.

If the future has already completed or been cancelled, fn will be called immediately.

@jan-janssen jan-janssen marked this pull request as draft March 2, 2026 18:11
@jan-janssen
Copy link
Copy Markdown
Member Author

@liamhuber I just wanted to validate the tests work with 1.9.0 in addition to 1.8.2

@jan-janssen
Copy link
Copy Markdown
Member Author

With the patch the tests work fine - we are just not able to directly apply it to Test-and-Coverage test. So I am going to close this pull request in favor of the solution in #814

@jan-janssen jan-janssen closed this Mar 2, 2026
@jan-janssen jan-janssen deleted the exedebug branch March 2, 2026 18:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants