Tension between Snakemake Data Model and that of Executors like HTCondor #98

@jhiemstrawisc

Description

I recently worked on a storage plugin to connect Snakemake with Pelican. Pelican is a software platform for creating data federations, and it has been tightly coupled with the HTCondor cluster scheduler. My ultimate goal in working on the storage plugin was to get Snakemake+HTCondor+Pelican playing nicely together according to HTCondor best practices.

This is roughly the data model I'm trying to achieve:
[Diagram: the intended data model]

However, I've come to understand that Snakemake's data model doesn't align with this. Rather than delegating input/output transfers to HTCondor, which knows how to deal with this format and can make decisions about things like access tokens, Snakemake always fetches the Pelican objects at the Access Point first and then has HTCondor transfer them as if they were local files (the ability to delegate these file transfers to HTCondor is something I introduced in #67). This makes the AP the very bottleneck Pelican is designed to remove.

Since this problem has a very similar feel to the one we solved in #67, I'm wondering whether there's an opportunity to solve a whole class of problems rather than knocking them out one by one.

It seems like the common theme here is that Snakemake has no semantics for delegating certain responsibilities, such as input/output transfers, to its executor plugins. In the HTCondor world view, HTCondor should handle as much of the I/O transfer work as possible, because it can make scheduling decisions about what should be transferred and when.

I can imagine a setting in which the HTCondor executor advertises a list of transfer protocols it understands (pelican://, osdf://, s3://, ftp://, etc.), so that whenever Snakemake encounters an input or output with one of those schemes, it knows to let HTCondor handle the transfer. A rough sketch of what that interaction could look like is below.
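To make the idea concrete, here's a minimal sketch. None of these names (`HTCondorExecutor`, `delegated_transfer_schemes`, `should_delegate`) exist in the current executor plugin interface; they're purely hypothetical, and the scheme list is illustrative. The point is just that the executor declares which schemes it will transfer itself, and Snakemake checks each URI against that set before deciding whether to run its own storage retrieval.

```python
# Hypothetical sketch only -- not part of any existing Snakemake API.
from urllib.parse import urlparse


class HTCondorExecutor:
    """Stand-in for an executor plugin that can stage some URIs itself."""

    def delegated_transfer_schemes(self) -> frozenset[str]:
        # Schemes this executor will hand to HTCondor's file transfer
        # mechanism instead of letting a Snakemake storage plugin fetch them.
        return frozenset({"pelican", "osdf", "s3", "ftp"})


def should_delegate(executor: HTCondorExecutor, uri: str) -> bool:
    """True if Snakemake should leave this URI untouched for the executor."""
    return urlparse(uri).scheme in executor.delegated_transfer_schemes()


# Usage sketch:
executor = HTCondorExecutor()
print(should_delegate(executor, "osdf://ospool/example/data.csv"))  # True
print(should_delegate(executor, "https://example.org/data.csv"))    # False
```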

One thing that's potentially tricky here is that different HTCondor clusters may support different schemes -- while most HTCondor clusters should support pelican://, not all will. This makes me hesitant to hardcode anything in the executor itself. Maybe the executor plugin interface could also let cluster administrators set cluster-wide defaults, so it's not up to users to figure out which schemes their pool supports? One way that could look is sketched after this paragraph.
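As a sketch of the admin-configured-defaults idea: the executor could read the supported schemes from a location the cluster administrator controls instead of baking them in. The file path, environment variable, and format below are assumptions, not an existing convention.

```python
# Hypothetical sketch: cluster-wide defaults for delegated transfer schemes.
import os
from pathlib import Path

# Conservative built-in fallback if the admin hasn't configured anything.
DEFAULT_SCHEMES = frozenset({"ftp"})


def cluster_transfer_schemes() -> frozenset[str]:
    """Schemes the local pool claims to support, as configured by the admin.

    The path and env var are made up for illustration; the real mechanism
    could just as well be an HTCondor config knob or an executor setting.
    """
    cfg = Path(
        os.environ.get(
            "SNAKEMAKE_HTCONDOR_SCHEMES_FILE",
            "/etc/snakemake/htcondor-transfer-schemes",
        )
    )
    if cfg.is_file():
        # One scheme per whitespace-separated token, e.g. "pelican osdf s3".
        return frozenset(s for s in cfg.read_text().split() if s)
    return DEFAULT_SCHEMES
```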

In the meantime, I plan to do a bit more research to see whether other schedulers have similar classes of problems, so that any solution that gets cooked up doesn't serve only the HTCondor integration.
