Skip to content

Conversation

@Jenniliu12
Copy link

@Jenniliu12 Jenniliu12 commented Jan 11, 2026

PR Checklist

  • Referenced issue is linked
  • If you've fixed a bug or added code that should be tested, add tests!
  • Documentation in docs is updated

Description of changes
adding new tool.

Technical details
Technically, the most interesting part is the data itself. Most of the methods that come with this tool are wrappers that aid with the enrichment analysis (e.g., help create a ranked list from query data, or help navigate the different levels of the data).
The plotting tools are seaborn heatmap wrappers that are tailored to the output format.

Only the cytokine communication methods are specific to this tool. (get_senders and receivers()). This plotting function is specific to this tool.

Additional context
new feature "hucira". Lukas wants to check if methods are too specific.
Some of the code might actually be redundant.. I had implemented the wrappers to make simple visualizations a click away for users but I can rethink about making the tool more lightweight.

The only dependencies I added (bokeh and pyCirclize) are for visualization tools

The very generic tests that I wrote are not passing. I don't know why, it looks like they're timing out.

@Jenniliu12 Jenniliu12 marked this pull request as draft January 12, 2026 22:20
Copy link
Member

@Zethson Zethson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much already! Here's some first initial feedback:

  1. Please ensure that you have a descriptive pull request title and the pull request description only has the necessary detail. This can also be done later but it's important.
  2. Your implementation should not add any dependencies. Pertpy has to minimize dependencies because it's used by a loooot of people. Every dependency that we add has the potential to break users environment and increases complexity. Therefore, please remove bokeh, pycirclize, and tqdm. The plot should be implemented with seaborn and instead of using tqdm, you can use Rich directly. Not sure what to do about pycirclize - if necessary we can talk about it. That's one of the differences of implementing something for yourself vs implementing it for many.
  3. Please ensure that all functions that have a bit more than trivial complexity have a proper docstring. Please adhere to our docstring style.
  4. I didn't review everything yet because I think there's still a lot to do. Let's address these first.

return adata


def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please type this.

Suggested change
def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame:
def human_cytokine_dict(exclude_well_biased_genes: bool = True) -> pd.DataFrame:



def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame:
r"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
r"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity.
"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity.

this doesn't need to be a raw string, right?

def human_cytokine_dict(exclude_well_biased_genes=True) -> pd.DataFrame:
r"""Human Cytokine Dictionary curated from PBMC allows you to infer differential cytokine activity.
The Human Cytokine Dictionary was created from single-cell RNA-seq of 9,697,974 human peripheral blood mononuclear cells (PBMC) from 12 donors stimulated in vitro with 87 different cytokines. The object is a dataframe representing cytokine activity as differentially expressed genes after cytokine perturbation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please always write sentences in their own line.

Returns:
Pandas DataFrame
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Nit

The Human Cytokine Dictionary was created from single-cell RNA-seq of 9,697,974 human peripheral blood mononuclear cells (PBMC) from 12 donors stimulated in vitro with 87 different cytokines. The object is a dataframe representing cytokine activity as differentially expressed genes after cytokine perturbation.
References:
Oesinghaus, Lukas and Becker, S{\"o}ren and Vornholz, Larsen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

2. Creates ranking of query data genes contrasting condition1 vs condition2. A continuum from genes most associated with condition1 (top) to genes most associated with condition2 (bottom)
3. Computes enrichment of each cytokine by matching their associated gene set in the ranked list.
Parameters
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docstring

verbose: bool = False,
threads: int = 6,
) -> pd.DataFrame:
"""Function wrapper: Computes cytokine enrichment activity in one celltype using GSEA scoring. Loops through several threshold value to obtain more robust gene sets.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Function wrapper: Computes cytokine enrichment activity in one celltype using GSEA scoring. Loops through several threshold value to obtain more robust gene sets.
"""Computes cytokine enrichment activity in one celltype using GSEA scoring.
Loops through several threshold value to obtain more robust gene sets.

It's a rule that the first line is always just one sentence. No exceptions.

weight: float = 1.0,
seed: int = 2025,
verbose: bool = False,
threads: int = 6,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a super random value

ranked_stats, _num_cells = hucira._compute_ranking_statistic(dummy_adata, contrast_column, contrasts_combo)
assert isinstance(ranked_stats, pd.DataFrame)

# with pytest.raises(KeyError):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be commented out, right?

("B cell", "B_cell"),
("CD8a", "CD8_T_cell"),
("Mono", "CD14_Mono"),
] # can't be a list for "run_one_enrichment_test()"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what's meant with that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants