feat: add illico for rank_genes_groups #4038

ilan-gold wants to merge 39 commits into ig/exp_post_agg

Conversation
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## ig/exp_post_agg #4038 +/- ##
===================================================
- Coverage 79.64% 79.62% -0.02%
===================================================
Files 120 120
Lines 12952 12960 +8
===================================================
+ Hits 10316 10320 +4
- Misses 2636 2640 +4
Flags with carried forward coverage won't be shown.
Hey, good catch on the OVO column ordering. I have no idea where the bug comes from; it seems to be related to the group labels for PBMC that …
So it seems to be due to np.argsort and np.sort not ordering the unusual characters (" ", "+", "/", "#", etc.) of the bulk_labels the same way. I will ship a fix sometime today or this weekend.

NB: I spent a bit of time looking into the PBMC dataset and it seems rather unusual: a lot of values are negative, leading to undefined logfoldchanges, and on top of that there is very little value diversity in the dataset. As a result, a lot of genes end up with identical ranksums, but because illico and scanpy do not compute the z-score in exactly the same way (mathematically equivalent but programmatically different), the gene ordering is affected. The gene ordering impacts the position of NaN values in the results, but the non-NaN values do match because they are very similar. I am not sure this dataset is a good test case.

NB2: details of the programmatic differences. illico does

U = ranksum - n_tgt * (n_tgt + 1) / 2
z = U - n_ref * n_tgt / 2

whereas scanpy does

z = ranksum - n_tgt * (n_tgt + 1 + n_ref) / 2

Those are mathematically equivalent but produce differences on the order of 1e-9, changing gene orders when ranksums are equal.
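To make the discrepancy concrete, here is a minimal standalone sketch (with made-up sample sizes, not taken from the PBMC data) showing that the two formulations need not be bit-identical:

```python
import numpy as np

# Made-up numbers, purely to illustrate the floating-point residue between
# the two mathematically equivalent z-numerator formulations.
rng = np.random.default_rng(0)
n_tgt, n_ref = 350, 12_602
ranksum = rng.uniform(2e6, 3e6, size=10)

# illico-style: two subtractions via the U statistic
U = ranksum - n_tgt * (n_tgt + 1) / 2
z_illico = U - n_ref * n_tgt / 2

# scanpy-style: one combined subtraction
z_scanpy = ranksum - n_tgt * (n_tgt + 1 + n_ref) / 2

# Often a tiny nonzero residue (~1e-9), enough to flip argsort order for ties
print(z_illico - z_scanpy)
```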
Interesting, good observation; glad we are aware of this now. This could be another target for scanpy 2.0. I will try a different dataset, but I'm glad to have this documented.
Forget my previous message, which was actually kind of off topic; this PBMC dataset was actually a very good test case. Example: let's take …

NB: what is the situation with the Dask issue? The current version is not Dask-compatible.
That should come next; I want to keep this scoped for now to the in-memory stuff. Another thing, if you're doing a batch of fixes this weekend: supporting …

Also, for your …
I'm seeing some instances of https://github.com/satijalab/seurat/blob/main/R/differential_expression.R#L1089 i.e., they calculate the mean expression with this "+1" offset so they don't need the correction. So we have three different things here, I guess. I think just giving the option to match scanpy (since that seems to be a project goal) is good, and then we can maybe revisit what Seurat does as a new parameter for scanpy 2.0.
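For reference, a minimal sketch of the pseudocount idea in that Seurat file (the function name and signature here are illustrative, not Seurat's or scanpy's API):

```python
import numpy as np

def seurat_style_lfc(x_group: np.ndarray, x_rest: np.ndarray, base: float = 2.0) -> float:
    """Illustrative pseudocount log-fold-change, not Seurat's actual code.

    Averaging expm1 of the logged values and re-logging with a +1 offset
    keeps both means strictly positive, so no NaN/inf correction is needed.
    """
    mean_group = np.expm1(x_group).mean() + 1.0
    mean_rest = np.expm1(x_rest).mean() + 1.0
    return float(np.log(mean_group / mean_rest) / np.log(base))
```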
For this, I would be open to also making it part of a scanpy 2.0 set of defaults, i.e., using …

Overall, my ideal scenario would be that …

Short of exact matches, like I said, we can just document the changes. It sounds like the biggest things that would close the gap are: …
I think we can patch the …

Overall, this is really cool and I'm super happy that things seem to be pretty close!
Hey, I just released version 0.5.0rc1, which should fix the issues identified on this test case: …

Regarding the numerical precision question: so far I force the …

Testing locally, it seems everything now runs smoothly for PBMC. Let me know how it goes with the rest of the CI.
@remydubois I think the issue is that you are sorting / cutting off by …

In any case, I think I got a little lost in the sauce. I think matching p-values and scores should be enough, because scanpy can internally handle LFC when it wraps …

I'm pushing the results of …

With this in mind, I am going to push ahead with integrating …

BTW: https://remydubois.github.io/illico/api.html looks both incomplete and like it's leaking internals.
OK, everything is passing locally with just getting the p-value and z-score from illico. I assume the LFC calculation (i.e., actually calculating the means) is the less computationally intensive part?
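For reference, a minimal sketch of the standard asymptotic conversion from z-scores to two-sided p-values (the textbook normal-approximation step, not necessarily the exact code in either library):

```python
import numpy as np
from scipy.stats import norm

# A backend only needs to hand over z-scores; two-sided p-values follow
# directly from the normal approximation.
z = np.array([2.1, -0.3, 5.7])
pvals = 2.0 * norm.sf(np.abs(z))  # survival function avoids 1 - cdf cancellation
print(pvals)
```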
OK! That's good news to me, because this was bugging me a little bit. I checked out your branch and ran a variant of `test_illico`:

```python
import anndata as ad
from scanpy.datasets import pbmc68k_reduced


def test_illico(test, corr_method, exp_post_agg):
    from illico.asymptotic_wilcoxon import asymptotic_wilcoxon

    pbmc = pbmc68k_reduced()
    # Rebuild the AnnData from its raw pieces so no cached state interferes
    pbmc = ad.AnnData(pbmc.X.copy(), obs=pbmc.obs.copy(), var=pbmc.var.copy())
    # ... rest of the test
```

Then all tests passed on my machine (LFC, pvals, scores, pvals_adj, up to …).

By the way, wouldn't it be safer for …
Very happy to read it's moving forward 🚀

Yes, it's negligible; I could not measure its impact on the overall runtime.
Oh interesting, I wonder if there is a bug in …
Probably, yes. I'll look more at this tomorrow, now that you seem to have whittled things down quite a bit.
OK, very simple: if you use …
Azure Pipelines: 1 pipeline(s) were filtered out due to trigger conditions.
OK, this is really cooking now. I have matching results on random subsets of some real-world data from …
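As a sanity-check pattern for "matching results", something like this hedged sketch compares the two backends gene-by-gene; `method="illico"` is an assumption about how this PR exposes the backend, and aligning on gene names avoids tie-induced ordering flips:

```python
import numpy as np
import scanpy as sc

adata = sc.datasets.pbmc68k_reduced()
a, b = adata.copy(), adata.copy()
sc.tl.rank_genes_groups(a, "bulk_labels", method="wilcoxon")
sc.tl.rank_genes_groups(b, "bulk_labels", method="illico")  # hypothetical value

for group in a.obs["bulk_labels"].cat.categories:
    # Align on gene names so equal-score ties cannot flip row order
    df_a = sc.get.rank_genes_groups_df(a, group=group).set_index("names").sort_index()
    df_b = sc.get.rank_genes_groups_df(b, group=group).set_index("names").sort_index()
    np.testing.assert_allclose(df_a["scores"], df_b["scores"], rtol=1e-5)
    np.testing.assert_allclose(df_a["pvals"], df_b["pvals"], rtol=1e-5)
```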
Also, I realized that illico does not support the 'groups' argument (which I assume allows the user to only compute DE on certain perturbations/labels).
@remydubois Great point; let me try adding a test parametrization here for that. I assume it would work, since we construct the AnnData object manually for …

It looks like we might need a groups argument then.
OK @remydubois, in the meantime in this PR I will filter, since it requires the same amount of data and …
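A hedged sketch of that interim workaround (the helper name and signature are illustrative, not part of scanpy or illico): subset the AnnData to the requested groups plus the reference before handing it to the backend:

```python
import anndata as ad

def subset_for_groups(
    adata: ad.AnnData, groupby: str, groups: list[str], reference: str = "rest"
) -> ad.AnnData:
    """Illustrative helper: restrict DE to selected groups before the backend runs."""
    if reference == "rest":
        # "rest" compares each group against all other cells, so nothing can be dropped
        return adata
    keep = set(groups) | {reference}
    return adata[adata.obs[groupby].isin(list(keep))].copy()
```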
Just thinking about it, but don't forget to set n_threads (or leave that to the user, maybe?).
@remydubois Added it to the list (#4038 (comment)) for @flying-sheep, but I think we have generally relied on an environment variable up until now (could be wrong though).
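For what it's worth, a minimal illustration of the env-variable convention (assuming illico's kernels are numba-based, as discussed below; numba reads NUMBA_NUM_THREADS before its first import):

```python
import os

# NUMBA_NUM_THREADS must be set before numba is first imported;
# numba.set_num_threads() can shrink (but not grow) the pool afterwards.
os.environ["NUMBA_NUM_THREADS"] = "8"

import numba  # noqa: E402

numba.set_num_threads(4)  # runtime adjustment within the 8-thread pool
```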
```
tie_correct=tie_correct,
use_continuity=False,
alternative="two-sided",
use_rust=False,
```
From what @remydubois said on Zulip (https://scverse.zulipchat.com/#narrow/channel/315570-random/topic/Article.20on.20speeding.20up.20computation/near/581587388), it is basically not worth the headache. Furthermore, if we were ever to adopt the codebase, it would be the numba version, not the Rust version. His observations about Rust largely match ours (the performance is almost always identical to numba).
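As context for the `tie_correct` flag in the diff above, a standalone sketch of the standard tie correction in the asymptotic Wilcoxon variance (the textbook formula, not illico's internals):

```python
import numpy as np

def tie_term(pooled: np.ndarray) -> float:
    # Sum of t^3 - t over every group of tied values in the pooled sample
    _, counts = np.unique(pooled, return_counts=True)
    return float((counts.astype(np.float64) ** 3 - counts).sum())

def rank_sum_std(n_tgt: int, n_ref: int, pooled: np.ndarray, tie_correct: bool = True) -> float:
    # Standard deviation of the rank sum under H0; ties shrink the variance
    n = n_tgt + n_ref
    correction = tie_term(pooled) / (n * (n - 1)) if tie_correct else 0.0
    var = n_tgt * n_ref / 12.0 * ((n + 1) - correction)
    return float(np.sqrt(var))
```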

TODOs:
See: https://github.com/scverse/scanpy/actions/runs/24088645078/job/70268566419?pr=4038
ovo"column" in scores (cc @remydubois)illico(see thefilterwarningson the new test)ovrresults are so far offexp_post_aggfeature feat: allow exponentiation post agg for log-fold-change inrank_genes_groups#4037) toillico, we will add theillicobackend (or all changes have been documented)illico#4012