Establishes some general functions relating to single-cell DGE #18
…im if would leave none, py-only catch if no metadata would be added, py-only catch and remove fake pseudobulks created by scanpy
this looks awesome!
I think it would be worthwhile to include a pre-pseudobulk filter -- e.g. only pseudobulk a sample/cell type pair if there are at least X cells of that cell type in that sample. And possibly, downstream, a corresponding DEG filter that only pulls a DEG comparison if there are at least N samples per group?
# Count the pseudobulks built from fewer than the minimum number of cells
too_small <- sum(psobject@meta.data[,output.metadata.cell.count] < min.cells)
if (too_small == ncol(psobject)) {
    warning(paste0("Skipping trimming pseudobulks smaller than 'min_cells' as NONE were built from more than ", min_cells, " cells."))
} else if (too_small > 0) {
    msg_if("\tTrimming ", too_small, " pseudobulks built from fewer than ", min_cells, " cells.")
    psobject <- psobject[,psobject@meta.data[,output.metadata.cell.count] >= min.cells]
}
> I think it would be worthwhile to include a pre-pseudobulk filter -- e.g. only pseudobulk a sample/cell type pair if there are at least X cells of that cell type in that sample
This is included already, here for the R function! It currently runs after the pseudobulking, but I could move it to run before instead if there's good reason.
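For reference, a minimal sketch of what the "before" variant might look like, in base R only. The `cell_meta` data frame, its `sample` / `cell_type` columns, and the `pairs_to_pseudobulk()` helper are all hypothetical names made up for illustration, not anything in this PR:

```r
# Hypothetical per-cell metadata; the column names are assumptions for illustration.
cell_meta <- data.frame(
  sample    = c("s1", "s1", "s1", "s2", "s2", "s2", "s2"),
  cell_type = c("T",  "T",  "B",  "T",  "B",  "B",  "B"),
  stringsAsFactors = FALSE
)

# Return the sample/cell-type pairs that have at least `min_cells` cells,
# so that only those pairs are aggregated into pseudobulks afterwards.
pairs_to_pseudobulk <- function(meta, min_cells) {
  counts <- as.data.frame(
    table(sample = meta$sample, cell_type = meta$cell_type),
    responseName = "n_cells",
    stringsAsFactors = FALSE
  )
  counts[counts$n_cells >= min_cells, , drop = FALSE]
}

keep <- pairs_to_pseudobulk(cell_meta, min_cells = 2)
```

The result should be the same set of pseudobulks either way; filtering first would mainly avoid building pseudobulks that get trimmed immediately afterwards.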
oh awesome! apologies, I should have looked more carefully. I just realized it is not in the dreamlet pseudobulk function, but is instead implemented in processAssays, so I was kind of making a note to myself
I do think a downstream DEG filter requiring at least min.samples per category is also useful, though
agreed! got pulled away before posting that half =)
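A rough sketch of how such a check could work, assuming one metadata row per pseudobulk for a single cell type. The `pb_meta` data frame, its columns, and the `comparison_is_testable()` helper are hypothetical and only meant to illustrate the idea:

```r
# Hypothetical pseudobulk-level metadata: one row per pseudobulk of a single cell type.
pb_meta <- data.frame(
  sample = paste0("s", 1:5),
  group  = c("ctrl", "ctrl", "ctrl", "treated", "treated"),
  stringsAsFactors = FALSE
)

# TRUE only if both groups of the requested comparison have at least
# `min_samples` pseudobulks; otherwise the comparison would be skipped.
comparison_is_testable <- function(meta, group_col, group_1, group_2, min_samples) {
  n_1 <- sum(meta[[group_col]] == group_1)
  n_2 <- sum(meta[[group_col]] == group_2)
  n_1 >= min_samples && n_2 >= min_samples
}

ok_with_2 <- comparison_is_testable(pb_meta, "group", "ctrl", "treated", min_samples = 2)
ok_with_3 <- comparison_is_testable(pb_meta, "group", "ctrl", "treated", min_samples = 3)
```

With this toy data the comparison passes at `min_samples = 2` and is rejected at `min_samples = 3`, since only two "treated" pseudobulks exist.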
Hmm, agreed. Perhaps a function that assesses the requested DGE comps per the
In a recent DS Working Group meeting, we discussed the utility of adding some standardized functions for performing DGE with various tools. We also laid out a few helper functions -- pseudobulking, gene filtering -- that felt required across tools.
This PR will directly include the helper functions, and I'd propose that we use this sc-dge-functions-branch as the base branch that we'll PR all of our tool-specific DGE function builds into!
Planned functions: