Problem
The current implementation dispatches every function to the GPU when options(propr.use_gpu = TRUE) is set. Benchmarks show that some functions run much slower on the GPU because host-device transfer overhead and kernel launch costs outweigh the compute savings.
Benchmark Results
GPU Winners (always beneficial):
lrm, lrv: 500x-2500x speedup 🔥
corRcpp, linRcpp, rhoRcpp: 100x-700x speedup
clrRcpp, phiRcpp: 20x-370x speedup
GPU Losers (slower than CPU):
coordToIndex: 0.1x-0.4x (3-9x SLOWER)
count_* functions: 0.01x-0.03x (30-100x SLOWER)
wtvRcpp, wtmRcpp: 0.6x-2x (inconsistent: slower in some runs, at most 2x faster in others)
Solution
Implement internal selective dispatch: automatically route a function to the GPU only when it benefits, even when options(propr.use_gpu = TRUE) is set globally.
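#include <cstring>
// Returns true only for functions that benchmarked faster on the GPU.
bool should_use_gpu_internal(const char* func_name) {
  // Never use GPU for these: transfer and launch overhead dominates
  static const char* cpu_only[] = {"coordToIndex", "wtmRcpp", "wtvRcpp"};
  for (const char* name : cpu_only)
    if (std::strcmp(func_name, name) == 0) return false;
  if (std::strncmp(func_name, "count_", 6) == 0) return false;  // count_* family
  // Always use GPU for these (when available)
  static const char* gpu_names[] = {"lrm", "lrv", "corRcpp", "linRcpp",
                                    "rhoRcpp", "clrRcpp", "phiRcpp"};
  for (const char* name : gpu_names)
    if (std::strcmp(func_name, name) == 0) return true;
  return false;  // assumed default: unbenchmarked functions stay on the CPU
}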
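One way the per-function decision could combine with the global switch is sketched below. This is an illustration rather than the package's actual code: `Backend`, `benefits_from_gpu`, and `resolve_backend` are hypothetical names, the allowlist is taken from the "GPU Winners" benchmark list, and the global option and device availability are assumed to arrive as plain booleans.

```cpp
#include <string>
#include <unordered_set>

enum class Backend { CPU, GPU };

// Hypothetical helper: allowlist of functions that benchmarked faster on the GPU.
// Anything not listed (count_*, coordToIndex, wtvRcpp, wtmRcpp, ...) stays on the CPU.
inline bool benefits_from_gpu(const std::string& func_name) {
  static const std::unordered_set<std::string> gpu_winners = {
      "lrm", "lrv", "corRcpp", "linRcpp", "rhoRcpp", "clrRcpp", "phiRcpp"};
  return gpu_winners.count(func_name) > 0;
}

// Hypothetical resolver: the GPU is chosen only if the user opted in,
// a device is available, and the function is on the allowlist.
inline Backend resolve_backend(const std::string& func_name,
                               bool global_use_gpu, bool gpu_available) {
  if (!global_use_gpu || !gpu_available) return Backend::CPU;
  return benefits_from_gpu(func_name) ? Backend::GPU : Backend::CPU;
}
```

With this shape the global option can only restrict GPU use, never force it: turning it off sends everything to the CPU, and turning it on still leaves the slow cases on the CPU.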
User Experience
options(propr.use_gpu = TRUE) # Enable GPU globally
# Package automatically decides:
# - lrm/lrv/corRcpp → GPU (fast!)
# - count_*/coordToIndex → CPU (avoid slowdown)
pr <- propr(counts, metric = "rho") # Optimal performance by default
Benefits
- Optimal performance by default
- No performance regressions
- Fewer unnecessary CPU↔GPU transfers
- Users don't need to think about which functions benefit from GPU
Next Steps
- Implement should_use_gpu_internal() dispatch logic
- Keep global options(propr.use_gpu) as master switch
- Test on different hardware configurations
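When results from several hardware configurations are available, the "no performance regressions" goal suggests a conservative rule: route a function to the GPU only if its worst observed speedup is still above 1x. The sketch below assumes speedup means CPU time divided by GPU time, as in the benchmark list; `speedup`, `gpu_safe_everywhere`, and the `min_speedup` threshold are hypothetical, not part of the package.

```cpp
#include <algorithm>
#include <vector>

// Speedup as reported in the benchmarks: CPU time divided by GPU time.
// Values below 1.0 mean the GPU is slower than the CPU.
inline double speedup(double cpu_seconds, double gpu_seconds) {
  return cpu_seconds / gpu_seconds;
}

// Hypothetical rule: a function qualifies for GPU dispatch only if the
// worst speedup across all tested configurations exceeds the threshold.
inline bool gpu_safe_everywhere(const std::vector<double>& speedups,
                                double min_speedup = 1.0) {
  if (speedups.empty()) return false;  // no measurements: stay on the CPU
  return *std::min_element(speedups.begin(), speedups.end()) > min_speedup;
}
```

Applied to the range endpoints reported above, lrm (500x to 2500x) would qualify, while wtvRcpp (0.6x to 2x) would not, because its worst case is a slowdown.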
Issue created by Luna (AI assistant for @suzannejin)