I'm a Master's student in Computer Science at EPFL, focused on machine learning systems and AI infrastructure.
I previously worked at Oracle Labs Zurich, where I contributed to the open-source projects Wayflow and Agent-Spec.
- Built optimized GEMM kernels on NVIDIA T4
- Hierarchical tiling + vectorized memory access
- Reached ~82% of cuBLAS performance
- Code
- Ranked #1 on LeetGPU, 5.3× faster than baseline
- Used shared/constant memory + register blocking
- Profiled with Nsight Compute
- Code
- Parallelized PDE solver with MPI + CUDA
- Achieved 20× (MPI) and 76× (GPU) speedups
- Built OCR system for chess scores (CNN-RNN, PyTorch)
- Reduced WER from 44% → 29%
- Code
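
For readers unfamiliar with the metric, here is a minimal sketch of how word error rate (WER) is commonly computed: the word-level edit distance between a reference and a prediction, divided by the reference length. The function name `word_error_rate`, the whitespace tokenization, and the chess-move strings are assumptions for illustration only; the project itself may tokenize and score differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()   # assumption: tokens are whitespace-separated
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")

    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misread move out of four
print(word_error_rate("e4 e5 Nf3 Nc6", "e4 e5 Nf3 Ne6"))
```

Under this definition, going from 44% to 29% is a relative reduction of (44 − 29) / 44, or roughly 34%.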