  • Peking University
  • Beijing, China

EmbroiderSnow/README.md

Hi there 👋

I'm Zhaoyuan Bi, an undergraduate student in Computer Science at Peking University (Class of 2027).


🔭 What I'm working on

I focus on Machine Learning Systems, especially GPU kernel optimization for LLM inference.

Recently, I have been working on:

  • CUDA kernel development in GGML / Llama.cpp
  • Optimization of quantization, RoPE, and vecdot kernels
  • Performance analysis using Nsight (memory access, CPI, bottlenecks)
  • Improving end-to-end inference throughput

⚙️ Interests

  • GPU Computing (CUDA)
  • LLM Inference Optimization
  • Parallel Algorithms & Memory Optimization
  • Systems for Machine Learning (MLSys)

🌱 Currently exploring

  • Parallel primitives (e.g., scan, reduction)
  • Performance-critical kernel design
  • Memory-bound optimization in GPU workloads

🛠 Languages & Tools

C C++ CUDA Python Go

Popular repositories

  1. MIT-6.828-JOS-DOC-Beautify Public

    HTML 1

  2. arap_deformation Public

    Lab for Frontiers of Geometric Computation (Spring 2025, PKU)

    C++

  3. pointconv_pytorch Public

    Reproduction of PointConv: Deep Convolutional Networks on 3D Point Clouds (CVPR 2019).

    Python

  4. Point2Mesh-via-SDF Public

    Lab for Frontiers of Geometric Computation (Spring 2025, PKU)

    Python

  5. RISC-V-Simulator Public

    A RISC-V simulator that implements RV-IM.

    C

  6. CacheSimulator Public

    A simple cache simulator with prefetch.

    Python