PSOgpuTuner

A GPU tuner that uses a distributed particle swarm optimization (PSO) algorithm to discover the best kernel launch parameters for applications involving GEMM and Jacobi smoothers.
It may be extended to arbitrary applications by supplying a kernel_wrapper() function that launches the GPU kernels. For efficient extension to other applications, also supply a one-time host-to-GPU memory copy function and a one-time GPU free function:

void mem_to_device(device_pointer_t * pointers, int problem_size)

void free_device(device_pointer_t * pointers)
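As a rough illustration of what these two hooks might look like for a GEMM-style problem, here is a minimal sketch. The layout of device_pointer_t (fields a, b, c), the placeholder input data, and the CUDA_CHECK macro are assumptions made for this example; the actual definitions in this repository may differ.

```cuda
#include <cuda_runtime.h>
#include <cstdio>
#include <cstdlib>
#include <vector>

// Hypothetical layout: the real device_pointer_t may hold different fields.
typedef struct {
    float *a;  // device copy of input matrix A (n x n)
    float *b;  // device copy of input matrix B (n x n)
    float *c;  // device buffer for the result C (n x n)
} device_pointer_t;

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s (%s:%d)\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Allocate device buffers once and copy the inputs to the GPU once,
// so that repeated kernel launches during tuning reuse the same memory.
void mem_to_device(device_pointer_t *pointers, int problem_size)
{
    size_t n = (size_t)problem_size;
    size_t bytes = n * n * sizeof(float);
    std::vector<float> host(n * n, 1.0f);  // placeholder input data

    CUDA_CHECK(cudaMalloc((void **)&pointers->a, bytes));
    CUDA_CHECK(cudaMalloc((void **)&pointers->b, bytes));
    CUDA_CHECK(cudaMalloc((void **)&pointers->c, bytes));

    CUDA_CHECK(cudaMemcpy(pointers->a, host.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(pointers->b, host.data(), bytes, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemset(pointers->c, 0, bytes));
}

// Release the device buffers once, after tuning has finished.
void free_device(device_pointer_t *pointers)
{
    CUDA_CHECK(cudaFree(pointers->a));
    CUDA_CHECK(cudaFree(pointers->b));
    CUDA_CHECK(cudaFree(pointers->c));
    pointers->a = pointers->b = pointers->c = NULL;
}
```

Keeping allocation and copying out of the timed kernel launch means each candidate launch configuration is measured on the kernel alone, not on repeated transfers.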

For simple extension to other applications, implement CUDA memory management inside kernel_wrapper() instead.
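A minimal sketch of this simpler approach is shown below, with allocation, launch, timing, and cleanup all inside the wrapper. The kernel_wrapper() signature used here (taking a block size and problem size and returning elapsed milliseconds) and the scale_kernel kernel are assumptions for illustration only; the signature the tuner actually expects may differ.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in kernel: multiplies each element by a constant.
__global__ void scale_kernel(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical signature: launches the kernel with the candidate
// blockDim.x and returns the measured kernel time in milliseconds.
float kernel_wrapper(int block_dim_x, int problem_size)
{
    // Memory is allocated and freed on every call: simple, but the
    // allocation cost is paid for each candidate configuration.
    float *d_data = NULL;
    size_t bytes = (size_t)problem_size * sizeof(float);
    cudaMalloc((void **)&d_data, bytes);
    cudaMemset(d_data, 0, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Round up so that every element is covered by a thread.
    int grid_dim_x = (problem_size + block_dim_x - 1) / block_dim_x;

    // Only the kernel launch sits between the two timing events.
    cudaEventRecord(start);
    scale_kernel<<<grid_dim_x, block_dim_x>>>(d_data, problem_size, 2.0f);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float elapsed_ms = 0.0f;
    cudaEventElapsedTime(&elapsed_ms, start, stop);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_data);
    return elapsed_ms;
}
```

This version is easier to write, at the cost of repeating the allocation and free on every evaluation.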

Command-line options:

-v: Verbose output.

-m: Multi-device. Enables execution on multiple GPUs, with threads divided among the GPUs.

-t: Threads per GPU.

-x: Target solution blockDim.x.

-s: Problem size. Dimension n of the n×n GEMM matrices and the number of unknowns n in the Jacobi system.

-i: Maximum number of iterations.
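To make the option set concrete, the following host-side sketch parses these flags with POSIX getopt. It is not the parser used in gpu_tuner.cu. The tuner_options_t struct and parse_options() function are hypothetical names, and treating -t, -x, -s, and -i as options that take an integer argument is an assumption based on their descriptions.

```cuda
#include <cstdlib>
#include <unistd.h>

// Hypothetical container for the parsed options.
struct tuner_options_t {
    bool verbose;         // -v
    bool multi_device;    // -m
    int threads_per_gpu;  // -t
    int target_block_x;   // -x
    int problem_size;     // -s
    int max_iterations;   // -i
};

// Returns false if an unknown option or a missing argument is seen.
bool parse_options(int argc, char **argv, tuner_options_t *opts)
{
    optind = 1;  // reset getopt state so the function can be called again
    int c;
    // "v" and "m" are plain flags; a trailing ':' marks options
    // that are assumed to take a value.
    while ((c = getopt(argc, argv, "vmt:x:s:i:")) != -1) {
        switch (c) {
        case 'v': opts->verbose = true; break;
        case 'm': opts->multi_device = true; break;
        case 't': opts->threads_per_gpu = atoi(optarg); break;
        case 'x': opts->target_block_x = atoi(optarg); break;
        case 's': opts->problem_size = atoi(optarg); break;
        case 'i': opts->max_iterations = atoi(optarg); break;
        default:  return false;
        }
    }
    return true;
}
```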
