GPUcacher
GPUcacher is a software layer and reference implementation that accelerates GPU workloads by caching data and intermediate results near the compute units. It targets the data-reuse patterns common in graphics, scientific computing, and machine learning pipelines, reducing memory-bandwidth pressure and latency when the same data are accessed repeatedly within or across kernels.
At runtime, GPUcacher partitions memory into caches that reside on the GPU, in unified memory, and, where available, in host memory, ordered from nearest to farthest from the compute units.
Architecturally, GPUcacher provides an API layer that intercepts memory requests from kernels and translates them into lookups in the appropriate cache tier, falling back to the underlying memory system on a miss.
The project offers bindings for common GPU programming interfaces such as CUDA and OpenCL, and aims to integrate with existing codebases with minimal changes.
Performance benefits are workload-dependent. When data exhibit temporal locality or are reused across kernels, substantial bandwidth and latency savings are possible; streaming workloads that touch each datum only once see little benefit.
GPUcacher originated as a research idea in computer architecture and has evolved through open-source collaboration and community contributions.