Commit 32e7dbe: Add simple AK example to README
1 parent a8e7d77

README.md: 38 additions & 0 deletions
## 1. What's Different?
As far as I am aware, this is the first cross-architecture parallel standard library *from a unified codebase*: the code is written as [KernelAbstractions.jl](https://github.com/JuliaGPU/KernelAbstractions.jl) backend-agnostic kernels, which are then **transpiled** to each GPU backend, so we benefit from all the optimisations available in the native platforms' official compiler stacks. For example, unlike open standards such as OpenCL, which require GPU vendors to implement the API for their hardware, we target the existing official compilers. And while performance-portability libraries like [Kokkos](https://github.com/kokkos/kokkos) and [RAJA](https://github.com/LLNL/RAJA) are powerful for large C++ codebases, they require US National Lab-level development and maintenance effort to forward calls from a single API to the separately developed OpenMP, CUDA Thrust, ROCm rocThrust, and oneAPI DPC++ libraries.
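To make the backend-agnostic idea concrete, here is a minimal sketch of what such a kernel looks like in KernelAbstractions.jl itself (a hypothetical `copy_kernel!`, not AK's internals): the same kernel definition compiles for the multithreaded CPU backend here, and for CUDA, ROCm, oneAPI, or Metal backends by passing a different array type and backend.

```julia
using KernelAbstractions

# A backend-agnostic copy kernel: each work-item copies one element.
@kernel function copy_kernel!(dst, @Const(src))
    I = @index(Global)
    dst[I] = src[I]
end

src = rand(Float32, 1024)
dst = similar(src)

# get_backend returns CPU() for plain Arrays; a CUDA.CuArray would
# return CUDABackend(), and the same kernel would run on the GPU.
backend = get_backend(src)
copy_kernel!(backend, 64)(dst, src; ndrange = length(src))
KernelAbstractions.synchronize(backend)
```

The `64` is the workgroup size; `ndrange` sets how many work-items are launched in total.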

As a simple example, this is how a normal Julia `for`-loop can be converted to an accelerated kernel - for both multithreaded CPUs and Nvidia / AMD / Intel / Apple GPUs, **with native performance** - by changing a single line:

<table>
<tr>
<td> CPU Code </td> <td> Multithreaded / GPU code </td>
</tr>
<tr>
<td>

```julia
# Copy kernel testing throughput
function cpu_copy!(dst, src)
    for i in eachindex(src)
        dst[i] = src[i]
    end
end
```

</td>
<td>

```julia
import AcceleratedKernels as AK

function ak_copy!(dst, src)
    AK.foreachindex(src) do i
        dst[i] = src[i]
    end
end
```

</td>
</tr>
</table>

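With the right-hand version, the hardware is selected by the array type you pass in. A hedged usage sketch, assuming (as the example above suggests) that `AK.foreachindex` accepts plain `Array`s for the multithreaded CPU path:

```julia
import AcceleratedKernels as AK

function ak_copy!(dst, src)
    AK.foreachindex(src) do i
        dst[i] = src[i]
    end
end

# On plain Arrays this runs multithreaded on the CPU; passing GPU
# arrays (e.g. CUDA.CuArray) would dispatch the same code to the GPU.
src = rand(Float32, 1_000_000)
dst = similar(src)
ak_copy!(dst, src)
@assert dst == src
```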
Again, this is only possible because of Julia's unique compilation model, the [JuliaGPU](https://juliagpu.org/) organisation's work on reusable GPU backend infrastructure, and especially the [KernelAbstractions.jl](https://github.com/JuliaGPU/KernelAbstractions.jl) backend-agnostic kernel language. Thank you.