9 changes: 9 additions & 0 deletions ipu/docs/performance.md
@@ -42,3 +42,12 @@ def update(i, opt_state, batch):
will result in an IPU jitted function where only `batch` is transferred from host to device at every call, while `opt_state` remains in IPU SRAM (after being transferred on the first call). The training loop does not require any additional modification.
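As a minimal sketch of the pattern (the update rule is a toy placeholder, and `backend="ipu"` is omitted so the snippet also runs on CPU, where donation may simply be ignored with a warning):

```python
from functools import partial

import jax
import jax.numpy as jnp

# donate_argnums=(1,) marks `opt_state` as donated: XLA may reuse its
# device buffer for the output instead of allocating a fresh one, so the
# state never round-trips through the host between calls.
# (Toy update rule for illustration; add backend="ipu" on IPU systems.)
@partial(jax.jit, donate_argnums=(1,))
def update(i, opt_state, batch):
    lr = 0.1
    grad = opt_state - jnp.mean(batch)  # stand-in for a real gradient
    return opt_state - lr * grad

opt_state = jnp.zeros(())
batch = jnp.ones((4,))
for i in range(3):
    # Only `batch` crosses host->device; `opt_state` stays on device.
    opt_state = update(i, opt_state, batch)
```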

Please refer to the [MNIST example](../examples/mnist_classifier.py) for a full example of buffer donation on the IPU.


## Write a custom op

One of the joys of IPU programming is that tile-level programming is often conceptually simpler than on GPU systems: each IPU tile behaves like a conventional processor and is programmed in C++. See an example at
[custom_primitive_test.py](../tests/ipu/primitive/custom_primitive_test.py).
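On the JAX side, a custom op is exposed as a new primitive, and the IPU-specific C++ vertex is attached as its implementation. A backend-agnostic sketch of the primitive skeleton alone (the op `scaled_add` and its `alpha` parameter are made up for illustration; the IPU hookup itself is what the linked test demonstrates):

```python
import jax.numpy as jnp

try:  # Primitive moved to jax.extend.core in newer JAX releases
    from jax.extend.core import Primitive
except ImportError:  # older JAX
    from jax.core import Primitive

# Hypothetical op for illustration: out = x + alpha * y.
scaled_add_p = Primitive("scaled_add")

def scaled_add(x, y, *, alpha=1.0):
    # bind() routes the call through JAX's primitive machinery.
    return scaled_add_p.bind(x, y, alpha=alpha)

@scaled_add_p.def_impl
def _scaled_add_impl(x, y, *, alpha):
    # Reference (eager) implementation; on IPU, this is where the
    # C++ vertex would be wired in instead.
    return x + alpha * y

# jit/grad support would additionally require def_abstract_eval and a
# backend lowering, as shown in the linked custom_primitive_test.py.
out = scaled_add(jnp.ones(3), jnp.ones(3), alpha=2.0)
```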

For further examples, see [demo_vertex.cpp](https://github.com/graphcore-research/tessellate-ipu/blob/main/examples/demo/demo_vertex.cpp) in the [TessellateIPU library](https://github.com/graphcore-research/tessellate-ipu).