@@ -23,7 +23,7 @@ is to set up a CMake project. For this, create a new folder for your project
 and in it create a file `CMakeLists.txt` with the following contents:
 
 ```cmake
-cmake_minimum_required(VERSION 3.5.1)
+cmake_minimum_required(VERSION 3.13)
 project(celerity_edge_detection)
 
 find_package(Celerity CONFIG REQUIRED)
@@ -152,7 +152,7 @@ out the kernel code. Replace the TODO with the following code:
 ```cpp
 int sum = r_input[{item[0] + 1, item[1]}] + r_input[{item[0] - 1, item[1]}]
     + r_input[{item[0], item[1] + 1}] + r_input[{item[0], item[1] - 1}];
-dw_edge[item] = 255 - std::max(0, sum - (4 * r_input[item]));
+w_edge[item] = 255 - std::max(0, sum - (4 * r_input[item]));
 ```
 
 This kernel computes a [discrete Laplace
@@ -171,29 +171,28 @@ before the kernel function with the following:
 
 ```cpp
 celerity::accessor r_input{input_buf, cgh, celerity::access::neighborhood{1, 1}, celerity::read_only};
-celerity::accessor dw_edge{edge_buf, cgh, celerity::access::one_to_one{}, celerity::write_only, celerity::no_init};
+celerity::accessor w_edge{edge_buf, cgh, celerity::access::one_to_one{}, celerity::write_only, celerity::no_init};
 ```
 
 If you have worked with SYCL before, these buffer accessors will look
-familiar to you. The template parameter is called the **access mode** and
-declares the type of access we inted to make on each buffer: We want to
-`read` from our `input_buf`, and want to write to our `edge_buf`. While
-there is a `write` access mode, we do not care at all about preserving any of
-the previous contents of `edge_buf`, which is why we choose to discard them
-and use the `discard_write` access mode.
+familiar to you. Accessors tie kernels to the data they operate on by declaring
+the type of access that we want to perform: We want to _read_ from our
+`input_buf`, and want to _write_ to our `edge_buf`. Additionally, we do not care
+at all about preserving any of the previous contents of `edge_buf`, which is why
+we choose to discard them by also passing the `celerity::no_init` property.
 
 So far everything works exactly as it would in a SYCL application. However,
-there is an additional parameter passed into the `accessor`
-constructor that is not present in its SYCL counterpart. In fact, this parameter
-represents one of Celerity's most important API additions: While access modes
-tell the runtime system how a kernel intends to access a buffer, it does not
-include any information about _where_ a kernel will access said buffer. In
-order for Celerity to be able to split a single kernel execution across
+there is an additional parameter passed into the `accessor` constructor that is
+not present in its SYCL counterpart. In fact, this parameter represents one of
+Celerity's most important API additions: While access modes (such as `read` and
+`write`) tell the runtime system how a kernel intends to access a buffer, they
+do not convey any information about _where_ a kernel will access said buffer.
+In order for Celerity to be able to split a single kernel execution across
 potentially many different worker nodes, it needs to know how each of those
 **kernel chunks** will interact with the input and output buffers of a kernel
 -- i.e., which node requires which parts of the input, and produces which
-parts of the output. This is where Celerity's so-called **range mappers**
-come into play.
+parts of the output. This is where Celerity's so-called **range mappers** come
+into play.
 
 Let us first discuss the range mapper for `edge_buf`, as it represents the
 simpler of the two cases. Looking at the kernel function, you can see that
@@ -221,8 +220,11 @@ surrounding the current work item.
 
 Lastly, there are two more things of note for the call to `parallel_for`: The
 first is the **kernel name**. Just like in SYCL, each kernel function in
-Celerity has to have a unique name in the form of a template type parameter.
+Celerity may have a unique name in the form of a template type parameter.
 Here we chose `MyEdgeDetectionKernel`, but this can be anything you like.
+
+> Kernel names used to be mandatory in SYCL 1.2.1 but have since become optional.
+
 Finally, the first two parameters to the `parallel_for` function tell
 Celerity how many individual GPU threads (or work items) we want to execute.
 In our case we want to execute one thread for each pixel of our image, except
@@ -249,7 +251,6 @@ Just like the _compute tasks_ we created above by calling
 handler by calling `celerity::handler::host_task`. Add the following code at the end of
 your `main()` function:
 
-
 ```cpp
 queue.submit([=](celerity::handler& cgh) {
     celerity::accessor out{edge_buf, cgh, celerity::access::all{}, celerity::read_only_host_task};