Commit 2217b68

author
Kent Knox
committed
Updates to the main README.md file to incorporate google group links, and
updates to the build dependencies section.
1 parent 02f07f4 commit 2217b68

1 file changed

+172
-103
lines changed

README.md

Lines changed: 172 additions & 103 deletions
@@ -2,39 +2,86 @@ clFFT
22
=====
33
[![Build Status](https://travis-ci.org/clMathLibraries/clFFT.png)](https://travis-ci.org/clMathLibraries/clFFT)
44

5-
clMath is a software library containing FFT and BLAS functions written in OpenCL. In addition to GPU devices, the libraries also support running on CPU devices to facilitate debugging and multicore programming.
5+
clMath is a software library containing FFT and BLAS functions written
6+
in OpenCL. In addition to GPU devices, the libraries also support
7+
running on CPU devices to facilitate debugging and multicore
8+
programming.
69

7-
clMath 2.1 is the latest version and is available as source only. clMath's predecessor <a href="http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-math-libraries/">APPML 1.10</a> has pre-built binaries available for download on both Linux and Windows platforms.
10+
clMath 2.1 is the latest version and is available as source only.
11+
clMath's predecessor APPML 1.10 has pre-built binaries available for
12+
download on both Linux and Windows platforms.
813

914
## Introduction to clFFT
1015

11-
The FFT is an implementation of the Discrete Fourier Transform (DFT) that makes use of symmetries in the FFT definition to reduce the mathematical intensity required from O(N<sup>2</sup>) to O(N log<sub>2</sub>( N )) when the sequence length N is the product of small prime factors. Currently, there is no standard API for FFT routines. Hardware vendors usually provide a set of high-performance FFTs optimized for their systems: no two vendors employ the same interfaces for their FFT routines. clFFT provides a set of FFT routines that are optimized for AMD graphics processors, but also are functional across CPU and other compute devices.
16+
The FFT is an implementation of the Discrete Fourier Transform (DFT)
17+
that makes use of symmetries in the FFT definition to reduce the
18+
mathematical intensity required from O(N^2) to O(N log2(N)) when the
19+
sequence length N is the product of small prime factors. Currently,
20+
there is no standard API for FFT routines. Hardware vendors usually
21+
provide a set of high-performance FFTs optimized for their systems: no
22+
two vendors employ the same interfaces for their FFT routines. clFFT
23+
provides a set of FFT routines that are optimized for AMD graphics
24+
processors, but also are functional across CPU and other compute
25+
devices.
1226

13-
The clFFT library is an open source OpenCL library implementation of discrete Fast Fourier Transforms. It:
27+
The clFFT library is an open source OpenCL library implementation of
28+
discrete Fast Fourier Transforms. It:
1429

15-
* Provides a fast and accurate platform for calculating discrete FFTs.
16-
* Works on CPU or GPU backends.
17-
* Supports in-place or out-of-place transforms.
18-
* Supports 1D, 2D, and 3D transforms with a batch size that can be greater than 1.
19-
* Supports planar (real and complex components in separate arrays) and interleaved (real and complex components as a pair contiguous in memory) formats.
20-
* Supports dimension lengths that can be any mix of powers of 2, 3, and 5.
21-
* Supports single and double precision floating point formats.
30+
- Provides a fast and accurate platform for calculating discrete FFTs.
31+
32+
- Works on CPU or GPU backends.
33+
34+
- Supports in-place or out-of-place transforms.
35+
36+
- Supports 1D, 2D, and 3D transforms with a batch size that can be
37+
greater than 1.
38+
39+
- Supports planar (real and complex components in separate arrays) and
40+
interleaved (real and complex components as a pair contiguous in
41+
memory) formats.
42+
43+
- Supports dimension lengths that can be any mix of powers of 2, 3,
44+
and 5.
45+
46+
- Supports single and double precision floating point formats.
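The length restriction in the list above (any mix of powers of 2, 3, and 5) can be checked on the host before a plan is created. The following is a minimal sketch under that stated restriction; `is_supported_length` is a hypothetical helper name and is not part of the clFFT API.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the clFFT API: returns 1 when n is a
 * product of powers of 2, 3, and 5 only, and 0 otherwise. */
int is_supported_length( size_t n )
{
    static const size_t radices[3] = { 2, 3, 5 };
    size_t i;

    if( n == 0 )
        return 0;

    /* Divide out each allowed prime factor as many times as possible. */
    for( i = 0; i < 3; ++i )
    {
        while( n % radices[i] == 0 )
            n /= radices[i];
    }

    /* Anything left over means another prime factor was present. */
    return n == 1;
}
```

A caller might run this on each dimension length and pad or reject the input when it returns 0, rather than relying on the library to report the problem.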
2247

2348
## clFFT library user documentation
24-
[Library and API documentation]( http://clmathlibraries.github.io/clFFT/ ) for developers is available online as a GitHub Pages website
49+
50+
[Library and API documentation][] for developers is available online as
51+
a GitHub Pages website
52+
53+
### Google Groups
54+
55+
Two mailing lists have been created for the clMath projects:
56+
57+
- [clmath@googlegroups.com][] - group whose focus is to answer
58+
questions on using the library or reporting issues
59+
60+
- [clmath-developers@googlegroups.com][] - group whose focus is for
61+
developers interested in contributing to the library code itself
2562

2663
## clFFT Wiki
27-
The [project wiki](https://github.com/clMathLibraries/clFFT/wiki) contains helpful documentation, including a [build primer](https://github.com/clMathLibraries/clFFT/wiki/Build)
64+
65+
The [project wiki](https://github.com/clMathLibraries/clFFT/wiki) contains helpful
66+
documentation, including a [build
67+
primer](https://github.com/clMathLibraries/clFFT/wiki/Build).
2868

2969
## Contributing code
30-
Please refer to and read the [Contributing](CONTRIBUTING.md) document for guidelines on how to contribute code to this open source project
70+
71+
Please refer to and read the [Contributing][] document for guidelines on
72+
how to contribute code to this open source project. The code in the
73+
/master branch is considered to be stable, and all pull-requests should
74+
be made against the /develop branch.
3175

3276
## License
33-
The source for clFFT is licensed under the [Apache License, Version 2.0]( http://www.apache.org/licenses/LICENSE-2.0 )
77+
78+
The source for clFFT is licensed under the [Apache License, Version
79+
2.0][]
3480

3581
## Example
36-
The simple example below shows how to use clFFT to compute an simple 1D forward transform
3782

83+
The example below shows how to use clFFT to compute a simple 1D
84+
forward transform
3885
```c
3986
#include <stdlib.h>
4087

@@ -43,101 +90,123 @@ The simple example below shows how to use clFFT to compute an simple 1D forward
4390

4491
int main( void )
4592
{
46-
cl_int err;
47-
cl_platform_id platform = 0;
48-
cl_device_id device = 0;
49-
cl_context_properties props[3] = { CL_CONTEXT_PLATFORM, 0, 0 };
50-
cl_context ctx = 0;
51-
cl_command_queue queue = 0;
52-
cl_mem bufX;
53-
float *X;
54-
cl_event event = NULL;
55-
int ret = 0;
56-
size_t N = 16;
57-
58-
/* FFT library realted declarations */
59-
clfftPlanHandle planHandle;
60-
clfftDim dim = CLFFT_1D;
61-
size_t clLengths[1] = {N};
62-
63-
/* Setup OpenCL environment. */
64-
err = clGetPlatformIDs( 1, &platform, NULL );
65-
err = clGetDeviceIDs( platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL );
66-
67-
props[1] = (cl_context_properties)platform;
68-
ctx = clCreateContext( props, 1, &device, NULL, NULL, &err );
69-
queue = clCreateCommandQueue( ctx, device, 0, &err );
70-
71-
/* Setup clFFT. */
72-
clfftSetupData fftSetup;
73-
err = clfftInitSetupData(&fftSetup);
74-
err = clfftSetup(&fftSetup);
75-
76-
/* Allocate host & initialize data. */
77-
/* Only allocation shown for simplicity. */
78-
X = (float *)malloc(N * 2 * sizeof(*X));
79-
80-
/* Prepare OpenCL memory objects and place data inside them. */
81-
bufX = clCreateBuffer( ctx, CL_MEM_READ_WRITE, N * 2 * sizeof(*X), NULL, &err );
82-
83-
err = clEnqueueWriteBuffer( queue, bufX, CL_TRUE, 0,
84-
N * 2 * sizeof( *X ), X, 0, NULL, NULL );
85-
86-
/* Create a default plan for a complex FFT. */
87-
err = clfftCreateDefaultPlan(&planHandle, ctx, dim, clLengths);
88-
89-
/* Set plan parameters. */
90-
err = clfftSetPlanPrecision(planHandle, CLFFT_SINGLE);
91-
err = clfftSetLayout(planHandle, CLFFT_COMPLEX_INTERLEAVED, CLFFT_COMPLEX_INTERLEAVED);
92-
err = clfftSetResultLocation(planHandle, CLFFT_INPLACE);
93-
94-
/* Bake the plan. */
95-
err = clfftBakePlan(planHandle, 1, &queue, NULL, NULL);
96-
97-
/* Execute the plan. */
98-
err = clfftEnqueueTransform(planHandle, CLFFT_FORWARD, 1, &queue, 0, NULL, NULL, &bufX, NULL, NULL);
99-
100-
/* Wait for calculations to be finished. */
101-
err = clFinish(queue);
102-
103-
/* Fetch results of calculations. */
104-
err = clEnqueueReadBuffer( queue, bufX, CL_TRUE, 0, N * 2 * sizeof( *X ), X, 0, NULL, NULL );
105-
106-
/* Release OpenCL memory objects. */
107-
clReleaseMemObject( bufX );
108-
109-
free(X);
110-
111-
/* Release the plan. */
112-
err = clfftDestroyPlan( &planHandle );
113-
114-
/* Release clFFT library. */
115-
clfftTeardown( );
116-
117-
/* Release OpenCL working objects. */
118-
clReleaseCommandQueue( queue );
119-
clReleaseContext( ctx );
120-
121-
return ret;
93+
cl_int err;
94+
cl_platform_id platform = 0;
95+
cl_device_id device = 0;
96+
cl_context_properties props[3] = { CL_CONTEXT_PLATFORM, 0, 0 };
97+
cl_context ctx = 0;
98+
cl_command_queue queue = 0;
99+
cl_mem bufX;
100+
float *X;
101+
cl_event event = NULL;
102+
int ret = 0;
103+
size_t N = 16;
104+
105+
/* FFT library related declarations */
106+
clfftPlanHandle planHandle;
107+
clfftDim dim = CLFFT_1D;
108+
size_t clLengths[1] = {N};
109+
110+
/* Setup OpenCL environment. */
111+
err = clGetPlatformIDs( 1, &platform, NULL );
112+
err = clGetDeviceIDs( platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL );
113+
114+
props[1] = (cl_context_properties)platform;
115+
ctx = clCreateContext( props, 1, &device, NULL, NULL, &err );
116+
queue = clCreateCommandQueue( ctx, device, 0, &err );
117+
118+
/* Setup clFFT. */
119+
clfftSetupData fftSetup;
120+
err = clfftInitSetupData(&fftSetup);
121+
err = clfftSetup(&fftSetup);
122+
123+
/* Allocate host & initialize data. */
124+
/* Only allocation shown for simplicity. */
125+
X = (float *)malloc(N * 2 * sizeof(*X));
126+
127+
/* Prepare OpenCL memory objects and place data inside them. */
128+
bufX = clCreateBuffer( ctx, CL_MEM_READ_WRITE, N * 2 * sizeof(*X), NULL, &err );
129+
130+
err = clEnqueueWriteBuffer( queue, bufX, CL_TRUE, 0,
131+
N * 2 * sizeof( *X ), X, 0, NULL, NULL );
132+
133+
/* Create a default plan for a complex FFT. */
134+
err = clfftCreateDefaultPlan(&planHandle, ctx, dim, clLengths);
135+
136+
/* Set plan parameters. */
137+
err = clfftSetPlanPrecision(planHandle, CLFFT_SINGLE);
138+
err = clfftSetLayout(planHandle, CLFFT_COMPLEX_INTERLEAVED, CLFFT_COMPLEX_INTERLEAVED);
139+
err = clfftSetResultLocation(planHandle, CLFFT_INPLACE);
140+
141+
/* Bake the plan. */
142+
err = clfftBakePlan(planHandle, 1, &queue, NULL, NULL);
143+
144+
/* Execute the plan. */
145+
err = clfftEnqueueTransform(planHandle, CLFFT_FORWARD, 1, &queue, 0, NULL, NULL, &bufX, NULL, NULL);
146+
147+
/* Wait for calculations to be finished. */
148+
err = clFinish(queue);
149+
150+
/* Fetch results of calculations. */
151+
err = clEnqueueReadBuffer( queue, bufX, CL_TRUE, 0, N * 2 * sizeof( *X ), X, 0, NULL, NULL );
152+
153+
/* Release OpenCL memory objects. */
154+
clReleaseMemObject( bufX );
155+
156+
free(X);
157+
158+
/* Release the plan. */
159+
err = clfftDestroyPlan( &planHandle );
160+
161+
/* Release clFFT library. */
162+
clfftTeardown( );
163+
164+
/* Release OpenCL working objects. */
165+
clReleaseCommandQueue( queue );
166+
clReleaseContext( ctx );
167+
168+
return ret;
122169
}
123170
```
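The example above allocates `X` as `N * 2` floats for the interleaved complex layout but leaves the buffer uninitialized. As a rough sketch of how host data held in separate real and imaginary arrays (the planar layout) could be packed into that interleaved form before the write to `bufX`, consider the helper below; `pack_interleaved` is a hypothetical name and is not part of clFFT.

```c
#include <stddef.h>

/* Hypothetical helper, not part of clFFT: pack separate real and imaginary
 * arrays (planar layout) into one interleaved array of 2 * n floats. */
void pack_interleaved( const float *re, const float *im, float *out, size_t n )
{
    size_t i;

    for( i = 0; i < n; ++i )
    {
        out[2 * i]     = re[i];  /* real part of element i */
        out[2 * i + 1] = im[i];  /* imaginary part of element i */
    }
}
```

In the example, a call such as `pack_interleaved( re, im, X, N )` would sit between the `malloc` and the `clEnqueueWriteBuffer` call, assuming `re` and `im` each hold `N` floats.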
124171
125172
## Build dependencies
173+
126174
### Library for Windows
127-
* Windows® 7/8
128-
* Visual Studio 2010 SP1, 2012
129-
* Latest CMake
130-
* An OpenCL SDK, such as APP SDK 2.8
175+
176+
- Windows® 7/8
177+
178+
- Visual Studio 2010 SP1, 2012
179+
180+
- Latest CMake
181+
182+
- An OpenCL SDK, such as APP SDK 2.9
131183
132184
### Library for Linux
133-
* GCC 4.6 and onwards
134-
* Latest CMake
135-
* An OpenCL SDK, such as APP SDK 2.8
185+
186+
- GCC 4.6 and onwards
187+
188+
- Latest CMake
189+
190+
- An OpenCL SDK, such as APP SDK 2.9
191+
192+
### Library for Mac OSX
193+
194+
- Recommended to generate Unix makefiles with cmake
136195
137196
### Test infrastructure
138-
* Latest Googletest
139-
* Latest FFTW
140-
* Latest Boost
197+
198+
- Googletest v1.6
199+
200+
- Latest FFTW
201+
202+
- Latest Boost
141203
142204
### Performance infrastructure
143-
* Python
205+
206+
- Python
207+
208+
[Library and API documentation]: http://clmathlibraries.github.io/clFFT/
209+
[clmath@googlegroups.com]: https://github.com/clMathLibraries/clFFT/wiki
210+
[clmath-developers@googlegroups.com]: https://github.com/clMathLibraries/clFFT/wiki/Build
211+
[Contributing]: CONTRIBUTING.md
212+
[Apache License, Version 2.0]: http://www.apache.org/licenses/LICENSE-2.0
