This is a sparse matrix-vector multiplication (SpMV) library that generates SpMV code specialized for a given matrix. Code is generated and executed dynamically for the X86_64 architecture. For technical details see:
- Buse Yilmaz, Baris Aktemur, Maria Garzaran, Sam Kamin, Furkan Kirac.
  Autotuning Runtime Specialization for Sparse Matrix-Vector Multiplication.
  ACM Transactions on Architecture and Code Optimization (TACO). Volume 13, Issue 1, Article 5.
- Sam Kamin, Maria Garzaran, Baris Aktemur, Danqing Xu, Buse Yilmaz, Zhongbo Chen.
  Optimization by Runtime Specialization for Sparse Matrix-Vector Multiplication.
  GPCE 2014: The 13th International Conference on Generative Programming: Concepts & Experiences, Västerås, Sweden.
- main.cpp
- matrix.*: Matrix class that keeps matrix information in CSR format.
- method.*: Specialization methods.
- profiler.*: Time measurement support.
- svmAnalyzer.*: Feature extraction from the matrices. Features are used for autotuning (done separately, not integrated here).
- plaincsr.*: SpMV implementation using the CSR format.
- mkl.*: SpMV using Intel MKL (calls mkl_dcsrmv after setting the parameters).
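For readers unfamiliar with CSR, here is a minimal sketch of the format and of a plain CSR SpMV loop. The struct CsrMatrix, its field names, and the function spmvCsr are hypothetical and chosen for illustration; they are not taken from matrix.* or plaincsr.*.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (illustrative names, not the library's Matrix class).
struct CsrMatrix {
    std::size_t nrows;
    std::size_t ncols;
    std::vector<int> rowPtr;   // size nrows + 1; row i spans [rowPtr[i], rowPtr[i + 1])
    std::vector<int> colIdx;   // column index of each stored nonzero
    std::vector<double> vals;  // value of each stored nonzero
};

// Computes y = A * x with the textbook CSR loop.
std::vector<double> spmvCsr(const CsrMatrix& A, const std::vector<double>& x) {
    assert(x.size() == A.ncols);
    assert(A.rowPtr.size() == A.nrows + 1);
    std::vector<double> y(A.nrows, 0.0);
    for (std::size_t i = 0; i < A.nrows; ++i) {
        double sum = 0.0;
        // Dot product of row i with x, visiting only the stored nonzeros.
        for (int k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            sum += A.vals[k] * x[A.colIdx[k]];
        }
        y[i] = sum;
    }
    return y;
}
```

The inner loop bounds and the column indices are read from memory on every iteration. Generating code for one specific matrix can avoid much of that, because these values are already known when the code is generated.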
- CSRbyNZ
- RowPattern
- GenOSKI <r> <c> (Similar to PBR. GenOSKI44 is for 4x4 block size, GenOSKI55 is for 5x5.)
- Unfolding
- CSRWithGOTO
- UnrollingWithGOTO
See the papers for details. For each method, there exists a corresponding .cpp file.
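As a rough illustration of what specializing on matrix structure can mean, the sketch below groups rows by their nonzero count, which is one plausible reading of the idea behind the CSRbyNZ name. It is not the library's implementation: the real methods emit machine code at runtime, while this sketch only emulates the grouping in plain C++. The names RowGroups, groupRowsByNz and spmvByNz are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Key: number of nonzeros in a row. Value: indices of the rows with that count.
using RowGroups = std::map<int, std::vector<std::size_t>>;

// Groups the rows of a CSR matrix by how many nonzeros each row stores.
RowGroups groupRowsByNz(const std::vector<int>& rowPtr) {
    assert(!rowPtr.empty());
    RowGroups groups;
    for (std::size_t i = 0; i + 1 < rowPtr.size(); ++i) {
        groups[rowPtr[i + 1] - rowPtr[i]].push_back(i);
    }
    return groups;
}

// Computes y = A * x one row group at a time.
std::vector<double> spmvByNz(const std::vector<int>& rowPtr,
                             const std::vector<int>& colIdx,
                             const std::vector<double>& vals,
                             const std::vector<double>& x) {
    assert(!rowPtr.empty());
    std::vector<double> y(rowPtr.size() - 1, 0.0);
    for (const auto& group : groupRowsByNz(rowPtr)) {
        const int nz = group.first;  // fixed row length for this whole group
        for (std::size_t i : group.second) {
            const int start = rowPtr[i];
            double sum = 0.0;
            // Assumption: a code generator would know nz as a constant here
            // and could emit this loop body fully unrolled.
            for (int k = 0; k < nz; ++k) {
                sum += vals[start + k] * x[colIdx[start + k]];
            }
            y[i] = sum;
        }
    }
    return y;
}
```

The result is the same as a plain CSR loop; only the iteration order and the fixed inner trip count per group differ.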
Thundercat uses asmjit for
assembling executable code at runtime.
asmjit should be placed under the main thundercat folder.
Below are the steps.
In these steps, we use cmake to generate build files for Ninja.
You may generate build files for other targets as well.
In particular, if you don't have Ninja installed, replace -G Ninja
below with -G "Unix Makefiles", and replace ninja with make.
- Clone this git repository.
    ~ $ git clone https://github.com/ozusrl/thundercat.git
- Clone asmjit under thundercat.
    ~ $ cd thundercat
    ~/thundercat $ git clone https://github.com/asmjit/asmjit.git
- Build asmjit.
    ~/thundercat $ cd asmjit
    ~/thundercat/asmjit $ mkdir build
    ~/thundercat/asmjit $ cd build
    ~/thundercat/asmjit/build $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ..
    ~/thundercat/asmjit/build $ ninja
    ~/thundercat/asmjit/build $ cd ../..
- Build thundercat.
    ~/thundercat $ mkdir build
    ~/thundercat $ cd build
    ~/thundercat/build $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ../src
    ~/thundercat/build $ ninja
This will produce a main executable file, named thundercat.
The library runs on Mac OS X and Linux.
Set the CMAKE_BUILD_TYPE variable to Debug
for a build that you will use for debugging.
All benchmarking, however, should be done using a build
configured as Release.
To force a particular compiler, e.g. icc, do the following:
~/thundercat/build $ cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=`which icc` -DCMAKE_CXX_COMPILER=`which icc` ../src
./thundercat <matrixName> <methodName> [optional flags]
<matrixName> is the path to the .mtx file
(that is, the matrix file as downloaded from the Matrix Market or the U. of Florida collection).
The name should be provided without the .mtx extension.
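For reference, a .mtx file in Matrix Market coordinate format has a banner line, optional comment lines starting with %, a size line (rows, columns, nonzeros), and then one 1-based "row column value" entry per line. The sketch below reads such a file into CSR arrays. It is an illustration only, handles just real-valued matrices with general (non-symmetric) storage, and is not the library's reader; the names Csr and readMatrixMarketGeneral are hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical CSR result type for this sketch.
struct Csr {
    int nrows = 0;
    int ncols = 0;
    std::vector<int> rowPtr;
    std::vector<int> colIdx;
    std::vector<double> vals;
};

// Reads a "coordinate real general" Matrix Market stream into CSR.
Csr readMatrixMarketGeneral(std::istream& in) {
    std::string line;
    // Skip the banner and comment lines; stop at the size line.
    while (std::getline(in, line) && (line.empty() || line[0] == '%')) {
    }
    Csr m;
    int nnz = 0;
    std::istringstream sizeLine(line);
    sizeLine >> m.nrows >> m.ncols >> nnz;
    assert(!sizeLine.fail() && m.nrows >= 0 && m.ncols >= 0 && nnz >= 0);

    // First pass: read the entries and count nonzeros per row.
    std::vector<int> r(nnz), c(nnz);
    std::vector<double> v(nnz);
    m.rowPtr.assign(m.nrows + 1, 0);
    for (int k = 0; k < nnz; ++k) {
        in >> r[k] >> c[k] >> v[k];
        assert(!in.fail());
        assert(r[k] >= 1 && r[k] <= m.nrows && c[k] >= 1 && c[k] <= m.ncols);
        --r[k];  // Matrix Market indices are 1-based
        --c[k];
        ++m.rowPtr[r[k] + 1];
    }
    // Prefix sum turns per-row counts into row start offsets.
    for (int i = 0; i < m.nrows; ++i) {
        m.rowPtr[i + 1] += m.rowPtr[i];
    }
    // Second pass: scatter each entry into its row, keeping file order within a row.
    m.colIdx.resize(nnz);
    m.vals.resize(nnz);
    std::vector<int> next(m.rowPtr.begin(), m.rowPtr.end() - 1);
    for (int k = 0; k < nnz; ++k) {
        const int pos = next[r[k]]++;
        m.colIdx[pos] = c[k];
        m.vals[pos] = v[k];
    }
    return m;
}
```

To use it with a file, open a std::ifstream on the .mtx path and pass it to readMatrixMarketGeneral. Symmetric or pattern-only files would need extra handling that this sketch omits.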
The following are recognized as <methodName>:
- Specialization methods: CSRbyNZ, RowPattern, Unfolding, GenOSKI33, GenOSKI44, GenOSKI55, UnrollingWithGOTO, CSRWithGOTO
- Non-generative methods: MKL and PlainCSR
- -num_threads <num_threads>: Number of threads to be used. By default, a single thread is used.
- -debug: Prints the output vector.
- -dump_matrix: Prints the matrix's rows, cols and vals arrays (in the format required by the chosen method).
- -dump_object: Dumps the generated object code to the current folder, into files named generated_X where X is a number from 0 up to the thread count.
- -matrix_stats: Prints the svmAnalyzer's results for the current matrix.
Run for the rajat22 matrix (assuming rajat22.mtx exists in the current directory), using the RowPattern method with 6 threads:
./thundercat rajat22 RowPattern -num_threads 6
There is a small set of matrices at
http://srl.ozyegin.edu.tr/matrices_debug.tar.gz
Matrices used in the GPCE 2014 paper are available at