librsb
is a library for sparse matrix computations featuring the Recursive Sparse Blocks
(RSB)
matrix format. This format allows cache efficient and multithreaded
(that is, shared memory parallel) operations on large sparse matrices.
The most common operations necessary to iterative solvers are available, e.g.:
matrixvector multiplication, triangular solution, rows/columns scaling,
diagonal extraction / setting, blocks extraction, norm computation, formats
conversion. The RSB format is especially well suited for symmetric and
transposed multiplication variants.
Most numerical kernels code is auto generated, and the supported numerical
types can be chosen by the user at build time.
librsb
implements the Sparse BLAS standard, as specified in the [BLAS Forum] documents.
You might be interested in:
The RSB format is hierarchical.
Its basic idea is a recursive subdivision of a matrix in the two dimensions.
The subdivision algorithm attempts to partition the matrix until the individual submatrices occupy approximately the amount of memory contained in the
CPU
caches.
The number of cores/thread available also plays a role.
The following picture shows the symmetric matrix
audikw_1 (943695 rows, 39297771 nonzeroes,
obtained from the University of Florida Sparse Matrix Collection
) partitioned for different numbers of executing threads (respectively 1,2,4 and 8).
Because of the symmetry, the upper triangle representation has been omitted.
The memory layout of the individual sparse submatrices follows a Z (or Morton) ordering.
Each submatrix is stored either in coordinate or CSR (Compressed Sparse Rows) format, and the numerical index type size is chosen in order to minimize memory consumption.
This layout enables efficient and multithreaded algorithms implementing the Sparse BLAS operations.
The algorithms and the performance of RSB in practice are described in a series of papers; give a look at the
papers
or the performance
sections.
See either autotuned results
for version 1.2 or
older 1.1 or 1.0 on the same machine.
Version 1.2 is a major one: it brings improvements in multivector processing and autotuning.
Autotuning results for librsb1.2
The following figure shows results from a poster presented at the workshop "Sparse Solvers for Exascale: From Building Blocks to Applications" in Greifswald, Germany, on March 2325, 2015.
There are: average speedup values from tuning and over Intel MKL are shown for symmetric, general transposed, general untransposed, as well as the autotuning cost.
Without the autotuning, the relative improvement with respect to MKL shall be rescaled down by the autotuning improvement (left column).
The full poster is available here:http://home.rzg.mpg.de/~mima/exascale15postera4.pdf.
Results obtained on the Intel Sandy Bridge (2 x Intel Xeon E52680); autotuned versus untuned results, and versus the Intel Math Kernel Library
(MKL)'s 11.05 Compressed Sparse Rows (CSR) implementation, using the best picked threads and structure with use of the new autotuning feature.
These results are computed on large matrices publicly available on the University of Florida Sparse Matrix Collection.
The points indicate speedup (old time to new time ratio) for the different matrixvector multiplication variants for different matrices and numerical types.
These results suggest that librsb
is competitive with the Intel MKL library implementation (mkl_dcsrmv, mkl_scsrmv, mkl_zcsrmv, mkl_ccsrmv
) of the CSR, especially on the symmetric and transposed matrixvector multiplication variants.
The (very cheap  it costs the time of a few multiplications) best threads autotuning strategy has been applied to MKL and RSB; here, ratio of RSB speed to MKL's.
Autotuning Results for librsb1.1, unsymmetric, all BLAS types
Untransposed (N) and transposed (T) multiplication results for both threads autotuning and structure autotuning.
Please note that the experiment with matrix "arabic" with type Z did not fit in the memory so the corresponding result is missing.
Autotuning Results for librsb1.1, symmetric, all BLAS types
Speedup to base RSB after either threads and structure autotuning. Structure autotuning by definition yields better results, hence no legend is necessary in the following.
Results obtained on the Intel Sandy Bridge (2 x Intel Xeon E52680), versus Intel Math Kernel Library
(MKL)'s 10.37 Compressed Sparse Rows (CSR) implementation, using 16 threads.
These results are computed on large matrices publicly available on the University of Florida Sparse Matrix Collection.
The bars in the graphs are proportional to the performance of the different matrixvector multiplication variants for different matrices.
The "Flops/sec" measure unit we use is defined as the ratio of the nonzeroes count of each matrix by a single operation time.
Please note how different matrices exhibit different performance because of their nonzeroes pattern (structure) impacting on the multiplication algorithm performance.
These results suggest that librsb
is competitive with the Intel MKL library implementation (mkl_dcsrmv
) of the CSR, especially on the symmetric and transposed matrixvector multiplication variants.
Symmetric Sparse MatrixVector Multiply
Unsymmetric Transposed Sparse MatrixVector Multiply
Unsymmetric Untransposed Sparse MatrixVector Multiply

The current release of the code (version 1.2.0) is available on: Sourceforge.
Reference Documentation and README
The complete documentation in HTML format for the latest released version is available here.
The same documentation is included in the
sources archive
and provides several examples for starting using librsb
.
Building the librsb
library, C headers, documentation
There are many options to build a customized library (see documentation).
However, building with reasonable defaults can be as simple as:
# unpack the sources archive
./configure help # there are many options available
./configure enablefortranmoduleinstall enablematrixtypes="double, double complex" \
CC=gcc FC=gfortran CFLAGS=O3 FCFLAGS=O3 prefix=$HOME/local
# If using Intel's compiler 'icc', use 'LD=xild'; e.g.: configure LD=xild CC=icc FC=ifort ...
# With GCC, you may get faster code with "march=native mfpmath=sse msse2"...
#
# Don't ignore configure's output diagnostics.
#
make cleanall # cleans up generated code
make # builds the library and examples
make qtests # quick consistency test
make tests # long consistency test
make install # install to $HOME/local
Please note that if you intend to link librsb
to shared libraries (e.g.: in case of sparsersb
's sparsersb.oct
) then in the above you should specify flags to build position indipendent code, e.g.:
CFLAGS="O3 fPIC" FCFLAGS="O3 fPIC"
.
If you want to have a debug build with error verbosity activated, you may configure with something like:
./configure enableallocatorwrapper enableinterfaceerrorverbosity=2 \
enableinternalserrorverbosity=1 CFLAGS="O0 ggdb" FCFLAGS="O0 ggdb" #...
Building the sparsersb
plugin for GNU Octave
Given a working GNU Octave installation (complete with development programs), one may proceed building the
sparsersb
plugin:
# be sure you compiled librsb with the position independent code flag (fPIC)
# be sure to have a fairly modern Octave version (e.g.: 3.6.3)
# 'export' the code from octaveforge (or 'checkout'):
svn export http://svn.code.sf.net/p/octave/code/trunk/octaveforge/main/sparsersb/ sparsersb
#
# Alternative 1: with pkg
#
# make an archive:
tar cvzf sparsersb1.0.0.tar.gz sparsersb
# invoke octave and build with installed librsb path and compile options:
CXXFLAGS=O3 \
RSBINCDIR=`$HOME/local/bin/librsbconfig prefix`/include \
RSBLIBDIR=`$HOME/local/bin/librsbconfig libdir` \
octave
# inside the octave prompt:
pkg install verbose "sparsersb1.0.0.tar.gz"
# this is it!
pkg load sparsersb
help pkg
help sparsersb
#
# Alternative 2: without pkg
#
cd sparsersb/src # step in the build directory
sh autogen.sh # builds the configure (you need automake, autoconf, ... for this)
./configure CXXFLAGS=O3 LIBRSB_CONFIG=$HOME/local/bin/librsbconfig
make # builds sparsersb.oct
make tests # consistency test
octave # the "sparsersb" keyword is now available in GNU Octave
The sparsersb plugin
for
GNU Octave
allows you to manipulate sparse matrices using librsb
without the need of programming in C or Fortran.
You can reuse any of your GNU Octave programs operating on sparse matrices, and instead of creating matrices using the "sparse
" keyword, use "sparsersb
".
Example:
octave:1> R=(rand(3)>.6)
R =
0 0 0
0 0 0
1 0 1
octave:2> A_octave=sparse(R)
A_octave =
Compressed Column Sparse (rows = 3, cols = 3, nnz = 2 [22%])
(3, 1) > 1
(3, 3) > 1
octave:3> A_librsb=sparsersb(R)
A_librsb =
Recursive Sparse Blocks (rows = 3, cols = 3, nnz = 2 [22%])
(3, 1) > 1
(3, 3) > 1
octave:4> help sparsersb
...
Matrices created in this way support nearly all the operations supported by native GNU Octave ones
(see the GNU Octave's Sparse Matrices documentation.
On some operations, conversion to Octave's formats may occur.
See the build instructions section on how to build
sparsersb
.
Let us consider a simple program creating a sparse matrix and multiplying it by a vector.
Create a source code program file (say, myrsb.c
) with a text editor:
#include <rsb.h> /* for rsb_lib_init */
#include <blas_sparse.h> /* Sparse BLAS on the top of librsb */
int main(const int argc, char * const argv[])
{
blas_sparse_matrix matrix = blas_invalid_handle; /* handle for A */
const int nnzA=4; /* number of nonzeroes of matrix A */
const int nrA=3,ncA=3; /* number of A's rows and columns */
const int incX=1,incB=1; /* spacing of X, B entries */
int IA[]={ 0, 1, 2, 2 }; /* row indices*/
int JA[]={ 0, 1, 0, 2 }; /* column indices*/
double VA[]={ 11.0, 22.0, 13.0, 33.0 }; /* coefficients */
double X[]={ 0.0, 0.0, 0.0 };
double B[]={ 1.0, 2.0, 2.0 };
rsb_err_t errval = RSB_ERR_NO_ERROR;
if((errval = rsb_lib_init(RSB_NULL_INIT_OPTIONS)) != RSB_ERR_NO_ERROR) return 1;
matrix = BLAS_duscr_begin(nrA,ncA); /* begin matrix creation */
if(matrix==blas_invalid_handle) return 1; /* check */
if( BLAS_ussp(matrix,blas_lower_symmetric) != 0 ) return 1; /* set symmetry property */
if( BLAS_duscr_insert_entries(matrix, nnzA, VA, IA, JA) != 0 ) return 1; /* insert some nonzeroes */
if( BLAS_duscr_end(matrix) == blas_invalid_handle ) return 1; /* end creation */
if( BLAS_dusmv(blas_no_trans,1,matrix,B,incB,X,incX)) return 1; /* X := X + (1) * A * B */
if( BLAS_usds(matrix)) return 1; /* destroy matrix */
if((errval = rsb_lib_exit(RSB_NULL_EXIT_OPTIONS)) != RSB_ERR_NO_ERROR) return 1;
return 0;
}
then compile, build, execute:
export PATH=$PATH:$HOME/local/librsb/bin/
gcc `librsbconfig I_opts` c myrsb.c
gcc o myrsb myrsb.o `librsbconfig static ldflags extra_libs`
# you may have to add OpenMP flags here above (e.g.: openmp or fopenmp)
./myrsb
See the reference documentation for more complete examples.
The same program
(say, myrsb.F90
)
in Fortran:
PROGRAM MAIN
USE rsb ! Fortran interface to C's rsb.h
USE blas_sparse ! Fortran interface librsb's Sparse BLAS
IMPLICIT NONE
INTEGER :: istat = 0 ! error status variable
INTEGER :: A ! BLAS sparse matrix descriptor
INTEGER :: transt=blas_no_trans ! no transposition
INTEGER :: incX=1, incB=1 ! X, B vectors increment
REAL(KIND=8) :: alpha=3
INTEGER :: nnzA=4, nrA=3, ncA=3 ! nonzeroes, rows, columns of matrix A
INTEGER :: IA(4)=(/1, 2, 3, 3/) ! row indices
INTEGER :: JA(4)=(/1, 2, 1, 3/) ! column indices
REAL(KIND=8) :: VA(4)=(/11.0, 22.0, 13.0, 33.0/) ! coefficients
REAL(KIND=8) :: X(3)=(/ 0, 0, 0/)
REAL(KIND=8) :: B(3)=(/1.0, 2.0, 2.0/)
TYPE(C_PTR),PARAMETER :: EO = C_NULL_PTR !
istat = rsb_lib_init(EO) ! librsb initialization
IF(istat.NE.0)STOP ! check
! the 'd' in 'duscr_begin' specifies 'double precision' type
CALL duscr_begin(nrA,ncA,A,istat) ! begin matrix creation
! in the following calls the type is implicit due to the arguments
CALL ussp(A,blas_lower_symmetric,istat) ! set symmetry property
CALL uscr_insert_entries(A,nnzA,VA,IA,JA,istat) ! insert some nonzeroes
CALL uscr_end(A,istat) ! end matrix creation
CALL usmv(transt,alpha,A,B,incB,X,incX,istat) ! X := X + (3) * A * B
! please note the 'usds' is type independent
CALL usds(A,istat) ! destroy matrix
istat = rsb_lib_exit(EO) ! librsb finalization
END PROGRAM MAIN
Note how default indexing in Fortran is 1based.
To compile, build, execute:
export PATH=$PATH:$HOME/local/librsb/bin/
gfortran `librsbconfig I_opts` c myrsb.F90
gfortran o myrsb myrsb.o `librsbconfig static ldflags extra_libs`
# you may have to add OpenMP flags here above (e.g.: openmp or fopenmp)
./myrsb
The same program
can be rewritten using the native RSB interface via the ISO C Binding feature:
PROGRAM MAIN
USE rsb
USE ISO_C_BINDING
IMPLICIT NONE
INTEGER,TARGET :: errval
INTEGER :: transt = RSB_TRANSPOSITION_N ! no transposition
INTEGER :: incX = 1, incB = 1 ! X, B vectors increment
REAL(KIND=8),TARGET :: alpha = 3, beta = 1
INTEGER :: nnzA = 4, nrA = 3, ncA = 3 ! nonzeroes, rows, columns of matrix A
INTEGER,TARGET :: IA(4) = (/1, 2, 3, 3/) ! row indices
INTEGER,TARGET :: JA(4) = (/1, 2, 1, 3/) ! column indices
INTEGER :: typecode = RSB_NUMERICAL_TYPE_DOUBLE
INTEGER :: flags = RSB_FLAG_DEFAULT_MATRIX_FLAGS+RSB_FLAG_SYMMETRIC
REAL(KIND=8),TARGET :: VA(4) = (/11.0, 22.0, 13.0, 33.0/) ! coefficients
REAL(KIND=8),TARGET :: X(3) = (/ 0, 0, 0/)
REAL(KIND=8),TARGET :: B(3) = (/1.0, 2.0, 2.0/)
TYPE(C_PTR),TARGET :: mtxAp=C_NULL_PTR
TYPE(C_PTR) :: mtxApp = C_NULL_PTR
errval = rsb_lib_init(C_NULL_PTR) ! librsb initialization
IF(errval.NE.0) STOP "error calling rsb_lib_init"
mtxAp = rsb_mtx_alloc_from_coo_begin(nnzA,typecode,nrA,ncA,flags,C_LOC(errval)) ! begin matrix creation
errval = rsb_mtx_set_vals(mtxAp,C_LOC(VA),IA,JA,nnzA,flags) ! insert some nonzeroes
mtxApp = C_LOC(mtxAp) ! create a matrix pointer pointer
IF(errval.NE.0) STOP "error calling rsb_mtx_set_vals"
errval = rsb_mtx_alloc_from_coo_end(mtxApp) ! end matrix creation
IF(errval.NE.0) STOP "error calling rsb_mtx_alloc_from_coo_end"
errval = rsb_spmv(transt,C_LOC(alpha),mtxAp,C_LOC(X),incX,C_LOC(beta),C_LOC(B),incB) ! X:=X+(3)*A*B
IF(errval.NE.0) STOP "error calling rsb_spmv"
mtxAp = rsb_mtx_free(mtxAp) ! destroy matrix
IF(errval.NE.0) STOP "error calling rsb_mtx_free"
errval = rsb_lib_exit(RSB_NULL_EXIT_OPTIONS) ! librsb finalization
IF(errval.NE.0) STOP "error calling rsb_lib_exit"
PRINT *, "rsb module Fortran test is ok"
END PROGRAM MAIN
And linked the usual way.
Just make sure you enabled the Fortran headers installation at configure time.
The advantage of using this interface is that it matches exactly the C version; this applies to the documentation as well.
For a summary of user visible changes look in the
NEWS.TXT file,
librsb
is copyright (C) 20082015 Michele Martone
and is licensed under the Lesser GNU Public License version 3 (LGPLv3) or later.
You find a
FAQ
about the
GPL
here on the GNU website,
while
here on the FSF website you find a discussion specific to the
GPLv3
.
Mailing list is at: http://sourceforge.net/p/librsb/mailman/.

Michele Martone ( michelemartone
AT
users
DOT
sourceforge
DOT
net )
Please make sure to specify "librsb" in the "Subject:" line of your emails pertaining the library.

On the
WWW:
http://home.rzg.mpg.de/~mima/.

Remember that not only bug reports, but also positive feedback, e.g. writing about how does
librsb
improve your work/research (and which type of problems) is appreciated!