librsb
is a library for sparse matrix computations featuring the Recursive Sparse Blocks
(RSB)
matrix format. This format allows cache efficient and multithreaded
(that is, shared memory parallel) operations on large sparse matrices.
The most common operations necessary to iterative solvers are available, e.g.:
matrixvector multiplication, triangular solution, rows/columns scaling,
diagonal extraction / setting, blocks extraction, norm computation, formats
conversion. The RSB format is especially well suited for symmetric and
transposed multiplication variants.
Most numerical kernels code is auto generated, and the supported numerical
types can be chosen by the user at build time.
librsb
implements the Sparse BLAS standard, as specified in the [BLAS Forum] documents.
You might be interested in:
The RSB format is hierarchical.
Its basic idea is a recursive subdivision of a matrix in the two dimensions.
The subdivision algorithm attempts to partition the matrix until the individual submatrices occupy approximately the amount of memory contained in the
CPU
caches.
The number of cores/thread available also plays a role.
The following picture shows the symmetric matrix
audikw_1 (943695 rows, 39297771 nonzeroes,
obtained from the University of Florida Sparse Matrix Collection
) partitioned for different numbers of executing threads (respectively 1,2,4 and 8).
Because of the symmetry, the upper triangle representation has been omitted.
The memory layout of the individual sparse submatrices follows a Z (or Morton) ordering.
Each submatrix is stored either in coordinate or CSR (Compressed Sparse Rows) format, and the numerical index type size is chosen in order to minimize memory consumption.
This layout enables efficient and multithreaded algorithms implementing the Sparse BLAS operations.
The algorithms and the performance of RSB in practice are described in a series of papers; give a look at the
papers
or the performance
sections.
See either autotuned results
for version 1.2 or
older 1.1 or 1.0 on the same machine.
Version 1.2 is a major one: it brings improvements in multivector processing and autotuning.
Autotuning results for librsb1.2
The following figure shows results from a poster presented at the workshop "Sparse Solvers for Exascale: From Building Blocks to Applications" in Greifswald, Germany, on March 2325, 2015.
There are: average speedup values from tuning and over Intel MKL are shown for symmetric, general transposed, general untransposed, as well as the autotuning cost.
Without the autotuning, the relative improvement with respect to MKL shall be rescaled down by the autotuning improvement (left column).
The full poster is available here:http://home.mpcdf.mpg.de/~mima/exascale15postera4.pdf.
Results obtained on the Intel Sandy Bridge (2 x Intel Xeon E52680); autotuned versus untuned results, and versus the Intel Math Kernel Library
(MKL)'s 11.05 Compressed Sparse Rows (CSR) implementation, using the best picked threads and structure with use of the new autotuning feature.
These results are computed on large matrices publicly available on the University of Florida Sparse Matrix Collection.
The points indicate speedup (old time to new time ratio) for the different matrixvector multiplication variants for different matrices and numerical types.
These results suggest that librsb
is competitive with the Intel MKL library implementation (mkl_dcsrmv, mkl_scsrmv, mkl_zcsrmv, mkl_ccsrmv
) of the CSR, especially on the symmetric and transposed matrixvector multiplication variants.
The (very cheap  it costs the time of a few multiplications) best threads autotuning strategy has been applied to MKL and RSB; here, ratio of RSB speed to MKL's.
Autotuning Results for librsb1.1, unsymmetric, all BLAS types
Untransposed (N) and transposed (T) multiplication results for both threads autotuning and structure autotuning.
Please note that the experiment with matrix "arabic" with type Z did not fit in the memory so the corresponding result is missing.
Autotuning Results for librsb1.1, symmetric, all BLAS types
Speedup to base RSB after either threads and structure autotuning. Structure autotuning by definition yields better results, hence no legend is necessary in the following.
Results obtained on the Intel Sandy Bridge (2 x Intel Xeon E52680), versus Intel Math Kernel Library
(MKL)'s 10.37 Compressed Sparse Rows (CSR) implementation, using 16 threads.
These results are computed on large matrices publicly available on the University of Florida Sparse Matrix Collection.
The bars in the graphs are proportional to the performance of the different matrixvector multiplication variants for different matrices.
The "Flops/sec" measure unit we use is defined as the ratio of the nonzeroes count of each matrix by a single operation time.
Please note how different matrices exhibit different performance because of their nonzeroes pattern (structure) impacting on the multiplication algorithm performance.
These results suggest that librsb
is competitive with the Intel MKL library implementation (mkl_dcsrmv
) of the CSR, especially on the symmetric and transposed matrixvector multiplication variants.
Symmetric Sparse MatrixVector Multiply
Unsymmetric Transposed Sparse MatrixVector Multiply
Unsymmetric Untransposed Sparse MatrixVector Multiply

The current release of the
librsb
source code (version 1.2.0) is available on:
Sourceforge (under project librsb).

The current release of the
sparsersb
source code (version 1.0.5) is available on:
Sourceforge (under project Octave).
The following distributions package librsb
, so you can install them using their specific method:
Reference Documentation and README
The complete documentation in HTML format for the latest released version is available here.
The same documentation is included in the
sources archive
and provides several examples for starting using librsb
.
librsb
is fast on large matrices;
these are matrices whose memory occupation exceeds the memory cache.
Nowadays this boundary is around a few million of nonzeroes.
You might find librsb
slower
than a good CSR/CSC serial implementation if your matrices are smaller
than that; but also much faster if larger than that.
Keep in mind that what contributes the RSB format in librsb
to be fast is:
symmetric matrices,
large matrices,
many right hand sides,
multiple cores.
Do not hesitate to
give feedback
about any performance related doubt!
Building the librsb
library, C headers, documentation
There are many options to build a customized library (see documentation).
However, building with reasonable defaults can be as simple as:
# unpack the sources archive
./configure help # there are many options available
./configure enablefortranmoduleinstall enablematrixtypes="double, double complex" \
CC=gcc FC=gfortran CFLAGS=O3 FCFLAGS=O3 prefix=$HOME/local
# If using Intel's compiler 'icc', use 'LD=xild'; e.g.: configure LD=xild CC=icc FC=ifort ...
# With GCC, you may get faster code with "march=native mfpmath=sse msse2"...
#
# Don't ignore configure's output diagnostics.
#
make cleanall # cleans up generated code
make # builds the library and examples
make qtests # quick consistency test
make tests # long consistency test
make install # install to $HOME/local
Please note that if you intend to link librsb
to shared libraries (e.g.: in case of sparsersb
's sparsersb.oct
) then in the above you should specify flags to build position indipendent code, e.g.:
CFLAGS="O3 fPIC" FCFLAGS="O3 fPIC"
.
If you want to have a debug build with error verbosity activated, you may configure with something like:
./configure enableallocatorwrapper enableinterfaceerrorverbosity=2 \
enableinternalserrorverbosity=1 CFLAGS="O0 ggdb" FCFLAGS="O0 ggdb" #...
Building the sparsersb
package for GNU Octave
Assume you have both of librsb
and a recent GNU Octave (complete with development programs) installed; then you can build sparsersb
by:
# Be sure you compiled librsb with the position independent code flag (fPIC).
# Browse to the sparsersb web page and get the sources archive, e.g.: sparsersb1.0.1.tar.gz
# Start Octave:
$ octave
# From Octave's prompt:
octave:1> more off; pkg local verbose install sparsersb1.0.1.tar.gz
...
octave:2> pkg load sparsersb
...
octave:3> test sparsersb
...
octave:4> help sparsersb
...
Alternatively, in the case you don't have a
librsb
installation at hand, you can have
sparsersb
use a
librsb
sources archive and built all in one go (linking statically):
# First, download by hand a librsb archive, say $HOME/librsb1.2.0.tar.gz
# Start Octave with environment variable LIBRSB_TARBALL set to the archive location:
$ LIBRSB_TARBALL=$HOME/librsb1.2.0.tar.gz octave
# From Octave's prompt:
octave:1> more off; pkg local verbose install sparsersb1.0.1.tar.gz
# this will take some more time
...
octave:2> pkg load sparsersb
...
octave:3> test sparsersb
...
octave:4> help sparsersb
...
Download instructions for both packages are here.
If you think any information here is wrong or out of date, consider reporting; see the
contact section for this.
The sparsersb package
for
GNU Octave
allows you to manipulate sparse matrices using librsb
without the need of programming in C or Fortran.
You can reuse your GNU Octave programs employing sparse matrices, and instead of creating matrices using the "sparse
" keyword, use "sparsersb
", like in this example:
octave:1> R=(rand(3)>.6)
R =
0 0 0
0 0 0
1 0 1
octave:2> A_octave=sparse(R)
A_octave =
Compressed Column Sparse (rows = 3, cols = 3, nnz = 2 [22%])
(3, 1) > 1
(3, 3) > 1
octave:3> A_librsb=sparsersb(R)
A_librsb =
Recursive Sparse Blocks (rows = 3, cols = 3, nnz = 2 [22%])
(3, 1) > 1
(3, 3) > 1
octave:4> test sparsersb
...
octave:5> help sparsersb
...
Matrices created in this way support transparently nearly all of the operations supported by native GNU Octave
sparse
(see the GNU Octave's Sparse Matrices documentation).
If you are using a multicore computer and your matrices are significantly large,
with
sparsersb
.
you shall get a much faster matrixvector multiplication than with Octave's
sparse
.
See the build instructions section on how to build
sparsersb
and the download section on where to get it.
Let us consider a simple program creating a sparse matrix and multiplying it by a vector.
Create a source code program file (say, myrsb.c
) with a text editor:
#include <rsb.h> /* for rsb_lib_init */
#include <blas_sparse.h> /* Sparse BLAS on the top of librsb */
int main(const int argc, char * const argv[])
{
blas_sparse_matrix matrix = blas_invalid_handle; /* handle for A */
const int nnzA=4; /* number of nonzeroes of matrix A */
const int nrA=3,ncA=3; /* number of A's rows and columns */
const int incX=1,incB=1; /* spacing of X, B entries */
int IA[]={ 0, 1, 2, 2 }; /* row indices*/
int JA[]={ 0, 1, 0, 2 }; /* column indices*/
double VA[]={ 11.0, 22.0, 13.0, 33.0 }; /* coefficients */
double X[]={ 0.0, 0.0, 0.0 };
double B[]={ 1.0, 2.0, 2.0 };
rsb_err_t errval = RSB_ERR_NO_ERROR;
if((errval = rsb_lib_init(RSB_NULL_INIT_OPTIONS)) != RSB_ERR_NO_ERROR) return 1;
matrix = BLAS_duscr_begin(nrA,ncA); /* begin matrix creation */
if(matrix==blas_invalid_handle) return 1; /* check */
if( BLAS_ussp(matrix,blas_lower_symmetric) != 0 ) return 1; /* set symmetry property */
if( BLAS_duscr_insert_entries(matrix, nnzA, VA, IA, JA) != 0 ) return 1; /* insert some nonzeroes */
if( BLAS_duscr_end(matrix) == blas_invalid_handle ) return 1; /* end creation */
if( BLAS_dusmv(blas_no_trans,1,matrix,B,incB,X,incX)) return 1; /* X := X + (1) * A * B */
if( BLAS_usds(matrix)) return 1; /* destroy matrix */
if((errval = rsb_lib_exit(RSB_NULL_EXIT_OPTIONS)) != RSB_ERR_NO_ERROR) return 1;
return 0;
}
then compile, build, execute:
export PATH=$PATH:$HOME/local/librsb/bin/
gcc `librsbconfig I_opts` c myrsb.c
gcc o myrsb myrsb.o `librsbconfig static ldflags extra_libs`
# you may have to add OpenMP flags here above (e.g.: openmp or fopenmp)
./myrsb
See the reference documentation for more complete examples.
The same program
(say, myrsb.F90
)
in Fortran:
PROGRAM MAIN
USE rsb ! Fortran interface to C's rsb.h
USE blas_sparse ! Fortran interface librsb's Sparse BLAS
IMPLICIT NONE
INTEGER :: istat = 0 ! error status variable
INTEGER :: A ! BLAS sparse matrix descriptor
INTEGER :: transt=blas_no_trans ! no transposition
INTEGER :: incX=1, incB=1 ! X, B vectors increment
REAL(KIND=8) :: alpha=3
INTEGER :: nnzA=4, nrA=3, ncA=3 ! nonzeroes, rows, columns of matrix A
INTEGER :: IA(4)=(/1, 2, 3, 3/) ! row indices
INTEGER :: JA(4)=(/1, 2, 1, 3/) ! column indices
REAL(KIND=8) :: VA(4)=(/11.0, 22.0, 13.0, 33.0/) ! coefficients
REAL(KIND=8) :: X(3)=(/ 0, 0, 0/)
REAL(KIND=8) :: B(3)=(/1.0, 2.0, 2.0/)
TYPE(C_PTR),PARAMETER :: EO = C_NULL_PTR !
istat = rsb_lib_init(EO) ! librsb initialization
IF(istat.NE.0)STOP ! check
! the 'd' in 'duscr_begin' specifies 'double precision' type
CALL duscr_begin(nrA,ncA,A,istat) ! begin matrix creation
! in the following calls the type is implicit due to the arguments
CALL ussp(A,blas_lower_symmetric,istat) ! set symmetry property
CALL uscr_insert_entries(A,nnzA,VA,IA,JA,istat) ! insert some nonzeroes
CALL uscr_end(A,istat) ! end matrix creation
CALL usmv(transt,alpha,A,B,incB,X,incX,istat) ! X := X + (3) * A * B
! please note the 'usds' is type independent
CALL usds(A,istat) ! destroy matrix
istat = rsb_lib_exit(EO) ! librsb finalization
END PROGRAM MAIN
Note how default indexing in Fortran is 1based.
To compile, build, execute:
export PATH=$PATH:$HOME/local/librsb/bin/
gfortran `librsbconfig I_opts` c myrsb.F90
gfortran o myrsb myrsb.o `librsbconfig static ldflags extra_libs`
# you may have to add OpenMP flags here above (e.g.: openmp or fopenmp)
./myrsb
The same program
can be rewritten using the native RSB interface via the ISO C Binding feature:
PROGRAM MAIN
USE rsb
USE ISO_C_BINDING
IMPLICIT NONE
INTEGER,TARGET :: errval
INTEGER :: transt = RSB_TRANSPOSITION_N ! no transposition
INTEGER :: incX = 1, incB = 1 ! X, B vectors increment
REAL(KIND=8),TARGET :: alpha = 3, beta = 1
INTEGER :: nnzA = 4, nrA = 3, ncA = 3 ! nonzeroes, rows, columns of matrix A
INTEGER,TARGET :: IA(4) = (/1, 2, 3, 3/) ! row indices
INTEGER,TARGET :: JA(4) = (/1, 2, 1, 3/) ! column indices
INTEGER :: typecode = RSB_NUMERICAL_TYPE_DOUBLE
INTEGER :: flags = RSB_FLAG_DEFAULT_MATRIX_FLAGS+RSB_FLAG_SYMMETRIC
REAL(KIND=8),TARGET :: VA(4) = (/11.0, 22.0, 13.0, 33.0/) ! coefficients
REAL(KIND=8),TARGET :: X(3) = (/ 0, 0, 0/)
REAL(KIND=8),TARGET :: B(3) = (/1.0, 2.0, 2.0/)
TYPE(C_PTR),TARGET :: mtxAp=C_NULL_PTR
TYPE(C_PTR) :: mtxApp = C_NULL_PTR
errval = rsb_lib_init(C_NULL_PTR) ! librsb initialization
IF(errval.NE.0) STOP "error calling rsb_lib_init"
mtxAp = rsb_mtx_alloc_from_coo_begin(nnzA,typecode,nrA,ncA,flags,C_LOC(errval)) ! begin matrix creation
errval = rsb_mtx_set_vals(mtxAp,C_LOC(VA),IA,JA,nnzA,flags) ! insert some nonzeroes
mtxApp = C_LOC(mtxAp) ! create a matrix pointer pointer
IF(errval.NE.0) STOP "error calling rsb_mtx_set_vals"
errval = rsb_mtx_alloc_from_coo_end(mtxApp) ! end matrix creation
IF(errval.NE.0) STOP "error calling rsb_mtx_alloc_from_coo_end"
errval = rsb_spmv(transt,C_LOC(alpha),mtxAp,C_LOC(X),incX,C_LOC(beta),C_LOC(B),incB) ! X:=X+(3)*A*B
IF(errval.NE.0) STOP "error calling rsb_spmv"
mtxAp = rsb_mtx_free(mtxAp) ! destroy matrix
IF(errval.NE.0) STOP "error calling rsb_mtx_free"
errval = rsb_lib_exit(RSB_NULL_EXIT_OPTIONS) ! librsb finalization
IF(errval.NE.0) STOP "error calling rsb_lib_exit"
PRINT *, "rsb module Fortran test is ok"
END PROGRAM MAIN
And linked the usual way.
Just make sure you enabled the Fortran headers installation at configure time.
The advantage of using this interface is that it matches exactly the C version; this applies to the documentation as well.
For a summary of user visible changes look in the
NEWS.TXT file,
librsb
is copyright (C) 20082017 Michele Martone
and is licensed under the Lesser GNU Public License version 3 (LGPLv3) or later.
You find a
FAQ
about the
GPL
here on the GNU website,
while
here on the FSF website you find a discussion specific to the
GPLv3
.
Mailing list is at: http://sourceforge.net/p/librsb/mailman/.

librsb
is not backed or financed by any organisation.
Parties interested in its further development or wishing to support it are welcome to contact me.

Remember that not only bug reports, but also positive feedback, e.g. writing about how does
librsb
improve your work/research (and which type of problems does it solve) is warmly appreciated!

Michele Martone ( michelemartone
AT
users
DOT
sourceforge
DOT
net )
Please make sure to specify "librsb" or "sparsersb" or "PyRSB" in the "Subject:" line of your emails.

On the
WWW:
http://home.mpcdf.mpg.de/~mima/.