librsb
1.2.0.11
|
A sparse matrix library implementing the `Recursive Sparse Blocks' (RSB) matrix storage.
This is the documentation for the application programming interface (API) of the `librsb'
library.
In order to use librsb
, there is no need for the user to know the RSB layout and algorithms: this documentation should be sufficient.
This library is dual-interfaced; it supports: a native (`RSB') interface (with identifiers prefixed by `rsb_' or `RSB_'), and a (mostly complete) Sparse BLAS interface, as a wrapper around the RSB interface.
Many computationally intensive operations are implemented with thread parallelism, by using OpenMP.
Thread parallelism can be turned off at configure time, if desired, or limited at execution time.
Many of the computational kernels source code files (mostly internals) were automatically generated.
This user documentation concerns the end user API only; that is, neither the internals, nor the code generator.
You should consult the remaining documentation (e.g. the README file, code comments) to find information about how to modify the generator or the library internals.
This library is research software and as such, still experimental. For a first approach, we suggest to go through the Example programs and code documentation section, or the quick start examples section on this page.
Information about the supported matrix types and matrix operations resides in the rsb_types.h file.
A C/C++ user can use the native API of RSB by including the rsb.h header. The same interface is available in Fortran via the ISO C Binding interface, specified in rsb.F90.
The C header file for the The Sparse BLAS interface to librsb (blas_sparse.h, rsb_blas_sparse.F90) is blas_sparse.h.
Contents of the README file :
================================================================================ librsb README file ================================================================================ librsb - Recursive Sparse Blocks Matrix computations library A library for sparse matrix computations featuring the Recursive Sparse Blocks (RSB) matrix format. This format allows cache efficient and multi-threaded (that is, shared memory parallel) operations on large sparse matrices. It provides the most common operations necessary to iterative solvers, like matrix-vector multiplication, triangular solution, rows/columns scaling, diagonal extraction / setting, blocks extraction, norm computation, formats conversion. The RSB format is especially well suited for symmetric and transposed multiplication variants. Most of numerical kernels code is auto generated, and the supported numerical types can be chosen by the user at build time. This library is dual-interfaced: it can be used via the native (`RSB') interface (with identifiers prefixed by `rsb_' or `RSB_'), and a Sparse BLAS one (`BLAS_'). The `RSB' interface can be used from C (rsb.h header) or via modern Fortran ISO-C-BINDING ("rsb" module in rsb.F90). The Sparse BLAS interface is usable from C via the blas_sparse.h header, and from Fortran via the "blas_sparse" module. ================================================================================ This (README) is the first document you should read about librsb. It contains basic instructions to generate, compile, install, and use librsb. The reference documentation for programming with librsb is contained in the ./doc/ source package subdirectory and when installed, placed in the appropriate system directories as both Unix man pages (./doc/man/) and HTML (./doc/html/). If you are a user of a previous version of librsb, see the NEWS file listing the changes. After having read this file you are welcome to ask questions to the author. -------------------------------------------------------------------------------- INTRODUCTION -------------------------------------------------------------------------------- librsb is a library for sparse matrix algebra computations. It is stand-alone: does not require any other library to build or work. It is shared memory parallel, using the OpenMP standard. It focuses on high performance and provides configure-time build options. A part of the library code is automatically generated from templates and macros, on the basis of the numerical types a user wishes to have supported. The configure script options (self documented --- not documented here) provide many build time options, especially with respect to debug and additional verbosity. INTRODUCTION MAIN ASPECTS,FEATURES QUICK INSTALL AND TESTING LIBRARY CONFIGURATION, GENERATION, BUILD INSTALLATION, USAGE EXECUTION AND ENVIRONMENT VARIABLES DOCUMENTATION, EXAMPLES AND PROGRAMMING GUIDELINES CONFIGURE, BUILD AND BENCHMARK EXAMPLE COMPATIBILITY FAQ POSSIBLE / POTENTIAL FUTURE FEATURES / ENHANCEMENTS ABOUT THE INTERNALS BUGS CONTACTS CREDITS LICENSE -------------------------------------------------------------------------------- MAIN ASPECTS,FEATURES -------------------------------------------------------------------------------- * very efficient (see the website for benchmark performance results) * threads/structure autotuning feature for additional performance * support for multiple numerical data types which can be turned on/off individually (e.g.:double, float, int, char, complex, double complex) at configure time * a Sparse BLAS interface for matrix assembly, computation, destruction * a code generator for its inner CSR, COO computational kernels * based on a recursive memory layout of submatrices * enough functionality to implement the most common iterative methods * basic input sanitizing (index types overflow checks, etc) * parallel matrix assembly and conversion routines * auxiliary functions for matrix I/O (using the "Matrix Market" format: real, integer, complex and pattern are supported) * implemented as a building block for solvers like e.g. PSBLAS * dual implementation of kernels: with "full word" and "half word" indices * thread level (shared memory) parallelism by using OpenMP * basic (unoptimized) sparse matrices multiplication and summation * interactive usage possible by using the "sparsersb" plugin for GNU Octave * complete with examples and a test suite * see the NEWS text file for a list of changes in each release -------------------------------------------------------------------------------- QUICK INSTALL AND TESTING EXAMPLE -------------------------------------------------------------------------------- # unpack the archives or get them from the repositories ./autogen.sh # only necessary if configure file does not exist ./configure --prefix=$HOME/local/librsb/ # see also ./configure --help for many other options # librsb has been configured make help # provide information make # build the library and test programs # librsb has been built make qtests # perform brief sanity tests make qqtests # the same, but with less output make tests # perform extended sanity tests ls examples/*.c # editable examples; build them with 'make' ls examples/*.F90 # editable examples; build them with 'make' make install # install to $HOME/local/librsb/ # librsb has been installed and can be used # for instance, try using one of the librsb examples as a model: mkdir -p ~/rsb-test/ && cp examples/hello.c ~/rsb-test/myrsb.c # adapt hello.c to your needs and recompile: cd ~/rsb-test/ export PATH=$PATH:$HOME/local/librsb/bin/ gcc `librsb-config --I_opts`. -c myrsb.c gcc -o myrsb myrsb.o `librsb-config --static --ldflags --extra_libs` ./myrsb # run your program -------------------------------------------------------------------------------- LIBRARY CONFIGURATION, GENERATION, BUILD -------------------------------------------------------------------------------- This library consists of C code (C 99), partially generated by M4 macros. The user wishing to build librsb can specify different initial parameters determining the supported matrix operations, inner explicit loop unrolling factors, available numerical data types and code variations. These parameters have to be specified to the ./configure script. The M4 macros are used at build time to generate specialized C code. If building from repository sources, an M4 preprocessor is required. Otherwise, it is necessary only when specifying ./configure options affecting code generation (see ./configure --help). The M4 preprocessor executable can be specified explicitly to ./configure with the M4 environment variable or via the --with-m4 option. After invoking ./configure and before running 'make' it is possible to invoke 'make cleanall' to make sure that auto-generated code is deleted first. At configure time, it is very important that the configure script is able to detect the system cache memory hierarchy parameters. In the case it fails, you are encouraged to specify cache parameters by re-running ./configure and setting the --with-memhinfo option. For instance: --with-memhinfo=L2:4/64/512K,L1:8/64/24K These values need not be exact: they can be approximate. Yet they may be critical to library performance; for this reason you are allowed to override this default in a variety of ways. Read further to get a description of the memory hierarchy info string format. If you want to build Fortran examples, be sure of invoking ./configure with the --enable-fortran-examples option. You can specify the desired Fortran compiler and compilation flags via the FC and FCFLAGS variables. Set the CPPFLAGS variable at configure time to provide additional compilation flags; e.g. configure to detect necessary headers in non-standard location. Similarly, the LDFLAGS variable can be set to contain link time options; so you can use it to specify libraries to be linked to librsb examples. Invoke ./configure --help for details of other relevant environment variables. After ./configure you will see information about the current build options and if satisfied, invoke 'make' to build the library and the examples. To check for library consistence, run: make qtests # takes a short time, spots most problems or make tests # takes longer If these tests terminate with an error code, it may be that it has been caused by a bug in librsb, so please tell us (see BUGS). -------------------------------------------------------------------------------- INSTALLATION, USAGE -------------------------------------------------------------------------------- Once built, the library can be installed with: sudo make install #'make install' installs the library system-wide This installs header files, binary library files, and the librsb-config program. Then, application C programs should include the rsb.h header file with #include <rsb.h> and be compiled using include options as generated by the output of `librsb-config --I_opts`. To link to the librsb.a static library file and its dependencies one can use the output of `librsb-config --static --ldflags --extra_libs`. If you wish to use the library without installing it in the system directories, make sure to include the <rsb.h> header file and link to the librsb.a library and all the necessary additional libraries. Users of pkg-config can manually copy the librsb.pc file to the appropriate directory to use pkg-config in a way similar to librsb-config. -------------------------------------------------------------------------------- EXECUTION AND ENVIRONMENT VARIABLES -------------------------------------------------------------------------------- By default, the only environment variable read by librsb is RSB_USER_SET_MEM_HIERARCHY_INFO, and will override configure-time and auto-detected settings about memory hierarchy. Its value is specified as n concatenated strings of the form: L<l>:<a_l>/<b_l>/<c_l> These strings are separated by a comma (","), and each of them is made up from substrings where: <n> is the cache memories hierarchy height, from 1 upwards. <l> is the cache level, from 1 upwards. <a_l> is the cache associativity <b_l> is the cache block size (cache line length) <c_l> is the cache capacity (size) The <a_l>, <b_l>, <c_l> substrings consist of an integer number with an optional multiplier character among {K,M,G} (to specify respectively 2^10, 2^20 or 2^30). Any value is permitted, a long as it is positive. Higher level cache capacities are required to be larger than lower level ones. Example strings and usage in the BASH shell: RSB_USER_SET_MEM_HIERARCHY_INFO="L2:4/64/512K,L1:8/64/32K" <your program> RSB_USER_SET_MEM_HIERARCHY_INFO="L1:8/128/2M" <your program> You may explicitly set this environment variable to fine-tune the library operation. If not doing so, runtime detection will be attempted; if this shall fail, a configure time detected value will be used. In some cases the configure time detection fails (e.g.: on very recent systems); this is not a fault of librsb but rather of the underlying environment. A default value for this memory hierarchy info string can be set at configure time by using the --with-memhinfo configure option. If you don't know values for these parameters, you can run the ./scripts/linux-sys-cache.sh script to try to get a guess on a Linux system. On other systems, please consult the available documentation. E.g.: On Mac OS 10.6 it is possible to get this information by invoking "sysctl -a | grep cache". The librsb library achieves parallelism by using OpenMP. Even though librsb does not directly read any OpenMP environment variable, it is still affected by them (e.g. the OMP_NUM_THREADS environment variable specifying the number of parallel threads). Please consult your compiler's OpenMP implementation documentation for more information. -------------------------------------------------------------------------------- DOCUMENTATION, EXAMPLES AND PROGRAMMING GUIDELINES -------------------------------------------------------------------------------- The API is entirely specified in the <rsb.h> header file. This is the only header file the application developer should ever include to use the library. The complete API documentation is generated by the doxygen tool in the doc directory in both HTML and man formats, and gets installed with 'make install'. If you wish not to use doxygen (or don't have it) you can skip documentation generation by adding the "DOXYGEN=false" argument to ./configure . There are a number of working example programs in the "examples" directory. The library only declares symbols prefixed by `rsb_'. These symbols include those declared in rsb.h, as well as internal, undocumented service functions and variables. Therefore, to avoid name clashes, you should avoid declaring `rsb_' prefixed identifiers in programs using librsb. If configure has been invoked with the --enable-sparse-blas-interface, then the corresponding `BLAS_' and `blas_' prefixed symbols will also be built. If after building the library, you find that it exports symbols with different prefixes (besides the system specific, compiler-generated symbols), please report this to us -- it is a bug. -------------------------------------------------------------------------------- CONFIGURE, BUILD AND BENCHMARK EXAMPLE -------------------------------------------------------------------------------- First configure and build with reasonable options, such as (gcc, 64 bit): export MKLROOT=/opt/intel/mkl ./configure --disable-debug CC=gcc FC=gfortran CFLAGS='-Ofast' \ --with-mkl="-static -L${MKLROOT}/lib/intel64 \ -Wl,--start-group,-lmkl_intel_lp64,-lmkl_gnu_thread,-lmkl_core,--end-group \ -fopenmp -lpthread" \ --with-memhinfo=L2:4/64/512K,L1:8/64/24K \ --with-mkl-include=/opt/intel/mkl/include/ \ --prefix=/opt/librsb-optimized/ \ --enable-matrix-types="double,double complex" Or (icc, 64 bit): export MKLROOT=/opt/intel/mkl ./configure --disable-debug CC=icc FC=ifort CFLAGS='-Ofast' \ --with-mkl="-static -L${MKLROOT}/lib/intel64 -openmp -lpthread \ -Wl,--start-group,-lmkl_intel_lp64,-lmkl_intel_thread,-lmkl_core,--end-group" \ --with-memhinfo=L2:4/64/512K,L1:8/64/24K \ --with-mkl-include=/opt/intel/mkl/include/ \ --prefix=/opt/librsb-optimized/ \ --enable-matrix-types="double,double complex" or (32 bit): ./configure --disable-debug CC=gcc FC=gfortran CFLAGS='-Ofast' \ --with-memhinfo=L2:4/64/512K,L1:8/64/24K \ --with-mkl="-static -L/opt/intel/mkl/lib/ia32/ -lmkl_solver \ -Wl,--start-group,-lmkl_intel,-lmkl_gnu_thread,-lmkl_core,--end-group \ -fopenmp -lpthread" \ --with-mkl-include=/opt/intel/mkl/include/ \ --prefix=/opt/librsb-optimized/ \ --enable-matrix-types="double,double complex" and then make # builds library and test programs make tests # optional In the above example, optional use of the MKL library is configured in. However, librsb does not use MKL in any way: it is only used by the "rsbench" test program. Say you want to quickly benchmark the library for a quick SPMV speed test. You have a valid Matrix Market file containing a matrix, A.mtx, and you want to benchmark librsb with it on 1 and 4 cores, performing 100 sparse matrix-vector multiply iterations. Then do a serial test first: ./rsbench -oa -Ob -f A.mtx -qH -R -n1 -t100 --verbose and then a parallel test: OMP_NUM_THREADS=4 ./rsbench -oa -Ob -f A.mtx -qH -R -n1,4 -t100 --verbose You can add option --compare-competitors to enable comparisons to the MKL, provided it has been configured in. If not specifying a type (argument to the -T option), the default will be used. If configured in at build time, choices may be -T D (where D is the BLAS prefix for "double"), -T Z (Z stands for "double complex") and so on. You can specify "-T :" to mean all of the configured types. Output of 'rsbench' shall be easy to understand or parse. For more options and configure information, invoke: ./rsbench --help To get the built in defaults, invoke the following: ./rsbench -oa -Ob --help ./rsbench --help ./rsbench --version ./rsbench -I ./rsbench -C An example Matrix Market matrix file contents: %%MatrixMarket matrix coordinate pattern general % This is a comment. % See other examples in the distributed *.mtx files. 2 2 3 1 1 2 1 2 2 -------------------------------------------------------------------------------- COMPATIBILITY -------------------------------------------------------------------------------- This library has been built and tested on Unix machines. Microsoft Windows users might try building librsb under the Cygwin environment. Some tricks may have to be used on IBM AIX. For instance, adding the --without-xdr or the --without-zlib switch to ./configure. Your mileage may vary. AIX's "make" program may give problems; use the GNU version "gmake" instead; the same shall be done with the M4 interpreter. This library was developed mostly on Debian Linux and using only free software. -------------------------------------------------------------------------------- FAQ -------------------------------------------------------------------------------- Q: Can you provide me good configure defaults for an optimized build ? A: Default './configure' options are appropriate for an optimized build. A good starting point for gcc is ./configure CC=gcc CFLAGS='-O3'. However, if you need complex arithmetic and are using GCC, I'd advise using -Ofast. On many versions of GCC I observed sub-optimal complex arithmetic performance with -O3, regardless use of e.g. -mtune=native. For more, consult your compiler documentation (e.g. man gcc, man icc), and learn about the best flags for your specific platform. Striping your executable (make install-strip for librsb's rsbench) may help. Q: I am a beginner and I wish librsb to be very verbose when I invoke library interface functions incorrectly. Can you provide me good configure defaults for such a "debug" build ? A: Yes: ./scripts/configure_for_debug.sh Q: **I am using CC=clang and FC=gfortran. Linking Fortran examples fails: it seems like some library is missing. What should I do?** A: Did you try `make FCLD=clang`? Q: I have machine X, compiler Y, compiling flags Z; is SpMV performance P with matrix M good ? A: In general, hard to tell. However you can `make hinfo.log' and send me (see CONTACTS) the hinfo.log file and your matrix in Matrix Market format (well, please don't send matrices by email but rather upload them somewhere on the web and send an URL to them). The hinfo.log file will contain useful compile and machine information. Then I *may* get an idea about the performance you should get with that matrix on that computer. Q: What is the Sparse BLAS ? A: It's a programming interface specification: [sparseblas_2001]: BLAS Technical Forum Standard, Chapter 3, Sparse BLAS http://www.netlib.org/blas/blast-forum/chapter3.pdf [dhp_2002]: An Overview of the Sparse Basic Linear Algebra Subprograms: The New Standard from the BLAS Technical Forum IAIN S. DUFF, CERFACS and Rutherford Appleton Laboratory MICHAEL A. HEROUX, Sandia National Laboratories ROLDAN POZO, National Institute of Standards and Technology [dv_2002]: Algorithm 818: A Reference Model Implementation of the Sparse BLAS in Fortran 95 IAIN S. DUFF, CERFACS, France and Atlas Centre, RAL, England CHRISTOF VÖMEL, CERFACS, France Q: Is there an easy way to profile librsb usage in my application ? A: Yes: build with --enable-librsb-stats and extract time elapsed in librsb via e.g.: RSB_REINIT_SINGLE_VALUE_GET(RSB_IO_WANT_LIBRSB_ETIME,&dt,errval). Q: Why another sparse matrix library ? A: This library is the fruit of the author's PhD work, focused on researching improved multi threaded and cache friendly matrix storage schemes for PSBLAS. Q: What are the key features of this library when compared to other ones ? A: Recursive storage, a code generator, parallel BLAS operations (including matrix assembly, matrix-matrix multiplication, transposed matrix-vector multiply), a battery of tests, a Sparse BLAS interface and a free software licensing. Q: How do I detect librsb from my package's configure script ? A: Add to your configure.ac: AH_TEMPLATE([HAVE_LIBRSB]) AC_CHECK_FUNC([rsb_lib_init],AC_DEFINE([HAVE_LIBRSB],[1],[librsb detected])) then rerun autoconf and invoke configure as: ./configure CFLAGS=`librsb-config --cflags` \ LDFLAGS=`librsb-config --ldflags --extra_libs` Q: How is correctness checked in the librsb test suite ? A: Different linear system generators and tester programs are being used to brute-force-test several routines and input combinations as possible. See 'make tests'; and run/edit the following tester programs if you are curious: test -f sbtc && ./sbtc||true # Sparse BLAS checker (C interface based) test -f sbtf && ./sbtf||true # Sparse BLAS checker (Fortran interface, opt.) ./rsbench -Q 10.0 # 10 seconds brute-test Q: Why did you write the library in C and not in C++ ? A: Mainly... Because C can be easily interfaced with C++ and Fortran. Because using a debugger under full fledged C++ is a headache. Because of the C's 'restrict' keyword. Q: Why did you use C and not Fortran ? A: This library is slightly system-oriented, and system calls interfacing is much easier in C. Also C's pointers arithmetic support plays a crucial role. Q: Is there a quick and easy way to perform an artificial performance test with huge matrices without having to program ? A: Sure. The following lines generate matrices of a specified dimension. You can play with them by changing the matrix size, for instance. ./rsbench -oa -Ob -qH -R --dense 1 --verbose ./rsbench -oa -Ob -qH -R --dense 1024 --verbose ./rsbench -oa -Ob -qH -R --lower 1024 --as-symmetric --verbose ./rsbench -oa -Ob -qH -R --dense 1000 --gen-lband 10 --gen-uband 3 ./rsbench -oa -Ob -qH -R --generate-diagonal 1000 Q: I've found a bug! What should I do ? A: First please make sure it is really a bug: read the documentation, check, double check. Then you can write a description of the problem, with a minimal program source code and data to replicate it. Then you can jump to the CONTACTS details section. Q: Is it possible to build matrices of, say, long double or long double complex or int or short int ? A: Yes, it's not a problem. You should invoke the configure script accordingly, e.g.: --enable-matrix-types="long double". If this breaks code compilation, feel free to contact the author (see the CONTACTS section). Q: Is there a way to compare the performance of this library to some other high performance libraries ? A: If you build rsbench with support for the Intel MKL library, then you can do performance comparisons with e.g.: # ./rsbench -oa -Ob -qH -R --gen-diag 100 --compare-competitors --verbose or use the following script: # bench/dense.sh ' ' Or even better, check out the --write-performance-record feature ; for details see the output of: # rsbench -oa -Ob --help Q: Is there a non-threaded (serial) version of librsb ? A: Yes: you can configure the library to work serially (with no OpenMP). See ./configure --help. Q: Is this library thread-safe ? A: Probably yes: no static buffers are being used, and reentrant C standard library functions are invoked. Q: Does the librsb library run on GPUs or Intel MIC ? A: It has been built on Intel MIC once, but not tested. Q: I built and compiled the code without enabling any BLAS type (S,D,C,Z), and both `make qtests' and `make tests' ran successfully outside the ./examples directory, but `make tests' breaks within ./examples directory. A: Well, the tests passed because the examples testing was simply skipped. The example programs need at least one of these types to work. Q: At build time I get many "unused variable" warnings. Why ? A: librsb accommodates many code generation and build time configuration options. Some combinations may turn off compilation of certain parts of the code, leading some variables to be unused. Q: Are there papers to read about the RSB format and algorithms ? A: Yes, the following: Michele Martone Efficient Multithreaded Untransposed, Transposed or Symmetric Sparse Matrix-Vector Multiplication with the Recursive Sparse Blocks Format Parallel Computing 40(7): 251-270 (2014) http://dx.doi.org/10.1016/j.parco.2014.03.008 Michele Martone Cache and Energy Efficiency of Sparse Matrix-Vector Multiplication for Different BLAS Numerical Types with the RSB Format Proceedings of the ParCo 2013 conference, September 2013, Munich, Germany PARCO 2013: 193-202 http://dx.doi.org/10.3233/978-1-61499-381-0-193 Michele Martone, Marcin Paprzycki, Salvatore Filippone: An Improved Sparse Matrix-Vector Multiply Based on Recursive Sparse Blocks Layout. LSSC 2011: 606-613 http://dx.doi.org/10.1007/978-3-642-29843-1_69 Michele Martone, Salvatore Filippone, Salvatore Tucci, Marcin Paprzycki, Maria Ganzha: Utilizing Recursive Storage in Sparse Matrix-Vector Multiplication - Preliminary Considerations. CATA 2010: 300-305 Michele Martone, Salvatore Filippone, Marcin Paprzycki, Salvatore Tucci: Assembling Recursively Stored Sparse Matrices. IMCSIT 2010: 317-325 http://www.proceedings2010.imcsit.org/pliks/205.pdf Michele Martone, Salvatore Filippone, Pawel Gepner, Marcin Paprzycki, Salvatore Tucci: Use of Hybrid Recursive CSR/COO Data Structures in Sparse Matrices-Vector Multiplication. IMCSIT 2010: 327-335 http://dx.doi.org/10.1109/SYNASC.2010.72 Michele Martone, Salvatore Filippone, Marcin Paprzycki, Salvatore Tucci: On BLAS Operations with Recursively Stored Sparse Matrices. SYNASC 2010: 49-56 http://dx.doi.org/10.1109/SYNASC.2010.72 Michele Martone, Salvatore Filippone, Marcin Paprzycki, Salvatore Tucci: On the Usage of 16 Bit Indices in Recursively Stored Sparse Matrices. SYNASC 2010: 57-64 http://dx.doi.org/10.1109/SYNASC.2010.77 Q: I have M4-related problems on IBM SP5/SP6 (my M4 preprocessor tries to regenerate code but it fails). What should I do ? A: A fix is to use a GNU M4 implementation e.g.: M4=/opt/freeware/bin/m4 ./configure ... e.g.: M4=gm4 ./configure ... or execute: touch *.h ; touch *.c ; make Or "./configure; make" the library on a different machine, then build a sources archive with `make dist', and use it on the original machine. -------------------------------------------------------------------------------- POSSIBLE / POTENTIAL FUTURE FEATURES / ENHANCEMENTS -------------------------------------------------------------------------------- * auxiliary functions for numerical vectors * CSC,BCSR,BCSC and other formats * (optional) loop unrolled kernels for BCSR/BCSC * performance prediction/estimation facilities (experimental) * types of the blocks, nonzeroes, and coordinates indices can be user specified * a code generator for BCSR, BCSC, VBR, VBC kernels * full support for BCSR, BCSC storages * automatic matrix blocking selection (for BCSR/BCSC) * an arbitrary subset of block size kernels can be specified to be generated * full support for VBR,VBC storages * recursive storage variants of blocked formats (non uniform blocking) * more auto-tuning and prediction control * use of assembly functions or intrinsics * the use of context variables (scenarios with multiple libraries using librsb completely independently at the same time are not supported) * enhanced in-place matrix assembly functions (useful for really huge matrices) -------------------------------------------------------------------------------- ABOUT THE INTERNALS -------------------------------------------------------------------------------- The following good practices are being followed during development of librsb. - only symbols beginning with `rsb_' or `blas_' are being exported. - internal functions are usually prefixed by `rsb__'. - no library internal function shall call any API function. If by using/inspecting the code you notice any of the above is being violated, please report about it. -------------------------------------------------------------------------------- BUGS -------------------------------------------------------------------------------- If you encounter any bug (e.g.: mismatch of library/program behaviour and documentation, please let me know about it by sending me (see CONTACTS) all relevant information (code snippet, originating data/matrix, config.log), in such a way that I can replicate the bug behaviour on my machines. If the bug occurred when using rsb interfaced to some proprietary library, please make sure the bug is in librsb. It may be of great help to you to build the library with the debug compile options on (e.g.: CFLAGS='-O0 -ggdb'), and with appropriate library verbosity levels (--enable-internals-error-verbosity, --enable-interface-error-verbosity and --enable-io-level options to configure) to better understand the program behaviour before sending a report. Make sure you have the latest version of the library when reporting a bug. -------------------------------------------------------------------------------- CONTACTS -------------------------------------------------------------------------------- You are welcome to contact the librsb author: Michele Martone < michelemartone AT users DOT sourceforge DOT net > Please specify "librsb" in the "Subject:" line of your emails. More information and downloads on http://sourceforge.net/projects/librsb Mailing list: https://lists.sourceforge.net/lists/listinfo/librsb-users -------------------------------------------------------------------------------- CREDITS (in alphabetical order) -------------------------------------------------------------------------------- For librsb-1.2: Marco Atzeri provided testing, patches to build librsb under cygwin over nearly each release, and spotted a few bugs. Fabio Cassini spotted an unintended conversion via sparsersb and +. John Donoghue spotted a rendering corner case bug. Sebastian Koenig spotted a computational bug in -rc6. Rafael Laboissiere helped a lot improving the documentation and the build system. Mu-Chu Lee provided a patch to fix sorting code crashing with > 10^9 nnz. Constanza Manassero spotted an inconsistency in the usmm/ussm interface. Markus Muetzel helped debugging rsb_mtx_rndr(). Dmitri Sergatskov spotted a double free in rsb_mtx_rndr() and convinced about the necessity of sanitizing memory usage. For librsb-1.1: Gilles Gouaillardet provided a patch for OpenMP-encapsulated I/O. Marco Restelli provided with testing and detailed comments and suggestions. For librsb-1.0: Francis Casson helped with testing and documentation reviewing during the first release. Nitya Hariharan helped revising early versions of the documentation. -------------------------------------------------------------------------------- LICENSE -------------------------------------------------------------------------------- This software is distributed under the terms of the Lesser GNU Public License version 3 (LGPLv3) or later. See the COPYING file for a copy of the LGPLv3. librsb is free software. To support it, consider writing "thank you" to the author and acknowledging use of librsb in your publications. That would be very appreciated. --------------------------------------------------------------------------------
For a quick startup, consider the following two programs.
The first, using the internal RSB interface:
And the second, using the Sparse BLAS interface:
For more, see the Example programs and code section.