Installation instructions

Linux and macOS users

To use the library, you have to compile it from source; pre-built binaries are not provided yet.

BLASFEO supports two build systems, make and CMake. make is the recommended one.

You can clone the repository and move inside the project folder with:

git clone https://github.com/giaf/blasfeo.git; cd blasfeo

Configuration

Some compilation options can be tuned by directly modifying the file Makefile.rule, or by adding the overriding values to a newly created Makefile.local, which is not tracked by git.
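As a minimal sketch of the second approach, the snippet below writes a Makefile.local from the project folder. The value GENERIC for TARGET is an assumption chosen for illustration, and the snippet also assumes that plain `NAME = value` assignments in Makefile.local take precedence over the defaults in Makefile.rule; check Makefile.rule for the actual option names and accepted values.

```shell
# Run from the blasfeo project folder.
# Note: this overwrites an existing Makefile.local.
cat > Makefile.local <<'EOF'
# Local overrides, not tracked by git.
# TARGET value below is an assumption; pick the one matching your CPU.
TARGET = GENERIC
LA = HIGH_PERFORMANCE
EOF

# Show the overrides that were written.
cat Makefile.local
```

Keeping the overrides in Makefile.local leaves Makefile.rule unmodified, so pulling new upstream changes does not conflict with local settings.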

The most important options can be specified with the following flags:

TARGET

BLASFEO provides different implementations optimized for different computer architectures. The TARGET flag is used in the selection of hand-crafted assembly kernels (for LA=HIGH_PERFORMANCE), and in the choice of compilation flags (for all LA).

The target architecture has to be specified manually. If you are unsure about the correct target, on Linux you can check the CPU model with the command cat /proc/cpuinfo | grep name. Given the CPU model, the CPU architecture can be easily determined, e.g. with a web search. Furthermore, the command cat /proc/cpuinfo | grep flags returns the list of flags (e.g. ssse3, avx, avx2, fma) describing the supported ISAs.
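The flag check described above can be scripted. The following is a sketch for Linux only; the helper name has_isa is hypothetical, and the list of ISAs is just the set of examples mentioned in the text, not a complete mapping to TARGET values.

```shell
# Succeeds if the ISA name $1 appears as a whole word in the flags line $2.
has_isa() {
    printf '%s\n' "$2" | grep -qw -- "$1"
}

# First "flags" line of /proc/cpuinfo; empty if the file or field is missing
# (e.g. on macOS, or on ARM where the field is named differently).
flags_line=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null || true)

for isa in ssse3 avx avx2 fma; do
    if has_isa "$isa" "$flags_line"; then
        echo "$isa: reported"
    else
        echo "$isa: not reported"
    fi
done
```

Whole-word matching matters here: a plain substring search for avx would also match avx2, and fma would match fma4.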

The current values for TARGET are:

LA backend

The BLASFEO backend provides three possible implementations of each linear algebra routine (LA):

API

The BLASFEO API is always exported.
Optionally, the flag BLAS_API gives the possibility to export a BLAS API for selected routines.
The further flag FORTRAN_BLAS_API controls whether the BLAS API naming is exported in the form blasfeo_dgemm or dgemm_.

MACRO_LEVEL

For LA=HIGH_PERFORMANCE, the majority of BLASFEO code is assembly. Code modularity and reuse in assembly are achieved by using assembly subroutines with a custom calling convention, which perform elementary operations on register-fitting sub-matrices. The linear algebra kernels are coded by gluing together the assembly subroutines.

In BLASFEO, assembly subroutines can optionally be coded as macros and expanded into the linear algebra kernels. This reduces the overhead of the subroutine calls (noticeable for very small matrices), at the expense of an increase in the library size.

The macro behavior is controlled using the option MACRO_LEVEL:

Compilation

The command

make static_library -j $(nproc)

compiles the sources and creates the static library libblasfeo.a in the folder lib.
The command make shared_library -j $(nproc) is the equivalent for the shared library libblasfeo.so.

The command

make clean

clears previous builds. It is necessary to do so when changing TARGET or LA.
The command make deep_clean additionally removes compiled libraries, generated headers and test/benchmark results.

Installation

The command

make install_static

copies the BLASFEO static library and headers to the installation path PREFIX. Note that the default path, PREFIX=/opt/blasfeo, requires admin privileges.
The command make install_shared is the equivalent for the shared library.
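To avoid admin privileges, PREFIX can point to a user-writable directory instead. The sketch below only prints the install command it would suggest, without running it. The helper name needs_admin and the path $HOME/.local/blasfeo are hypothetical, and passing PREFIX on the make command line relies on standard make behavior (command-line variables override makefile assignments), assuming the BLASFEO makefiles do not prevent it.

```shell
# Hypothetical install location; any user-writable directory works.
PREFIX="${PREFIX:-$HOME/.local/blasfeo}"

# Succeeds if the nearest existing ancestor of path $1 is not writable
# by the current user, i.e. installing there would need admin privileges.
needs_admin() {
    dir=$1
    while [ ! -e "$dir" ]; do
        dir=$(dirname "$dir")
    done
    [ ! -w "$dir" ]
}

if needs_admin "$PREFIX"; then
    echo "sudo make install_static PREFIX=$PREFIX"
else
    echo "make install_static PREFIX=$PREFIX"
fi
```

Setting PREFIX in Makefile.local is an alternative that keeps the choice persistent across invocations.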