BLASFEO (BLAS For Embedded Optimization) provides a set of basic linear algebra routines, performance-optimized for matrices of moderate size (up to a couple of hundred elements in each dimension), as typically encountered in embedded optimization applications.
In the target matrix size range, the optimized version of BLASFEO outperforms both open-source (e.g. OpenBLAS, BLIS, ATLAS) and proprietary (e.g. MKL) BLAS and LAPACK implementations.

Figure: DGEMM_NT and DPOTRF_L routines on an Intel Haswell CPU (Core i7 4810MQ @ 3.4 GHz, theoretical maximum throughput of 54.4 GFlops).

The currently supported computer architectures (TARGET) are:

The BLASFEO backend provides three possible implementations of each linear algebra routine (LA):

The BLASFEO API is always exported.
Optionally, setting the flag BLAS_API also exports a BLAS API for selected routines.
The additional flag FORTRAN_BLAS_API controls whether the BLAS API names are exported in the form blasfeo_dgemm or dgemm_.
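
To make the two naming forms concrete, here is a minimal, self-contained sketch of how a compile-time flag could select between them. Everything prefixed with `DEMO_` or `demo_` is hypothetical and is not taken from the BLASFEO sources; in particular, the assumption that the enabled flag corresponds to the Fortran-style `dgemm_` form is an illustration, not a statement about the actual build system.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag and macros, for illustration only (not BLASFEO code).
   Assumption: defining DEMO_FORTRAN_BLAS_API selects the Fortran-style name. */
#ifdef DEMO_FORTRAN_BLAS_API
#define DEMO_BLAS_NAME(routine) routine##_          /* e.g. dgemm_ */
#else
#define DEMO_BLAS_NAME(routine) blasfeo_##routine   /* e.g. blasfeo_dgemm */
#endif

/* Two-level stringification so the macro is expanded before quoting. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

/* Returns the symbol name the macro produces for the dgemm routine. */
const char *demo_exported_dgemm_name(void)
{
    return DEMO_STR(DEMO_BLAS_NAME(dgemm));
}

/* Returns 1 if the name follows the Fortran convention (trailing underscore). */
int demo_is_fortran_style(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    return len > 0 && name[len - 1] == '_';
}
```

Compiling this sketch with `-DDEMO_FORTRAN_BLAS_API` switches the generated name to the Fortran-style form, which is the form that code written against a conventional Fortran BLAS would typically link to.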

The currently supported operating systems (OS) are:

BLASFEO employs structures to describe matrices (blasfeo_dmat) and vectors (blasfeo_dvec), defined in include/blasfeo_common.h.
The actual implementation of blasfeo_dmat and blasfeo_dvec depends on the LA and TARGET choice.
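
The following sketch illustrates why hiding the matrix behind a structure lets the storage layout vary with the build configuration. It is not the definition of blasfeo_dmat: the type `demo_dmat`, its fields, the panel height `DEMO_PS`, and the two layouts shown are all assumptions chosen for illustration. The point is only that callers using accessor and packing functions are unaffected by how the elements are arranged in memory.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_PS 4  /* hypothetical panel height (rows per panel) */

typedef enum { DEMO_COLUMN_MAJOR, DEMO_PANEL_MAJOR } demo_layout;

/* Hypothetical matrix descriptor; NOT the real blasfeo_dmat definition. */
typedef struct {
    int m;               /* number of rows */
    int n;               /* number of columns */
    int pm;              /* rows rounded up to a multiple of DEMO_PS */
    demo_layout layout;  /* how elements are arranged in data */
    double *data;        /* pm * n doubles */
} demo_dmat;

/* Allocates a zero-initialized m x n matrix; returns 0 on success, -1 on failure. */
int demo_dmat_create(demo_dmat *A, int m, int n, demo_layout layout)
{
    assert(A != NULL && m > 0 && n > 0);
    A->m = m;
    A->n = n;
    A->pm = (m + DEMO_PS - 1) / DEMO_PS * DEMO_PS;
    A->layout = layout;
    A->data = calloc((size_t)A->pm * (size_t)n, sizeof(double));
    return A->data != NULL ? 0 : -1;
}

void demo_dmat_free(demo_dmat *A)
{
    free(A->data);
    A->data = NULL;
}

/* Maps element (i, j) to an offset in data according to the layout. */
size_t demo_dmat_index(const demo_dmat *A, int i, int j)
{
    assert(i >= 0 && i < A->m && j >= 0 && j < A->n);
    if (A->layout == DEMO_COLUMN_MAJOR)
        return (size_t)j * (size_t)A->pm + (size_t)i;  /* leading dimension pm */
    /* Panel-major: panels of DEMO_PS rows, each panel stored column by column. */
    size_t panel = (size_t)(i / DEMO_PS);
    return panel * DEMO_PS * (size_t)A->n + (size_t)j * DEMO_PS + (size_t)(i % DEMO_PS);
}

double demo_dmat_get(const demo_dmat *A, int i, int j)
{
    return A->data[demo_dmat_index(A, i, j)];
}

/* Copies a column-major array with leading dimension lda into A. */
void demo_dmat_pack(const double *src, int lda, demo_dmat *A)
{
    for (int j = 0; j < A->n; j++)
        for (int i = 0; i < A->m; i++)
            A->data[demo_dmat_index(A, i, j)] = src[i + (size_t)j * lda];
}
```

In this sketch, the same `demo_dmat_pack` and `demo_dmat_get` calls work for both layouts, while the raw offsets differ; code that indexed `data` directly would break when the layout changes.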

More scientific information can be found in:

As application examples, BLASFEO is employed in the following Model Predictive Control software packages: