BLASFEO (BLAS For Embedded Optimization) provides a set of basic linear algebra routines, performance-optimized for matrices of moderate size (up to a couple of hundred elements in each dimension), as typically encountered in embedded optimization applications.
In the target matrix size range, the optimized version of BLASFEO outperforms both open-source (e.g. OpenBLAS, BLIS, ATLAS) and proprietary (e.g. MKL) BLAS and LAPACK implementations.
The currently supported computer architectures (TARGET) are:
- X64_INTEL_HASWELL: Intel Haswell architecture or newer, AVX2 and FMA ISA, 64-bit OS.
- X64_INTEL_SANDY_BRIDGE: Intel Sandy-Bridge architecture, AVX ISA, 64-bit OS.
- X64_INTEL_CORE: Intel Core architecture, SSE3 ISA, 64-bit OS.
- X64_AMD_BULLDOZER: AMD Bulldozer architecture, AVX and FMA ISAs, 64-bit OS.
- X86_AMD_JAGUAR: AMD Jaguar architecture, AVX ISA, 32-bit OS.
- X86_AMD_BARCELONA: AMD Barcelona architecture, SSE3 ISA, 32-bit OS.
- ARMV8A_ARM_CORTEX_A57: ARMv8A architecture, VFPv4 and NEONv2 ISAs, 64-bit OS.
- ARMV8A_ARM_CORTEX_A53: ARMv8A architecture, VFPv4 and NEONv2 ISAs, 64-bit OS.
- ARMV7A_ARM_CORTEX_A15: ARMv7A architecture, VFPv3 and NEON ISAs, 32-bit OS.
- ARMV7A_ARM_CORTEX_A7: ARMv7A architecture, VFPv3 and NEON ISAs, 32-bit OS.
- GENERIC: Generic target, coded in C, giving better performance if the architecture provides more than 16 scalar FP registers (e.g. many RISC such as ARM).

The BLASFEO backend provides three possible implementations of each linear algebra routine (LA):
- HIGH_PERFORMANCE: target-tailored; performance-optimized for cache-resident matrices; panel-major matrix format.
- REFERENCE: target-unspecific, lightly optimized; small code footprint; column-major matrix format.
- BLAS_WRAPPER: calls to external BLAS and LAPACK libraries; column-major matrix format.

The BLASFEO API is always exported.
Optionally, the flag BLAS_API gives the possibility to export a BLAS API for selected routines.
The further flag FORTRAN_BLAS_API controls whether the BLAS API naming is exported in the form blasfeo_dgemm or dgemm_.
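
To illustrate what the dgemm_ naming implies for a C caller, the following is a minimal sketch of the conventional Fortran BLAS calling convention: a trailing underscore in the symbol name, every argument passed by reference, and column-major matrices. The function sketch_dgemm_ is a hypothetical stand-in written for this example; it is not BLASFEO code and handles only the no-transpose case.

```c
#include <assert.h>

/* Hypothetical stand-in using the conventional Fortran BLAS dgemm
   argument list: C = alpha * A * B + beta * C, column-major storage,
   all arguments passed by pointer. Only transa = transb = 'N' is
   supported in this sketch. */
void sketch_dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *A, const int *lda,
                   const double *B, const int *ldb,
                   const double *beta, double *C, const int *ldc)
{
    assert(*transa == 'N' && *transb == 'N');
    for (int j = 0; j < *n; j++) {
        for (int i = 0; i < *m; i++) {
            double acc = 0.0;
            /* Dot product of row i of A with column j of B. */
            for (int l = 0; l < *k; l++)
                acc += A[i + l * (*lda)] * B[l + j * (*ldb)];
            C[i + j * (*ldc)] = (*alpha) * acc + (*beta) * C[i + j * (*ldc)];
        }
    }
}
```

A caller therefore has to store scalars such as the dimensions, alpha and beta in variables and pass their addresses, rather than passing the values directly.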
The currently supported operating systems (OS) are:
- LINUX: Linux for x86_64 64-bit, x86 32-bit, ARMv8A 64-bit, ARMv7A 32-bit.
- WINDOWS: Windows for x86_64 64-bit.
- MAC: MacOS for x86_64 64-bit.

BLASFEO employs structures to describe matrices (blasfeo_dmat) and vectors (blasfeo_dvec), defined in include/blasfeo_common.h.
The actual implementation of blasfeo_dmat and blasfeo_dvec depends on the LA and TARGET choice.
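
One reason the structure contents depend on the LA choice is that the underlying storage differs: column-major for REFERENCE and BLAS_WRAPPER, panel-major for HIGH_PERFORMANCE. The sketch below shows how element indexing could differ between the two layouts. The panel height of 4, the stride parameter sda, and all function names are assumptions made for this example and are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed panel height (number of rows per panel). In a real library
   this would be chosen per TARGET; 4 is only an illustrative value. */
#define SKETCH_PS 4

/* Column-major: element (i, j) with leading dimension lda. */
static size_t colmajor_index(size_t i, size_t j, size_t lda)
{
    assert(i < lda);
    return j * lda + i;
}

/* Panel-major: rows are grouped into panels of SKETCH_PS rows, and
   each panel is stored contiguously, column by column. sda is the
   number of columns stored per panel. */
static size_t panelmajor_index(size_t i, size_t j, size_t sda)
{
    assert(j < sda);
    size_t panel = i / SKETCH_PS;          /* panel containing row i */
    size_t row_in_panel = i % SKETCH_PS;   /* offset inside the panel */
    return panel * SKETCH_PS * sda + j * SKETCH_PS + row_in_panel;
}

/* Copy an m x n column-major matrix into a panel-major buffer.
   dst must hold at least ceil(m / SKETCH_PS) * SKETCH_PS * sda doubles. */
static void pack_panelmajor(size_t m, size_t n,
                            const double *src, size_t lda,
                            double *dst, size_t sda)
{
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < m; i++)
            dst[panelmajor_index(i, j, sda)] = src[colmajor_index(i, j, lda)];
}
```

In such a layout, the SKETCH_PS entries of one column within a panel are adjacent in memory, which suits loading them into a vector register in one instruction. Because the layout is hidden behind the structure, user code would normally go through the library's own conversion and access routines rather than index the buffer directly.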
More information about BLASFEO can be found in the arXiv paper, in the slides from the 2017 BLIS Retreat, or in the video.
As application examples, BLASFEO is employed in the Model Predictive Control software packages HPIPM, HPMPC and acados.
Notes: