| # Unstable SIMD module |
| |
| This module provides helper functionality to build code with SIMD instructions. |
| Available since 0.42.0. |
| |
| **Note**: this module is unstable. It is only provided as a technology |
| preview. Its API may change in arbitrary ways between releases or it |
| might be removed from Meson altogether. |
| |
| ## Usage |
| |
| This module is designed for the use case where you have an algorithm |
| with one or more SIMD implementation and you choose which one to use |
| at runtime. |
| |
| The module provides one method, `check`, which is used like this: |
| |
| rval = simd.check('mysimds', |
| mmx : 'simd_mmx.c', |
| sse : 'simd_sse.c', |
| sse2 : 'simd_sse2.c', |
| sse3 : 'simd_sse3.c', |
| ssse3 : 'simd_ssse3.c', |
| sse41 : 'simd_sse41.c', |
| sse42 : 'simd_sse42.c', |
| avx : 'simd_avx.c', |
| avx2 : 'simd_avx2.c', |
| neon : 'simd_neon.c', |
| compiler : cc) |
| |
| Here the individual files contain the accelerated versions of the |
| functions in question. The `compiler` keyword argument takes the |
| compiler you are going to use to compile them. The function returns an |
| array with two values. The first value is a bunch of libraries that |
| contain the compiled code. Any SIMD code that the compiler can't |
| compile (for example, Neon instructions on an x86 machine) are |
| ignored. You should pass this value to the desired target using |
| `link_with`. The second value is a `configuration_data` object that |
| contains true for all the values that were supported. For example if |
| the compiler did support sse2 instructions, then the object would have |
| `HAVE_SSE2` set to 1. |
| |
| Generating code to detect the proper instruction set at runtime is |
| straightforward. First you create a header with the configuration |
| object and then a chooser function that looks like this: |
| |
| void (*fptr)(type_of_function_here) = NULL; |
| |
| #if HAVE_NEON |
| if(fptr == NULL && neon_available()) { |
| fptr = neon_accelerated_function; |
| } |
| #endif |
| #if HAVE_AVX2 |
| if(fptr == NULL && avx2_available()) { |
| fptr = avx_accelerated_function; |
| } |
| #endif |
| |
| ... |
| |
| if(fptr == NULL) { |
| fptr = default_function; |
| } |
| |
| Each source file provides two functions, the `xxx_available` function |
| to query whether the CPU currently in use supports the instruction set |
| and `xxx_accelerated_function` that is the corresponding accelerated |
| implementation. |
| |
| At the end of this function the function pointer points to the fastest |
| available implementation and can be invoked to do the computation. |