short-description: CUDA module
authors:
    - name: Olexa Bilaniuk
      years: [2019]
      has-copyright: false
...
Since: 0.50.0
This module provides helper functionality related to the CUDA Toolkit and building code using it.
Note: this module is unstable. It is only provided as a technology preview. Its API may change in arbitrary ways between releases or it might be removed from Meson altogether.
The module may be imported as follows:
cuda = import('unstable-cuda')
It offers several useful functions that are enumerated below.
nvcc_arch_flags()
Since: 0.50.0
cuda.nvcc_arch_flags(cuda_version_string, ..., detected: string_or_array)
Returns a list of -gencode flags that should be passed to the cuda_args: keyword argument in order to compile a “fat binary” for the architectures/compute capabilities enumerated in the positional argument(s). The flags shall be acceptable to an NVCC with CUDA Toolkit version string cuda_version_string.
A set of architectures and/or compute capabilities may be specified by:

- one of the single-word sets 'All', 'Common' or 'Auto'; or
- architecture names (e.g. 'Kepler', 'Maxwell+Tegra', 'Turing') and/or compute capabilities (e.g. '3.0', '3.5', '5.3', '7.5').

A suffix of +PTX requests PTX code generation for the given architecture. A compute capability given as A.B(X.Y) requests PTX generation for an older virtual architecture X.Y before binary generation for a newer architecture A.B.
Multiple architectures and compute capabilities may be passed in using:

- multiple positional arguments;
- lists of strings; or
- space ( ), comma (,) or semicolon (;)-separated strings.

The single-word architectural sets 'All', 'Common' or 'Auto' cannot be mixed with architecture names or compute capabilities. Their interpretation is:
Name | Compute Capability |
---|---|
'All' | All CCs supported by given NVCC compiler. |
'Common' | Relatively common CCs supported by given NVCC compiler. Generally excludes Tegra and Tesla devices. |
'Auto' | The CCs provided by the detected: keyword, filtered for support by given NVCC compiler. |
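As a sketch of how 'Auto' might be driven, the snippet below feeds detected: from a build option so that the compute capabilities can be chosen at configure time. The option name cuda_ccs is hypothetical and would have to be declared as a string option in the project's options file. Passing nvcc.version() as the version string is likewise an assumption of this example.

```meson
project('auto-arch-demo', 'cuda')

cuda = import('unstable-cuda')
nvcc = meson.get_compiler('cuda')

# 'cuda_ccs' is a hypothetical string option (e.g. -Dcuda_ccs='6.0 7.5'),
# assumed to be declared in the options file as:
#   option('cuda_ccs', type: 'string', value: '6.0')
ccs = get_option('cuda_ccs')

# 'Auto' cannot be mixed with architecture names or compute capabilities,
# so it is the only architecture argument here.
auto_flags = cuda.nvcc_arch_flags(nvcc.version(), 'Auto', detected: ccs)
message('Auto-selected flags: ' + ' '.join(auto_flags))
```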
The supported architecture names and their corresponding compute capabilities are:
Name | Compute Capability |
---|---|
'Fermi' | 2.0, 2.1(2.0) |
'Kepler' | 3.0, 3.5 |
'Kepler+Tegra' | 3.2 |
'Kepler+Tesla' | 3.7 |
'Maxwell' | 5.0, 5.2 |
'Maxwell+Tegra' | 5.3 |
'Pascal' | 6.0, 6.1 |
'Pascal+Tegra' | 6.2 |
'Volta' | 7.0 |
'Xavier' | 7.2 |
'Turing' | 7.5 |
'Ampere' | 8.0, 8.6 |
Examples:
    cuda.nvcc_arch_flags('10.0', '3.0', '3.5', '5.0+PTX')
    cuda.nvcc_arch_flags('10.0', ['3.0', '3.5', '5.0+PTX'])
    cuda.nvcc_arch_flags('10.0', [['3.0', '3.5'], '5.0+PTX'])
    cuda.nvcc_arch_flags('10.0', '3.0 3.5 5.0+PTX')
    cuda.nvcc_arch_flags('10.0', '3.0,3.5,5.0+PTX')
    cuda.nvcc_arch_flags('10.0', '3.0;3.5;5.0+PTX')
    cuda.nvcc_arch_flags('10.0', 'Kepler 5.0+PTX')
    # Returns ['-gencode', 'arch=compute_30,code=sm_30',
    #          '-gencode', 'arch=compute_35,code=sm_35',
    #          '-gencode', 'arch=compute_50,code=sm_50',
    #          '-gencode', 'arch=compute_50,code=compute_50']

    cuda.nvcc_arch_flags('10.0', '3.5(3.0)')
    # Returns ['-gencode', 'arch=compute_30,code=sm_35']

    cuda.nvcc_arch_flags('8.0', 'Common')
    # Returns ['-gencode', 'arch=compute_30,code=sm_30',
    #          '-gencode', 'arch=compute_35,code=sm_35',
    #          '-gencode', 'arch=compute_50,code=sm_50',
    #          '-gencode', 'arch=compute_52,code=sm_52',
    #          '-gencode', 'arch=compute_60,code=sm_60',
    #          '-gencode', 'arch=compute_61,code=sm_61',
    #          '-gencode', 'arch=compute_61,code=compute_61']

    cuda.nvcc_arch_flags('9.2', 'Auto', detected: '6.0 6.0 6.0 6.0')
    cuda.nvcc_arch_flags('9.2', 'Auto', detected: ['6.0', '6.0', '6.0', '6.0'])
    # Returns ['-gencode', 'arch=compute_60,code=sm_60']

    cuda.nvcc_arch_flags(nvcc, 'All')
    # Returns ['-gencode', 'arch=compute_20,code=sm_20',
    #          '-gencode', 'arch=compute_20,code=sm_21',
    #          '-gencode', 'arch=compute_30,code=sm_30',
    #          '-gencode', 'arch=compute_32,code=sm_32',
    #          '-gencode', 'arch=compute_35,code=sm_35',
    #          '-gencode', 'arch=compute_37,code=sm_37',
    #          '-gencode', 'arch=compute_50,code=sm_50', # nvcc.version() <  7.0
    #          '-gencode', 'arch=compute_52,code=sm_52',
    #          '-gencode', 'arch=compute_53,code=sm_53', # nvcc.version() >= 7.0
    #          '-gencode', 'arch=compute_60,code=sm_60',
    #          '-gencode', 'arch=compute_61,code=sm_61', # nvcc.version() >= 8.0
    #          '-gencode', 'arch=compute_70,code=sm_70',
    #          '-gencode', 'arch=compute_72,code=sm_72', # nvcc.version() >= 9.0
    #          '-gencode', 'arch=compute_75,code=sm_75'] # nvcc.version() >= 10.0
Note: This function is intended to closely replicate CMake's FindCUDA module function CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable, [list of CUDA compute architectures])
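Since the returned flags are meant for cuda_args:, a minimal sketch of wiring them into a build target could look like the following. The project name and the source file kernel.cu are placeholders, and using nvcc.version() as the Toolkit version string is an assumption of this example.

```meson
project('fatbin-demo', 'cuda')

cuda = import('unstable-cuda')
nvcc = meson.get_compiler('cuda')

# Request binary code for the 'Common' set of compute capabilities
# supported by the detected NVCC.
arch_flags = cuda.nvcc_arch_flags(nvcc.version(), 'Common')

# 'kernel.cu' is a placeholder source file.
executable('fatbin-demo', 'kernel.cu', cuda_args: arch_flags)
```

Computing the flags once and reusing the variable across targets keeps every target in the project compiled for the same set of architectures.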
nvcc_arch_readable()
Since: 0.50.0
cuda.nvcc_arch_readable(cuda_version_string, ..., detected: string_or_array)
Has precisely the same interface as nvcc_arch_flags(), but rather than returning a list of flags, it returns a “readable” list of architectures that will be compiled for. The output of this function is solely intended for informative message printing.
    archs = '3.0 3.5 5.0+PTX'
    readable = cuda.nvcc_arch_readable('10.0', archs)
    message('Building for architectures ' + ' '.join(readable))
This will print
Message: Building for architectures sm30 sm35 sm50 compute50
Note: This function is intended to closely replicate CMake's FindCUDA module function CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable, [list of CUDA compute architectures])
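Because both functions share one interface, the same architecture specification can be passed to each, so the printed summary matches the flags actually used. The sketch below assumes a Toolkit recent enough to support compute capability 6.1. The target and source names are placeholders.

```meson
project('readable-demo', 'cuda')

cuda = import('unstable-cuda')
nvcc = meson.get_compiler('cuda')

# One specification, used for both the flags and the summary.
archs = '6.0 6.1+PTX'
arch_flags = cuda.nvcc_arch_flags(nvcc.version(), archs)
arch_names = cuda.nvcc_arch_readable(nvcc.version(), archs)

message('Building for architectures ' + ' '.join(arch_names))

# 'kernel.cu' is a placeholder source file.
executable('readable-demo', 'kernel.cu', cuda_args: arch_flags)
```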
min_driver_version()
Since: 0.50.0
cuda.min_driver_version(cuda_version_string)
Returns the minimum NVIDIA proprietary driver version required, on the host system, by kernels compiled with a CUDA Toolkit with the given version string.
The output of this function is generally intended for informative message printing, but could be used for assertions or to conditionally enable features known to exist within the minimum NVIDIA driver required.
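A minimal sketch of both uses follows. It assumes the function returns a plain version string that version_compare() can handle. The threshold '410.48' is an arbitrary value chosen only for illustration.

```meson
project('driver-demo', 'cuda')

cuda = import('unstable-cuda')
nvcc = meson.get_compiler('cuda')

min_driver = cuda.min_driver_version(nvcc.version())

# Informative printing.
message('Minimum NVIDIA driver required: @0@'.format(min_driver))

# Conditional logic; the threshold is illustrative, not a documented value.
if min_driver.version_compare('>=410.48')
  message('Toolkit requires driver 410.48 or newer')
endif
```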