CUDA: Generic all and all-major support

Commit 14d8a276 (CUDA: Support nvcc 11.5 new -arch=all|all-major flags, 2021-08-17) added all and all-major options to CUDA_ARCHITECTURES. These are fairly generic and likely to see real-world use by distributors. Thus it's desirable to support these also for Clang and older NVCC versions.

The supported architectures are dependent on the toolkit version. We determine the toolkit version prior to compiler detection. For NVCC we get the version from the vendor identification output, but for Clang we need to invoke NVCC separately.

The architecture information is mostly based on the Wikipedia list with the earliest supported version being CUDA 7.0. This could be documented and expanded in the future to allow projects to query CUDA toolkit version and architecture information. For Clang we additionally constrain based on its support.

Additionally the architecture mismatch detection logic is fixed, improved and updated for generic support:

* Commit 01428c55 (CUDA: Fail fast if CMAKE_CUDA_ARCHITECTURES doesn't work during detection, 2020-08-29) enabled CMAKE_CUDA_COMPILER_ID_REQUIRE_SUCCESS if CMAKE_CUDA_ARCHITECTURES is specified. This results in CMakeDetermineCompilerID.cmake printing the compiler error and our code for presenting the mismatch in a user-friendly way being useless. The custom logic seems preferable so go back to not enabling it.

* Commit 14d8a276 (CUDA: Support nvcc 11.5 new -arch=all|all-major flags, 2021-08-17) tried to support CMP0054 but forgot to add x to the interpolated result. Thus the conditions would always evaluate to false. This is fixed as a byproduct of removing NVIDIA specific checks, improving the error message and replacing architectures_mode with a simpler architectures_explicit.

Visual Studio support omits testing the flags during detection due to complexities in determining the toolkit version when using it. A long-term proper implementation would be #23161.

Implements #22860.
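
The CMP0054 problem described above can be illustrated with a small standalone script. This is only a sketch and not the code from either commit: the variable name `architectures` and the exact shape of the condition are assumptions, chosen to show why an `x` prefix on only the literal side can never match.

```cmake
# Illustrative only; run with: cmake -P cmp0054_prefix.cmake
# Requiring 3.20 sets CMP0054 to NEW, so quoted arguments are compared
# as plain strings and are not dereferenced as variable names.
cmake_minimum_required(VERSION 3.20)

# Hypothetical stand-in for the user-provided architectures value.
set(architectures "all")

# Assumed shape of the bug: the literal carries the "x" prefix, but the
# interpolated value does not, so "all" is compared against "xall".
if("${architectures}" STREQUAL "xall")
  message(STATUS "prefix on one side only: matched")
else()
  message(STATUS "prefix on one side only: did not match")
endif()

# With the prefix on both sides the comparison behaves as intended.
if("x${architectures}" STREQUAL "xall")
  message(STATUS "prefix on both sides: matched")
else()
  message(STATUS "prefix on both sides: did not match")
endif()
```

The first condition takes the `else()` branch for every possible value of `architectures` that does not itself start with `x`, which is consistent with the message's statement that the conditions always evaluated to false.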
author: Raul Tambre <raul@tambre.ee> 2021-12-19 10:49:58 (GMT)
committer: Raul Tambre <raul@tambre.ee> 2022-02-01 16:25:20 (GMT)
commit: 8f64df0a7c2c9126017847f2bb8d37bc54ea0338 (patch)
tree: 589ad2a37c64cbde54e8e28006ad4393a7a5cd21 /Tests/CudaOnly
parent: 5305d5aa1a6900c64a5833176b43a21acb13fb30 (diff)
download: CMake-8f64df0a7c2c9126017847f2bb8d37bc54ea0338.zip
CMake-8f64df0a7c2c9126017847f2bb8d37bc54ea0338.tar.gz
CMake-8f64df0a7c2c9126017847f2bb8d37bc54ea0338.tar.bz2
1 files changed, 46 insertions, 34 deletions
diff --git a/Tests/CudaOnly/All/CMakeLists.txt b/Tests/CudaOnly/All/CMakeLists.txt
index fe29bb0..ba32e9a 100644
--- a/Tests/CudaOnly/All/CMakeLists.txt
+++ b/Tests/CudaOnly/All/CMakeLists.txt
@@ -2,43 +2,55 @@ cmake_minimum_required(VERSION 3.20)
 project(CudaOnlyAll CUDA)
 
 if(CMAKE_CUDA_COMPILER_ID STREQUAL "NVIDIA" AND
-   CMAKE_CUDA_COMPILER_VERSION VERSION_GREATER_EQUAL 11.5.0)
-
+   CMAKE_CUDA_COMPILER_VERSION VERSION_GREATER_EQUAL 8.0)
   set(compile_options -Wno-deprecated-gpu-targets)
-  function(verify_output flag output_var)
-    string(REGEX MATCHALL "-arch compute_([0-9]+)" target_archs "${${output_var}}")
-    list(LENGTH target_archs count)
-    if(count LESS 2)
-      message(FATAL_ERROR "${flag} failed to map to multiple architectures")
-    endif()
-  endfunction()
 endif()
 
-if(COMMAND verify_output)
-  set(try_compile_flags -v ${compile_options})
-
-  set(CMAKE_CUDA_ARCHITECTURES all)
-  try_compile(all_archs_compiles
-    ${CMAKE_CURRENT_BINARY_DIR}/try_compile/all_archs_compiles
-    ${CMAKE_CURRENT_SOURCE_DIR}/main.cu
-    COMPILE_DEFINITIONS ${try_compile_flags}
-    OUTPUT_VARIABLE output
-    )
-  verify_output(all output)
-
-  set(CMAKE_CUDA_ARCHITECTURES all-major)
-  try_compile(all_major_archs_compiles
-    ${CMAKE_CURRENT_BINARY_DIR}/try_compile/all_major_archs_compiles
-    ${CMAKE_CURRENT_SOURCE_DIR}/main.cu
-    COMPILE_DEFINITIONS ${try_compile_flags}
-    OUTPUT_VARIABLE output
-    )
-  verify_output(all-major output)
-
-  if(all_archs_compiles AND all_major_archs_compiles)
-    add_executable(CudaOnlyAll main.cu)
-    target_compile_options(CudaOnlyAll PRIVATE ${compile_options})
+function(verify_output flag)
+  string(REPLACE "-" "_" architectures "${flag}")
+  string(TOUPPER "${architectures}" architectures)
+  set(architectures "${CMAKE_CUDA_ARCHITECTURES_${architectures}}")
+
+  if(CMAKE_CUDA_COMPILER_ID STREQUAL "Clang")
+    set(match_regex "-target-cpu sm_([0-9]+)")
+  elseif(CMAKE_CUDA_COMPILER_ID STREQUAL "NVIDIA")
+    set(match_regex "-arch compute_([0-9]+)")
+  endif()
+
+  string(REGEX MATCHALL "${match_regex}" target_cpus "${output}")
+
+  foreach(cpu ${target_cpus})
+    string(REGEX MATCH "${match_regex}" dont_care "${cpu}")
+    list(APPEND command_archs "${CMAKE_MATCH_1}")
+  endforeach()
+
+  list(SORT command_archs)
+  if(NOT "${command_archs}" STREQUAL "${architectures}")
+    message(FATAL_ERROR "Architectures used for \"${flag}\" don't match the reference (\"${command_archs}\" != \"${architectures}\").")
   endif()
-else()
+endfunction()
+
+set(try_compile_flags -v ${compile_options})
+
+set(CMAKE_CUDA_ARCHITECTURES all)
+try_compile(all_archs_compiles
+  ${CMAKE_CURRENT_BINARY_DIR}/try_compile/all_archs_compiles
+  ${CMAKE_CURRENT_SOURCE_DIR}/main.cu
+  COMPILE_DEFINITIONS ${try_compile_flags}
+  OUTPUT_VARIABLE output
+  )
+verify_output(all)
+
+set(CMAKE_CUDA_ARCHITECTURES all-major)
+try_compile(all_major_archs_compiles
+  ${CMAKE_CURRENT_BINARY_DIR}/try_compile/all_major_archs_compiles
+  ${CMAKE_CURRENT_SOURCE_DIR}/main.cu
+  COMPILE_DEFINITIONS ${try_compile_flags}
+  OUTPUT_VARIABLE output
+  )
+verify_output(all-major)
+
+if(all_archs_compiles AND all_major_archs_compiles)
   add_executable(CudaOnlyAll main.cu)
+  target_compile_options(CudaOnlyAll PRIVATE ${compile_options})
 endif()
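
The extraction step inside the new verify_output() can be tried in isolation. The following is a minimal sketch, assuming NVCC-style verbose output; the `sample_output` string is made up for illustration and is much shorter than what `-v` really prints, and `COMPARE NATURAL` is a choice made for this sketch, whereas the test above calls `list(SORT)` with its default ordering.

```cmake
# Illustrative only; run with: cmake -P extract_archs.cmake
cmake_minimum_required(VERSION 3.20)

# Hypothetical excerpt of verbose compiler output (not captured from nvcc).
set(sample_output
  "cicc -arch compute_70 -m64\ncicc -arch compute_52 -m64\ncicc -arch compute_60 -m64")

# Same pattern the test uses for the NVIDIA compiler ID.
set(match_regex "-arch compute_([0-9]+)")

# MATCHALL returns the full matches as a list, not the capture groups.
string(REGEX MATCHALL "${match_regex}" target_cpus "${sample_output}")

# Re-match each entry so CMAKE_MATCH_1 holds just the architecture number.
set(command_archs "")
foreach(cpu IN LISTS target_cpus)
  string(REGEX MATCH "${match_regex}" dont_care "${cpu}")
  list(APPEND command_archs "${CMAKE_MATCH_1}")
endforeach()

# Natural ordering sorts numerically; this differs from the default
# string ordering once three-digit architectures are involved.
list(SORT command_archs COMPARE NATURAL)

message(STATUS "Extracted architectures: ${command_archs}")
```

The two-pass approach is needed because `string(REGEX MATCHALL)` does not expose capture groups for each match. Whether the reference lists in `CMAKE_CUDA_ARCHITECTURES_ALL` and `CMAKE_CUDA_ARCHITECTURES_ALL_MAJOR` are ordered to agree with a particular sort mode is not something this sketch checks.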