The ability to disable adding architectures completely for packaging purposes
and cases requiring passing the architectures flags explicitly has been
requested.
Support a false value for CUDA_ARCHITECTURES and CMAKE_CUDA_ARCHITECTURES
for this purpose.
Implements #20821.
Clang isn't very good at finding the installed CUDA toolkit.
The upstream recommendation is that we should pass the toolkit explicitly.
Additionally:
* Avoids Clang having to search for the toolkit on every invocation.
* Allows the user to use a toolkit from a non-standard location by simply
setting CUDAToolkit_ROOT. The same way as with FindCUDAToolkit.
Clang wants the directory containing the device library and version.txt as the
toolkit path.
We thus pass the newly introduced CUDAToolkit_LIBRARY_ROOT as the toolkit path.
We save CUDAToolkit_ROOT_DIR and CUDAToolkit_LIBRARY_ROOT on Clang to have them
available in try_compile() and avoid unnecessary re-searching or a possibly
different installation being found in FindCUDAToolkit.
This however means that the selected toolkit can't be changed after the initial
language enablement.
We now determine CUDA compiler ID before doing actual detection, as we don't
want to spend time finding the CUDA toolkit for NVIDIA.
Implements #20754.
Since commit 729d997f10 (Precompile Headers: Add REUSE_FROM signature,
2019-08-30, v3.16.0-rc1~101^2), `GetPchFileObject` handles the case that
it is called first for another target's `REUSE_FROM` by calling
`AddSource` to make sure `GetObjectName` can produce the correct object
name. However, `AddSource` causes `ClearSourcesCache` to be called,
which since commit a9f4f58f0c (cmGeneratorTarget: Clear AllConfigSources
in ClearSourcesCache, 2020-05-15, v3.16.7~2^2) now correctly erases the
`AllConfigSources` structure. This is okay during `AddPchDependencies`,
but there is another code path in which it is problematic.
When the Visual Studio generator's `WriteAllSources` method is looping
over the sources, the `cmake_pch.cxx` source is encountered first. This
causes `OutputSourceSpecificFlags` to call `GetPchCreateCompileOptions`,
which calls `GetPchFile`, which under MSVC with `CMAKE_LINK_PCH` calls
`GetPchFileObject`. That leads to `ClearSourcesCache` erasing the
structure over which `WriteAllSources` is iterating!
This bug is caught by our `RunCMake.PrecompileHeaders` test when run
with the VS generator as of the commit that exposed it by fixing
`ClearSourcesCache`. However, that change was backported to the CMake
3.16 series after testing only with later versions versions that contain
commit a55df20499 (Multi-Ninja: Add precompile headers support,
2020-01-10, v3.17.0-rc1~136^2). By adding proper multi-config support
for PCH, that commit taught `cmLocalGenerator::AddPchDependencies` to
call `GetPchFile` with the real set of configurations instead of just
the empty string. This allows the `GetPchFile` cache of PCH sources to
be populated up front so that the later calls to it in the
`WriteAllSources` loop as described above do not actually call
`GetPchFileObject` or `ClearSourcesCache`. That hid the problem.
Fix this by re-ordering calls to `AddPchDependencies` to handle
`REUSE_FROM` targets only after the targets whose PCH they re-use.
Remove the now-unnecessary call to `AddSource` from `GetPchFileObject`
so that `ClearSourcesCache` is never called during `WriteAllSources`.
Update the PchReuseFrom test case to cover an ordering of targets that
causes generators to encounter a `REUSE_FROM` target before the target
whose PCH it re-uses.
Fixes: #20770
a653ca9504 Tests: Update CUDA tests to work with Clang
5df21adf46 CUDA: Add support for Clang compiler
dc2eae1f91 FindCUDAToolkit: Factor out discovery code into a separate file
70be10cbf4 CUDA: Remove toolkit include dirs from implicit include dirs only with NVIDIA
Acked-by: Kitware Robot <kwrobot@kitware.com>
Acked-by: Artem Belevich <tra@google.com>
Acked-by: Robert Maynard <robert.maynard@kitware.com>
Acked-by: Axel Huebl <axel.huebl@plasma.ninja>
Acked-by: friendnick <ikoval67@gmail.com>
Acked-by: Patrik Huber <patrikhuber@gmail.com>
Merge-request: !4442
When crosscompiling we pass the sysroot.
We need to try various architecture flags. Clang doesn't automatically
select one that works. First try the ones that are more likely to work
for modern installations:
* <=sm_50 is deprecated since CUDA 10.2, try sm_52 first for
future compatibility.
* <=sm_20 is removed since CUDA 9.0, try sm_30.
Otherwise fallback to Clang's current default. Currently that's `sm_20`,
the lowest it supports.
Separable compilation isn't supported yet.
Fixes: #16586
In commit 40aa6c059c (cmGeneratorTarget: Add method to collect all
sources for all configs, 2017-04-10, v3.9.0-rc1~268^2~5) we forgot to
update `ClearSourcesCache` to also clear `AllConfigSources`. This leads
to subtle cases where code paths like PCH handling that add sources
during generation break depending on ordering.
Suggested-by: Christian Fersch
Fixes: #20712, #20702
In multi-config generators we memoize the computed set of source files
for a target to avoid repeating the computation when the set does not
depend on the configuration. We already track whether generator
expressions in `SOURCES` or `INTERFACE_SOURCES` reference the
configuration (`$<CONFIG:...>`). However, we previously forgot to track
whether the set of libraries whose `INTERFACE_SOURCES` are considered
depends on the configuration. This caused multi-config generators to
use the first configuration's set of sources for all configurations
in cases such as
target_link_libraries(tgt PRIVATE $<$<CONFIG:Debug>:iface_debug>)
where the `iface_debug` target has `INTERFACE_SOURCES`.
Fix this by also tracking config-dependence of the list of libraries for
evaluation of the list of source files.
Fixes: #20683
Report in `cmLinkImplementationLibraries` and `cmLinkInterfaceLibraries`
whether the list of libraries depends on a genex referencing the
configuration. We already track whether a genex references the head
target.
properties LINK_OPTIONS and INTERFACE_LINK_OPTIONS are propagated
to the device link step.
To control which options are selected for normal link and device link steps,
the $<DEVICE_LINK> and $<HOST_LINK> generator expressions can be used.
Fixes: #18265
These generator expressions can only be used in link options properties.
These expressions return the arguments respectively for device and host link
step, otherwise return an empty string.
Simplifies CUDA target architecture handling.
Required for Clang support as Clang doesn't automatically select a supported architecture.
We detect a supported architecture during compiler identification and set CMAKE_CUDA_ARCHITECTURES to it.
Introduces CMP0104 for backwards compatibility with manually setting code generation flags with NVCC.
Implements #17963.
Arguably, many of these are bugs in `clang-tidy`. An if/else tree with
other conditionals between cloned blocks may be relying on the
intermediate logic to fall out of the case and inverting this logic may
be non-trivial.
See: https://bugs.llvm.org/show_bug.cgi?id=44165
Teach include directory computation for Swift to implicitly propagate
the `Swift_MODULE_DIRECTORY` of all linked targets as include
directories. This is required to ensure that the swiftmodule of a
linked target is accessible to the compiler of the current target.
Fixes: #19272