Implement P3663 (Future-proof submdspan_mapping)#408
Implement P3663 (Future-proof submdspan_mapping)#408mhoemmen wants to merge 105 commits intokokkos:stablefrom
Conversation
It currently requires the C++23 feature "deducing this" in order to compile. I've tested it with dev Clang (21.0.0). It should work fine with Clang 20.
Also implement exposition-only __mdspan_integral_constant_like concept, which is useful for canonicalization and Mandates.
sizeof...(Slices) == 0 case builds and passes tests.
Set the following CMake options. -DCMAKE_CXX_FLAGS="-std=c++2c" -DMDSPAN_CXX_STANDARD=26 You'll get a warning, but it seems to work.
* canonical-ice * subtract-ice * de-ice * check-static-bounds
submdspan_canonicalize_slices now works for one strided_slice. Fix canonical_ice (wrong order in subtraction; wording is fine).
Attempt to change over submdspan to use canonicalization without doing much else. It fails to build; a lot of layout mappings seem to get the wrong mapping result type. It builds and passes tests with P3663 support off.
submdspan builds now when submdspan_mapping is called with canonical slices, but the submdspan test fails. Investigate whether this is a bug in the implementation.
Fix extent computation for std::complex. It builds and passes tests. NEXT STEPS: * Make it ill-formed to call submdspan_mapping with non-canonical slice types * Make P3663 version only use constant_wrapper and canonical slice types in submdspan_mapping_impl functions
(submdspan_mapping_impl)
All submdspan_mapping_impl functions now check that every k-th slice type is a canonical k-th slice type. This ensures that submdspan_mapping_impl is only well-formed for canonical k-th slice types, and is ill-formed otherwise.
in P3663 branch only.
Don't need to specialize for all integral-constant-like, just for std::constant_wrapper.
| ) | ||
| MDSPAN_INLINE_FUNCTION | ||
| constexpr Integral first_of(const Integral &i) { | ||
| // FIXME (mfh 2025/06/06) This is broken, but it's better than it was. |
There was a problem hiding this comment.
We should just fix the non-P3663 branch if we intend to keep it.
| return {}; | ||
| } | ||
| #else | ||
| // NOTE This is technically not conforming. |
There was a problem hiding this comment.
We should just fix this.
| // in constant expressions if they are integral-constant-like. | ||
|
|
||
| template<size_t k, class S_k, class IndexType, size_t... Exts> | ||
| constexpr check_static_bounds_result check_static_bounds( |
There was a problem hiding this comment.
FIXME In R3, this returns bool instead of an enum.
| else if constexpr (detail::is_std_complex<Slice>) { | ||
| #if ! defined(MDSPAN_CONSTANT_WRAPPER_WORKAROUND) | ||
| // GCC 11.4.0 (C++20) accepts this code, but Clang 14 does not. | ||
| return strided_slice{ |
There was a problem hiding this comment.
Just delete this branch.
| const extents<IndexType, Extents...>& exts, | ||
| Slices... slices) | ||
| { | ||
| return std::tuple{ |
There was a problem hiding this comment.
Consider using Kokkos::tuple instead of std::tuple.
| return static_cast<common_t>(first_of(slice)) == | ||
| static_cast<common_t>(ext); | ||
| #else | ||
| // NOTE (mfh 2025/06/06) The original implementation was not conforming. |
There was a problem hiding this comment.
We should decide whether to fix the non-P3663 case or just remove it and make P3663 the only implementation.
| template<class OffsetType, class ExtentType, class StrideType> | ||
| void test_strided_slice(OffsetType offset, ExtentType extent, StrideType stride) | ||
| { | ||
| // Some compilers are bad at CTAD for aggregates. |
There was a problem hiding this comment.
We should consider branching based on compilers where it's known to work. It would be good to test that CTAD works.
|
PR #447 will remove import of |
nmm0
left a comment
There was a problem hiding this comment.
Ok some initial comments, I got through a good chunk of it but not everything.
| #if(MDSPAN_ENABLE_CUDA) | ||
| # add_subdirectory(cuda) | ||
| #endif() |
There was a problem hiding this comment.
| #if(MDSPAN_ENABLE_CUDA) | |
| # add_subdirectory(cuda) | |
| #endif() |
| @@ -0,0 +1,265 @@ | |||
| #pragma once | |||
There was a problem hiding this comment.
I think we should pull this into a separate PR.
Also since we are pulling this in should we ask Zach if it's under a license or public domain? I couldn't find any info in the source code so I suspect it is the latter.
There was a problem hiding this comment.
@nmm0 I'm inclined to check whether any (VERY VERY NEW) compiler comes with a constant_wrapper implementation. If it does, we can test the code with the actual std::constant_wrapper, remove the back-port, and leave only the pre-C++26 work-around (that looks like integral_constant). What do you think?
There was a problem hiding this comment.
Compiler Explorer's GCC trunk has std::constant_wrapper: https://godbolt.org/z/3e4qvEvv1 . That means we have a way to test this with the actual std::constant_wrapper. That motivates removing the back-port, as long as we know that it actually works with std::constant_wrapper.
C++26 may still see changes to std::constant_wrapper -- there's a fix paper in flight -- but I don't think it would apply to anything in this PR. (The paper would change behavior for callables.)
There was a problem hiding this comment.
It does look like GCC16 will have std::constant_wrapper (so i guess it made the cut): https://gcc.gnu.org/gcc-16/changes.html
I'm fine with having a constant wrapper backport though, just will be easier to review in another PR
There was a problem hiding this comment.
@nmm0 The back-port is really only useful for Clang 21 and just barely for GCC 15. Anything earlier won't have a chance of building a constant_wrapper that looks enough like constant_wrapper to call it a back-port.
I've removed the back-port. GCC trunk builds it correctly. I'm not convinced that MDSPAN_ENABLE_P3663 is always defined in the build, though.
|
|
||
| #include <type_traits> | ||
|
|
||
| namespace MDSPAN_IMPL_STANDARD_NAMESPACE { |
There was a problem hiding this comment.
I think this is fine here but at some point if we reorganize files I think we could move all of these little backports of C++ things that we need into its own utility folder.
There was a problem hiding this comment.
That's a good idea! If we end up removing the constant_wrapper back-port (please see my comment above), then there won't be quite so many long files here.
|
|
||
| option(MDSPAN_ENABLE_CONCEPTS "Try to enable concepts support by giving extra flags." On) | ||
|
|
||
| option(MDSPAN_ENABLE_P3663 "Enable implementation of P3663 (Future-proof submdspan_mapping)." On) |
There was a problem hiding this comment.
I think this is actually our only cmake-based feature switch? So this actually leads to the question of how we do this, because right now if someone is using mdspan via cmake this does nothing (as it only enables a compile definition for benchmarks).
So I suppose the question is do we want to add them to a target (e.g. via target_compile_definitions) or via a #cmakedefine sort of thing like Kokkos does where we generate a header with the define. We could even let it be overrideable via a define.
I actually am leaning towards the latter because we could use it in our single-header mode and the value could be overriden if using the header only mode via a define.
There was a problem hiding this comment.
@nmm0 wrote:
... because right now if someone is using mdspan via cmake this does nothing....
Practically, it changes the implementation by changing the macro definitions in the headers -- at least it should! I might be missing something here.
Should we consider just removing this option and making the library implement P3663 unconditionally?
|
Compiler Explorer's version of GCC trunk has |
For both P3663 and no-P3663 branches, fix __mdspan_is_index_like_v so that it now correctly excludes std::bool_constant or any integral-constant-like with bool value type.
to use a lambda with `auto&&... canonical_slices` capture instead of a named functor.
GCC trunk ("16.0.1") has std::constant_wrapper.
This commit fixes any build errors that resulted from
using that as the "constant_wrapper" in mdspan.
It also works around some pack indexing issues
that seem unique to GCC.
|
Hi @nmm0 ! GCC trunk ("16.0.1") has As a result, I think it's safe to remove the |
GCC trunk ("16.0.1") has std::constant_wrapper.
This commit builds and passes tests with that.
Thus, I've removed the constant_wrapper back-port
(that really only worked with GCC 15 and Clang 21).
There are now only two branches.
1. Use actual std::constant_wrapper and std::cw
2. Use something that only superficially has the same
interface but is really just an integral_constant clone
Implement P3663 (Future-proof
submdspan_mapping).MDSPAN_ENABLE_P3663=OFF(default)MDSPAN_ENABLE_P3663=ON(this may require significant back-porting effort)constant_wrapperin C++17 or less. Even the C++20 "back-port" is so incomplete that it might as well be called "integral_constant." We could either just useintegral_constantfor C++20 or less, or name the typedetail::constant_wrapper.MDSPAN_ENABLE_P3663=ONthe default (retain the option so that Kokkos::View has the chance to turn it off if it affects performance)std::constant_wrapperconstant_wrapperback-port, as Back-port is only useful for a small range of very recent GCC and Clang implementations-std=c++26; we know it implementsstd::constant_wrapper, because Compiler Explorer's version of GCC trunk has it.dims) fix has already been applied by PR move padded layouts and dims from Experimental to the standard namespace #447 . That PR will remove import ofdimsinto theexperimentalnamespace. This PR should be changed not to do that. (There's no promise of backwards compatibility for theexperimentalnamespace.)MDSPAN_ENABLE_P3663not defined in the code?Feature is ON by default. To disable it, set the CMake option
MDSPAN_ENABLE_P3663toOFF.Over various commits we've simplified the implementation so that it works with C++17 and subsequent C++ versions. P3663 relies on
std::constant_wrapper; we've tested with a version of GCC that implements it, so we can say with some confidence that it "forward-ports" all the way to C++26.