From 1ea49f48edd93c00c3e08998fd57712f21c08c3c Mon Sep 17 00:00:00 2001 From: Nicolas Morales Date: Thu, 21 Mar 2024 16:18:28 +0900 Subject: [PATCH 01/11] initial thoughts on mdspan copy --- mdspan_copy/Makefile | 3 ++ mdspan_copy/mdspan_copy.md | 78 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 81 insertions(+) create mode 100644 mdspan_copy/Makefile create mode 100644 mdspan_copy/mdspan_copy.md diff --git a/mdspan_copy/Makefile b/mdspan_copy/Makefile new file mode 100644 index 00000000..655cdec6 --- /dev/null +++ b/mdspan_copy/Makefile @@ -0,0 +1,3 @@ +include ../P0009/wg21/Makefile + +.DEFAULT_GOAL := $(HTML) diff --git a/mdspan_copy/mdspan_copy.md b/mdspan_copy/mdspan_copy.md new file mode 100644 index 00000000..63854e90 --- /dev/null +++ b/mdspan_copy/mdspan_copy.md @@ -0,0 +1,78 @@ +--- +title: "Copy and fill for `mdspan`" +date: today +--- + +# Motivation + +C++23 introduced `mdspan` ([@P0009R18]), a nonowning multidmensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. Without standard library support, copying efficiently between mdspans with mixes of complex layouts is challenging for users. + +Many applications, including high-performance computing (HPC), image processing, computer graphics, etc that benefit from `mdspan` also would benefit from basic memory operations provided in standard algorithms such as copy and fill. Indeed, the authors found that a copy algorithm would have been quite useful in their implementation of the copying `mdarray` ([@P1684]) constructor. + +However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. 
Additionally, it's not entirely clear what this would entail. + +Moreover, the manner in which an `mdspan` is copied (or filled) is highly performance sensitive, particularly in regards to caching behavior when traversing mdspan memory. A naïve user implementation is easy to get wrong in addition to being tedious for higher rank `mdspan`s. Ideally, an implementation would be free to use information about the layout of the `mdspan` known at compile time to perform optimizations; e.g. a continuous span `mdspan` copy for trivial types could be implementeed with a `memcpy`. + +Finally, providing these generic algorithms would also enable these operations for types that are representable by `mdspan`. For example, this would naturally include `mdarray`, which is convertible to `mdspan`, or for user-defined types whose view of memory corresponds to `mdspans` (e.g. an image class or something similar). + +## Safety + +Due to the closed nature of `mdspan` extents, copy operations can be checked by the implementation to prevent past-the-end writes. This is an advantage the proposed copy operation has over the existing operations in the standard. + +# Design + +The main design direction of this proposal is to provide methods for copying and filling `mdspan`s that may have differing layouts and accessors, while allowing implementations to provide efficient implementations for special cases. For example, if a copy occurs between two `mdspan`s with the same layout mapping type that is contiguous and both use `default_accessor`, the intention is that this could be implemented by a single `memcpy`. + +Furthermore, accessors as a customization point should be enabled, as with any other `mdspan` operation. For example, a custom accessor that checks a condition inside of the `access` method should still work and check that condition. It's worth noting that there may be a high sensitivity of how much implementations able to optimize if provided custom accessors. 
For example, optimizations could be disabled if using a custom accessor that is identical to the default accessor. + +Finally, there is some question as to whether `copy` and `fill` should return a value when applied to `mdspan`, as the iterator and ranged-based algorithms do. We believe that `mdspan` copy and fill should return void, as there is no past-the-end iterator that they could reasonable return. + +## What the proposal does not include + +* `std::move`: Perhaps this should be included for completeness's sake. However, it doesn't seem applicable to the typical usage of `mdspan`. +* `(copy|fill)_n`: As a multidimensional view `mdspan` does not in general follow a specific ordering. Memory ordering may not be obvious to calling code, so it's not even clear how these would work. Any applications intending to copy a subset of `mdspan` should use call `copy` on the result of `submdspan`. +* `copy_backward`: As above, there is no specific ordering. A similar effect could be achieved via transformations with a custom layout, similar to `layout_transpose` in [@P1673]. +* Other algorithms, include `std::for_each`. `for_each` in particular is a desirable but brings in many unanswered questions that should be addressed in a different paper. 
+ +# Wording + +```c++ +template +void copy(const mdspan &src, const mdspan &dst) + +template +void copy(ExecutionPolicy&& policy, const mdspan &src, const mdspan &dst) +``` + +[1]{.pnum} *Constraints:* + + * [1.1]{.pnum} `std::is_assignable_v::reference, typename mdspan::reference>` + + * [1.2]{.pnum} `mdspan::rank() == mdspan::rank()` + +[2]{.pnum} *Preconditions:* + + * [2.1]{.pnum} `src.extent(r) <= dst.extent(r)` for every rank index `r` of `dst` + + * [2.2]{.pnum} `dst.is_unique()` + + * [2.3]{.pnum} there is no unique multidimensional index `i...` in `src.extents()`, and no unique multidimensional index `j...` in `dst.extents()` such that `src.accessor().offset(src.data_handle(), src.mapping(i...)) == dst.accessor().offset(dst.data_handle(), dst.mapping(j...))` + +[3]{.pnum} *Effects:* for all unique multidimensional indices `i...` in `src.extents()`, assigns `src[i...]` to `dst[i...]` + + +```c++ +template +void fill(const mdspan &dst, const T &value) + +template +void fill(ExecutionPolicy&& policy, const mdspan &dst, const T &value) +``` + +[4]{.pnum} *Constraints:* `std::is_assignable_v::reference, const T &T>` + +[5]{.pnum} *Preconditions:* `dst.is_unique()` + +[6]{.pnum} *Effects:* for all unique multidimensional indices `i...` in `dst.extents()`, assigns `value` to `dst[i...]` From e56c867ca4fee98d96aa16d03c8c3678bc93146b Mon Sep 17 00:00:00 2001 From: Nicolas Morales Date: Thu, 21 Mar 2024 17:16:23 +0900 Subject: [PATCH 02/11] minor fixes and rewording --- mdspan_copy/mdspan_copy.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/mdspan_copy/mdspan_copy.md b/mdspan_copy/mdspan_copy.md index 63854e90..721d179c 100644 --- a/mdspan_copy/mdspan_copy.md +++ b/mdspan_copy/mdspan_copy.md @@ -25,7 +25,7 @@ The main design direction of this proposal is to provide methods for copying and Furthermore, accessors as a customization point should be enabled, as with any other `mdspan` operation. 
For example, a custom accessor that checks a condition inside of the `access` method should still work and check that condition. It's worth noting that there may be a high sensitivity of how much implementations able to optimize if provided custom accessors. For example, optimizations could be disabled if using a custom accessor that is identical to the default accessor. -Finally, there is some question as to whether `copy` and `fill` should return a value when applied to `mdspan`, as the iterator and ranged-based algorithms do. We believe that `mdspan` copy and fill should return void, as there is no past-the-end iterator that they could reasonable return. +Finally, there is some question as to whether `copy` and `fill` should return a value when applied to `mdspan`, as the iterator and ranged-based algorithms do. We believe that `mdspan` copy and fill should return void, as there is no past-the-end iterator that they could reasonably return. ## What the proposal does not include @@ -54,11 +54,11 @@ void copy(ExecutionPolicy&& policy, const mdspan Date: Fri, 22 Mar 2024 09:13:30 +0900 Subject: [PATCH 03/11] Apply suggestions from code review Co-authored-by: Mark Hoemmen --- mdspan_copy/mdspan_copy.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/mdspan_copy/mdspan_copy.md b/mdspan_copy/mdspan_copy.md index 721d179c..a9740744 100644 --- a/mdspan_copy/mdspan_copy.md +++ b/mdspan_copy/mdspan_copy.md @@ -39,18 +39,18 @@ Finally, there is some question as to whether `copy` and `fill` should return a ```c++ template -void copy(const mdspan &src, const mdspan &dst) +void copy(mdspan src, mdspan dst); template -void copy(ExecutionPolicy&& policy, const mdspan &src, const mdspan &dst) +void copy(ExecutionPolicy&& policy, mdspan src, mdspan dst); ``` [1]{.pnum} *Constraints:* - * [1.1]{.pnum} `std::is_assignable_v::reference, typename mdspan::reference>` + * [1.1]{.pnum} `std::is_assignable_v::reference, typename mdspan::reference>` is 
`true`. - * [1.2]{.pnum} `mdspan::rank() == mdspan::rank()` + * [1.2]{.pnum} `mdspan::rank()` equals `mdspan::rank()`. [2]{.pnum} *Preconditions:* @@ -65,10 +65,10 @@ void copy(ExecutionPolicy&& policy, const mdspan -void fill(const mdspan &dst, const T &value) +void fill(mdspan dst, const T& value); template -void fill(ExecutionPolicy&& policy, const mdspan &dst, const T &value) +void fill(ExecutionPolicy&& policy, mdspan dst, const T& value); ``` [4]{.pnum} *Constraints:* `std::is_assignable_v::reference, const T &T>` From 6a4a3617d1efc680350df44a02dcf012b3088d4b Mon Sep 17 00:00:00 2001 From: Nicolas Morales Date: Fri, 22 Mar 2024 10:47:17 +0900 Subject: [PATCH 04/11] add section on std::linalg copy and header location --- P0009/wg21/data/index.yaml | 32 ++ mdspan_copy/mdspan_copy.html | 668 +++++++++++++++++++++++++++++++++++ mdspan_copy/mdspan_copy.md | 45 ++- 3 files changed, 739 insertions(+), 6 deletions(-) create mode 100644 mdspan_copy/mdspan_copy.html diff --git a/P0009/wg21/data/index.yaml b/P0009/wg21/data/index.yaml index f321068a..642f06aa 100644 --- a/P0009/wg21/data/index.yaml +++ b/P0009/wg21/data/index.yaml @@ -83181,6 +83181,14 @@ references: issued: year: 2019 URL: https://wg21.link/p0009r9 + - id: P0009R18 + citation-label: P0009R18 + title: "mdspan" + author: + - family: Christian Trott, D.S. Hollman, Damien Lebrun-Grandie, Mark Hoemmen, Daniel Sunderland, H. 
Carter Edwards, Bryce Adelstein Lelbach, Mauro Bianco, Ben Sander, Athanasios Iliopoulos, John Michopoulos, Nevin Liber + issued: + year: 2022 + URL: https://wg21.link/p0009r18 - id: P0010R0 citation-label: P0010R0 title: "Adding a subsection for concurrent random number generation in C++17" @@ -104753,6 +104761,14 @@ references: issued: year: 2019 URL: https://wg21.link/p1673r1 + - id: P1673R13 + citation-label: P1673R13 + title: "A free function linear algebra interface based on the BLAS" + author: + - family: Mark Hoemmen, Daisy Hollman, Christian Trott, Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr Luszczek, Timothy Costa + issued: + year: 2023 + URL: https://wg21.link/p1673r1 - id: P1674R0 citation-label: P1674R0 title: "Evolving a Standard C++ Linear Algebra Library from the BLAS" @@ -104889,6 +104905,14 @@ references: issued: year: 2019 URL: https://wg21.link/p1684r0 + - id: P1684R5 + citation-label: P1684R5 + title: "mdarray: An Owning Multidimensional Array Analog of mdspan" + author: + - family: Christian Trott, Daisy Hollman, Mark Hoemmen, Daniel Sunderland, Damien Lebrun-Grandie + issued: + year: 2023 + URL: https://wg21.link/p1684r5 - id: P1685R0 citation-label: P1685R0 title: "Make get/set_default_resource replaceable" @@ -107369,6 +107393,14 @@ references: issued: year: 2019 URL: https://wg21.link/p1999r0 + - id: P2630R4 + citation-label: P2630R4 + title: "Submdspan" + author: + - family: Christian Trott, Damien Lebrun-Grandie, Mark Hoemmen, Nevin Liber + issued: + year: 2023 + URL: https://wg21.link/p2630r4 - id: P3141 citation-label: P3141 title: "std::terminates()" diff --git a/mdspan_copy/mdspan_copy.html b/mdspan_copy/mdspan_copy.html new file mode 100644 index 00000000..c58b5ea1 --- /dev/null +++ b/mdspan_copy/mdspan_copy.html @@ -0,0 +1,668 @@ + + + + + + + + Copy and fill for mdspan + + + + + + + + +
+
+

Copy and fill for +mdspan

+ + + + + + + + + + + + + + + + + + +
Document #:
Date: 2024-03-22
Project: Programming Language C++
+
Reply-to: + Nicolas Morales
<>
+ Christian Trott
<>
+ Mark Hoemmen
<>
+ Damien Lebrun-Grandie
<>
+
+ +
+
+ +

1 Motivation

+

C++23 introduced mdspan ([P0009R18]), a nonowning multidimensional +array abstraction that has a customizable layout. Layout customization +was originally motivated in [P0009R18] with considerations for +interoperability and performance, particularly on different +architectures. Moreover, [P2630R4] introduced +submdspan, a slicing function that can yield arbitrarily +strided layouts. However, without standard library support, copying +efficiently between mdspans with mixes of complex layouts is challenging +for users.

+

Many applications that benefit from mdspan, including high-performance +computing (HPC), image processing, and computer graphics, would also +benefit from basic memory operations provided in standard +algorithms such as copy and fill. Indeed, the authors found that a copy +algorithm would have been quite useful in their implementation of the +copying mdarray ([P1684R5]) constructor. A more +constrained form of copy is also included in the standard +linear algebra library ([P1673R13]).

+

However, existing standard library facilities are not sufficient +here. Currently, mdspan does not have iterators or ranges +that represent the span of the mdspan. Additionally, it’s +not entirely clear what this would entail. +std::linalg::copy ([P1673R13]) is limited to +mdspans of rank 2 or lower.

+

Moreover, the manner in which an mdspan is copied (or +filled) is highly performance sensitive, particularly with regard to +caching behavior when traversing mdspan memory. A naïve user +implementation is easy to get wrong, in addition to being tedious for +higher-rank mdspans. Ideally, an implementation would be +free to use information about the layout of the mdspan +known at compile time to perform optimizations; e.g. a copy between +contiguous mdspans of trivial types could be implemented with a +memcpy.

+

Finally, providing these generic algorithms would also enable these +operations for types that are representable by mdspan. For +example, this would naturally include mdarray, which is +convertible to mdspan, or for user-defined types whose view +of memory corresponds to mdspans (e.g. an image class or +something similar).

+

1.1 Safety

+

Due to the closed nature of mdspan extents, copy +operations can be checked by the implementation to prevent past-the-end +writes. This is an advantage the proposed copy operation has over the +existing operations in the standard.

+

2 Design

+

The main design direction of this proposal is to provide methods for +copying and filling mdspans that may have differing layouts +and accessors, while allowing implementations to provide efficient +implementations for special cases. For example, if a copy occurs between +two mdspans with the same layout mapping type that is +contiguous and both use default_accessor, the intention is +that this could be implemented by a single memcpy.

+

Furthermore, accessors as a customization point should be enabled, as +with any other mdspan operation. For example, a custom +accessor that checks a condition inside of the access +method should still work and check that condition. It’s worth noting +that how much an implementation is able to optimize may be highly +sensitive to the use of custom accessors. For example, optimizations could +be disabled when using a custom accessor, even one that is identical to +the default accessor.

+

Finally, there is some question as to whether copy and +fill should return a value when applied to +mdspan, as the iterator and range-based algorithms do. We +believe that mdspan copy and fill should return void, as +there is no past-the-end iterator that they could reasonably return.

+ +

Currently, we are proposing adding copy and +fill algorithms on mdspan to header +<mdspan>. We considered other options, namely:

+
    +
  • <algorithm>: This would mean that users of +iterator-based algorithms would need to pull in +<mdspan>. On the other hand, this is where +iterator-based copy and fill live so may be +preferable in that sense.
  • +
  • <mdspan_algorithm> (or similarly any other new +header): This seems like overkill for two functions. However, in the +future, we may want to add new algorithms for mdspan that +are not strictly covered by existing algorithms in +<algorithm>, so this option may be more future +proof.
  • +
+

We settled on <mdspan> because as proposed this is +a relatively light-weight addition that reflects operations that are +commonly desired with mdspans. However, the authors are +open to changing this.

+

2.2 Existing copy in +std::linalg

+

[P1673R13] introduced several linear +algebra operations including std::linalg::copy. This +operation only applies to mdspans with rank ≤ 2. +This paper is proposing a version of copy that is +constrained to a superset of std::linalg::copy.

+

As proposed, simply adding copy could +cause the following code to be ambiguous, because ADL also finds +std::copy:

+
using std::linalg::copy;
+copy(mds1, mds2);
+

One possibility would be to remove std::linalg::copy, as +it is a subset of the proposed std::copy, though as of now +this paper does not propose to do this.

+

2.3 What the proposal does not +include

+
    +
  • std::move: Perhaps this should be included for +completeness’s sake. However, it doesn’t seem applicable to the typical +usage of mdspan.
  • +
  • (copy|fill)_n: As a multidimensional view, +mdspan does not in general follow a specific ordering. +Memory ordering may not be obvious to calling code, so it’s not even +clear how these would work. Any applications intending to copy a subset +of mdspan should call copy on the result +of submdspan.
  • +
  • copy_backward: As above, there is no specific ordering. +A similar effect could be achieved via transformations with a custom +layout, similar to layout_transpose in [P1673R13].
  • +
  • Other algorithms, including std::for_each. +for_each in particular is desirable but raises many +unanswered questions that should be addressed in a different paper.
  • +
+

3 Wording

+
template<class SrcElementType, class SrcExtents, class SrcLayoutPolicy, class SrcAccessorPolicy,
+         class DstElementType, class DstExtents, class DstLayoutPolicy, class DstAccessorPolicy>
+void copy(mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy> src, 
+          mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy> dst);
+
+template<class ExecutionPolicy, class SrcElementType, class SrcExtents, class SrcLayoutPolicy, class SrcAccessorPolicy,
+         class DstElementType, class DstExtents, class DstLayoutPolicy, class DstAccessorPolicy>
+void copy(ExecutionPolicy&& policy, mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy> src,
+          mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy> dst);
+

1 +Constraints:

+
    +
  • (1.1) +std::is_assignable_v<typename mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy>::reference, typename mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy>::reference> +is true.

  • +
  • (1.2) +mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy>::rank() +equals +mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy>::rank().

  • +
+

2 +Preconditions:

+
    +
  • (2.1) +src.extents() == dst.extents()

  • +
  • (2.2) +dst.is_unique()

  • +
  • (2.3) +there is no unique multidimensional index i... in +src.extents() where there exists a multidimensional index +j... in dst.extents() such that +src[i...] and dst[j...] refer to the same +element.

  • +
+

3 +Effects: for all unique multidimensional indices +i... in src.extents(), assigns +src[i...] to dst[i...]

+
template<class ElementType, class Extents, class LayoutPolicy, class AccessorPolicy, class T>
+void fill(mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy> dst, const T& value);
+
+template<class ExecutionPolicy, class ElementType, class Extents, class LayoutPolicy, class AccessorPolicy, class T>
+void fill(ExecutionPolicy&& policy, mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy> dst, const T& value);
+

4 +Constraints: +std::is_assignable_v<typename mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy>::reference, const T&> +is true.

+

5 +Preconditions: dst.is_unique()

+

6 +Effects: for all unique multidimensional indices +i... in dst.extents(), assigns +value to dst[i...]

+

4 References

+
+
+
[P0009R18]
Christian Trott, D.S. Hollman, Damien +Lebrun-Grandie, Mark Hoemmen, Daniel Sunderland, H. Carter Edwards, +Bryce Adelstein Lelbach, Mauro Bianco, Ben Sander, Athanasios +Iliopoulos, John Michopoulos, Nevin Liber. 2022. mdspan.
https://wg21.link/p0009r18
+
+
+
[P1673R13]
Mark Hoemmen, Daisy Hollman, Christian Trott, +Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien +Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr +Luszczek, Timothy Costa. 2023. A free function linear algebra interface +based on the BLAS.
https://wg21.link/p1673r1
+
+
+
[P1684R5]
Christian Trott, Daisy Hollman, Mark Hoemmen, +Daniel Sunderland, Damien Lebrun-Grandie. 2023. mdarray: An Owning +Multidimensional Array Analog of mdspan.
https://wg21.link/p1684r5
+
+
+
[P2630R4]
Christian Trott, Damien Lebrun-Grandie, Mark +Hoemmen, Nevin Liber. 2023. Submdspan.
https://wg21.link/p2630r4
+
+
+
+
+ + diff --git a/mdspan_copy/mdspan_copy.md b/mdspan_copy/mdspan_copy.md index a9740744..b89d289c 100644 --- a/mdspan_copy/mdspan_copy.md +++ b/mdspan_copy/mdspan_copy.md @@ -1,15 +1,24 @@ --- title: "Copy and fill for `mdspan`" date: today +author: + - name: Nicolas Morales + email: + - name: Christian Trott + email: + - name: Mark Hoemmen + email: + - name: Damien Lebrun-Grandie + email: --- # Motivation -C++23 introduced `mdspan` ([@P0009R18]), a nonowning multidmensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. Without standard library support, copying efficiently between mdspans with mixes of complex layouts is challenging for users. +C++23 introduced `mdspan` ([@P0009R18]), a nonowning multidmensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. However, without standard library support, copying efficiently between mdspans with mixes of complex layouts is challenging for users. -Many applications, including high-performance computing (HPC), image processing, computer graphics, etc that benefit from `mdspan` also would benefit from basic memory operations provided in standard algorithms such as copy and fill. Indeed, the authors found that a copy algorithm would have been quite useful in their implementation of the copying `mdarray` ([@P1684]) constructor. 
+Many applications, including high-performance computing (HPC), image processing, computer graphics, etc that benefit from `mdspan` also would benefit from basic memory operations provided in standard algorithms such as copy and fill. Indeed, the authors found that a copy algorithm would have been quite useful in their implementation of the copying `mdarray` ([@P1684R5]) constructor. A more constrained form of `copy` is also included in the standard linear algebra library ([@P1673R13]). -However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. Additionally, it's not entirely clear what this would entail. +However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. Additionally, it's not entirely clear what this would entail. `std::linalg::copy` ([@P1673R13]) is limited to `mdspans` of rank 2 or lower. Moreover, the manner in which an `mdspan` is copied (or filled) is highly performance sensitive, particularly in regards to caching behavior when traversing mdspan memory. A naïve user implementation is easy to get wrong in addition to being tedious for higher rank `mdspan`s. Ideally, an implementation would be free to use information about the layout of the `mdspan` known at compile time to perform optimizations; e.g. a continuous span `mdspan` copy for trivial types could be implementeed with a `memcpy`. @@ -27,11 +36,33 @@ Furthermore, accessors as a customization point should be enabled, as with any o Finally, there is some question as to whether `copy` and `fill` should return a value when applied to `mdspan`, as the iterator and ranged-based algorithms do. We believe that `mdspan` copy and fill should return void, as there is no past-the-end iterator that they could reasonably return. 
+## Header + +Currently, we are proposing adding `copy` and `fill` algorithms on `mdspan` to header ``. We considered other options, namely: + +* ``: This would mean that users of iterator-based algorithms would need to pull in ``. On the other hand, this is where iterator-based `copy` and `fill` live so may be preferable in that sense. +* `` (or similarly any other new header): This seems like overkill for two functions. However, in the future, we may want to add new algorithms for `mdspan` that are not strictly covered by existing algorithms in ``, so this option may be more future proof. + +We settled on `` because as proposed this is a relatively light-weight addition that reflects operations that are commonly desired with `mdspan`s. However, the authors are open to changing this. + +## Existing `copy` in `std::linalg` + +[@P1673R13] introduced several linear algebra operations including `std::linalg::copy`. This operation only applies to `mdspan`s with $rank \le 2$. This paper is proposing a version of `copy` that is constrained to a superset of `std::linalg::copy`. + +Right now the strict addition of `copy` would potentially cause the following code to be ambiguous, due to ADL-finding `std::copy`: + +```c++ +using std::linalg::copy; +copy(mds1, mds2); +``` + +One possibility would be to remove `std::linalg::copy`, as it is a subset of the proposed `std::copy`, though as of now this paper does not propose to do this. + ## What the proposal does not include * `std::move`: Perhaps this should be included for completeness's sake. However, it doesn't seem applicable to the typical usage of `mdspan`. * `(copy|fill)_n`: As a multidimensional view `mdspan` does not in general follow a specific ordering. Memory ordering may not be obvious to calling code, so it's not even clear how these would work. Any applications intending to copy a subset of `mdspan` should use call `copy` on the result of `submdspan`. -* `copy_backward`: As above, there is no specific ordering. 
A similar effect could be achieved via transformations with a custom layout, similar to `layout_transpose` in [@P1673]. +* `copy_backward`: As above, there is no specific ordering. A similar effect could be achieved via transformations with a custom layout, similar to `layout_transpose` in [@P1673R13]. * Other algorithms, include `std::for_each`. `for_each` in particular is a desirable but brings in many unanswered questions that should be addressed in a different paper. # Wording @@ -39,11 +70,13 @@ Finally, there is some question as to whether `copy` and `fill` should return a ```c++ template -void copy(mdspan src, mdspan dst); +void copy(mdspan src, + mdspan dst); template -void copy(ExecutionPolicy&& policy, mdspan src, mdspan dst); +void copy(ExecutionPolicy&& policy, mdspan src, + mdspan dst); ``` [1]{.pnum} *Constraints:* From 87390cd3b91e39cc44da9330b77336836eb86ee3 Mon Sep 17 00:00:00 2001 From: Nicolas Morales Date: Mon, 1 Apr 2024 11:04:12 -0700 Subject: [PATCH 05/11] minor typo fixes and clarify differences with std::linalg::copy --- P0009/wg21/data/index.yaml | 2 +- mdspan_copy/mdspan_copy.html | 64 ++++++++++++++++++++++++------------ mdspan_copy/mdspan_copy.md | 24 +++++++++----- 3 files changed, 60 insertions(+), 30 deletions(-) diff --git a/P0009/wg21/data/index.yaml b/P0009/wg21/data/index.yaml index 642f06aa..cf88f9c7 100644 --- a/P0009/wg21/data/index.yaml +++ b/P0009/wg21/data/index.yaml @@ -104768,7 +104768,7 @@ references: - family: Mark Hoemmen, Daisy Hollman, Christian Trott, Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr Luszczek, Timothy Costa issued: year: 2023 - URL: https://wg21.link/p1673r1 + URL: https://wg21.link/p1673r13 - id: P1674R0 citation-label: P1674R0 title: "Evolving a Standard C++ Linear Algebra Library from the BLAS" diff --git a/mdspan_copy/mdspan_copy.html b/mdspan_copy/mdspan_copy.html index c58b5ea1..825dd1ee 100644 --- 
a/mdspan_copy/mdspan_copy.html +++ b/mdspan_copy/mdspan_copy.html @@ -4,7 +4,7 @@ - + Copy and fill for mdspan