setup: delete inappropriate uses of hybrid programs #17
minsii wants to merge 1 commit into manjugv:section/library_setup from
In a hybrid SHMEM+MPI program, only processes that have initialized SHMEM can have valid PE numbers, and thus can be checked by shmem_addr_accessible and shmem_pe_accessible.
|
@agrippa @manjugv @swpoole @jamesaross @tonycurtis can you please review? |
|
Outside of the original SGI environment, Agree on dropping old text. |
|
I think this text is saying that you might (for example) have a job with different executables in different nodes, but which all call shmem_init(). As a result, they might all receive a valid PE # from SHMEM but programs running different executables will not be able to access each other's symmetric data (e.g. global variables). If these sorts of use cases are fundamentally invalid today, I think a change is needed but I think we need to replace it with other examples. It is not readily apparent how these routines are useful (particularly shmem_pe_accessible), and so some examples in the notes would be useful. |
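The scenario in this comment can be sketched as a toy model. It assumes, as the comment describes, that every process receives a valid PE number from shmem_init() but that global variables are symmetric only among PEs running the same executable. `ToyPE` and `global_addr_accessible` are hypothetical names used for illustration; they are not OpenSHMEM routines, and the executable names are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPE:
    """Hypothetical stand-in for a process that has a valid PE number."""
    pe: int
    executable: str  # which binary this PE was launched from (made-up names)


def global_addr_accessible(caller: ToyPE, target: ToyPE) -> bool:
    """Assumption of this model: a global variable is symmetric only
    between PEs that run the same executable."""
    return caller.executable == target.executable


# One job, three PEs, two different executables.
job = [ToyPE(0, "solver"), ToyPE(1, "solver"), ToyPE(2, "viz")]

# From PE 0's point of view: which PEs can reach its global variables?
reachable = {t.pe: global_addr_accessible(job[0], t) for t in job}
print(reachable)  # {0: True, 1: True, 2: False}
```

In this model every PE number is valid, yet PE 2 is not reachable for global data, which is the kind of case the old note seems to have had in mind.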
|
@tonycurtis I actually do not know how the user can use |
|
@agrippa PE numbers might not be unique across SHMEM partitions. E.g., 6 processes are initialized as two SHMEM partitions with PE numbers {0,1,2} and {0,1,2}. If P0 in the first partition checks the accessibility of a process in the second partition, the result will always be TRUE, which is inaccurate. To make the example valid, we need a way to get unique PE numbers across partitions, but SHMEM does not provide such functionality. |
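The numbering problem can be made concrete with a toy model. This is a sketch, not OpenSHMEM code: `Partition` and `pe_accessible` are hypothetical names, and the model assumes that an accessibility check can only interpret a bare PE number relative to the caller's own partition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Toy model of one isolated SHMEM partition (not an OpenSHMEM type)."""
    name: str
    pes: tuple  # PE numbers assigned inside this partition


def pe_accessible(caller: Partition, pe: int) -> bool:
    """Model of a check keyed only on a PE number: the integer carries no
    partition id, so it is resolved against the caller's own partition."""
    return pe in caller.pes


# Overlapping numbering: both partitions use {0, 1, 2}.
a = Partition("A", (0, 1, 2))
b = Partition("B", (0, 1, 2))

# P0 in A wants to ask about "PE 1 in B" but can only pass the integer 1.
# The query resolves to A's own PE 1, so the answer says nothing about B.
overlapping_result = pe_accessible(a, b.pes[1])

# Disjoint numbering: {0, 1, 2} and {3, 4, 5}.
c = Partition("C", (3, 4, 5))
disjoint_result = pe_accessible(a, c.pes[1])

print(overlapping_result, disjoint_result)  # True False
```

With overlapping numbering the check is always TRUE for any PE number that exists in the second partition, regardless of whether that process is actually reachable.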
|
@minsii A SHMEM program with non-unique PE IDs is not a compliant SHMEM program, so that isn't a case I'm worried about. |
|
Maybe |
|
Maybe we need to bring the usefulness of these APIs up as a discussion item. |
|
@agrippa I think we considered the SHMEM initialization in different ways. In my example, the two partitions are initialized as isolated environments, so PE numbers are guaranteed to be unique only inside a partition. Nevertheless, I think the notes are already confusing and the examples are unusual. It might be better to replace them with better examples, such as a liveness check as suggested by @tonycurtis |
|
We should bring this up at the plenary and see if these interfaces need to be deprecated/removed. |
|
On Feb 5, 2020, at 4:28 PM, Min Si ***@***.***> wrote:
@agrippa <https://github.com/agrippa> I think we considered the SHMEM initialization in different ways. In my example, the two partitions are initialized as isolated environments, thus it guarantees unique #PE only inside a partition. Nevertheless, I think the notes are already confusing and the examples are unusual. It might be better to replace them with better examples, such as liveness check as suggested by @tonycurtis <https://github.com/tonycurtis>
Of course, if I think a PE is alive “now”, it doesn’t mean it necessarily will be when I later try to send something to it. Bit of a race hazard.
Maybe the addr_accessible routine is a better fit as it implies that some communication is imminent, which I guess would be the main reason to check.
Tony
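The race hazard described here is a check-then-use gap. A minimal sketch, assuming a remote PE's liveness can change at any moment; `ToyPE` and `check_then_send` are hypothetical names and no OpenSHMEM call is modeled:

```python
class ToyPE:
    """Hypothetical stand-in for a remote PE whose liveness can change."""

    def __init__(self):
        self.alive = True


def check_then_send(pe, between=lambda: None):
    """Check liveness first, then 'send'.

    `between` models whatever happens in the gap between the check
    and the communication that follows it.
    """
    if not pe.alive:
        return "skipped"
    between()  # the PE may fail here, after the check already passed
    return "delivered" if pe.alive else "lost"


stable = ToyPE()
flaky = ToyPE()

results = (
    check_then_send(stable),
    # The PE dies right after the liveness check succeeded.
    check_then_send(flaky, between=lambda: setattr(flaky, "alive", False)),
)
print(results)  # ('delivered', 'lost')
```

The second call passes the check and still loses the message, so a TRUE answer only describes the moment of the query.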
|
|
Please do not add "Notes: none" sections. Just delete these sections, per openshmem-org#330. |
|
@jdinan Thanks, will fix. I was not able to catch the conclusion at the F2F meeting. Can someone please remind me which of the following options we want to go with for 1.5?
|
|
@minsii There was no action item for 1.5. We will address this more thoroughly for 1.6. |
Fix prototype type typos in deprecated reductions
Changelog: Reorder removal of SHMEM_CACHE
RM data types from memory ordering figures
Improve use of "non" vs. "non-"
|
@manjugv Should we assign this to someone else? It looks like an easy change. |
|
The change to |
The notes for shmem_pe_accessible and shmem_addr_accessible have several issues as described below. Thus, I'd suggest we delete these notes.
In a hybrid program, only processes that have initialized SHMEM can have valid PE numbers, and thus can be checked by shmem_pe_accessible.
Same issue here, it is unclear how we can specify the PE of a remote process that exists in a different network partition. E.g., the PE numbering of two partitions can be {0,1,2} + {0,1,2}, or {0,1,2} + {3,4,5}. If P0 in the first partition checks the accessibility of a process in the second partition, the result is always TRUE in the former case.
It is unclear how we can specify the SHMEM PE of a process that initializes only MPI.