setup: delete inappropriate uses of hybrid programs by minsii · Pull Request #17 · manjugv/specification

minsii · 2020-02-05T20:25:36Z

The notes for shmem_pe_accessible and shmem_addr_accessible have several issues as described below. Thus, I'd suggest we delete these notes.

shmem_pe_accessible:

...when an MPI job uses Multiple Program Multiple Data (MPMD) mode, 
multiple executable MPI programs are executed as part of the same MPI job. 
In such cases, OpenSHMEM support may only be available between processes 
running from the same executable file.

In a hybrid program, only processes that have initialized SHMEM can have valid PE numbers, and thus can be checked by shmem_pe_accessible.

In addition, some environments may allow a hybrid job to span multiple network partitions. 
In such scenarios, OpenSHMEM support may only be available between PEs within the 
same partition.

The same issue applies here: it is unclear how we can specify the PE of a remote process that exists in a different network partition. For example, the PE numbering of two partitions can be {0,1,2} + {0,1,2}, or {0,1,2} + {3,4,5}. If P0 in the first partition checks the accessibility of a process in the second partition, the result is always TRUE in the former case, because the PE number it passes also exists in its own partition.
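The numbering argument above can be sketched with a small toy model in C. This is only an illustration of the reasoning, not OpenSHMEM code: `proc_t`, `overlapping`, `disjoint`, and `pe_known_to_caller` are hypothetical names, and the model assumes a check that receives nothing but a PE number and can look it up only in the caller's own partition.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names are OpenSHMEM API. */
typedef struct {
    int partition; /* which isolated SHMEM environment the process is in */
    int pe;        /* PE number assigned inside that partition */
} proc_t;

/* Overlapping numbering: {0,1,2} + {0,1,2}. */
static const proc_t overlapping[6] = {
    {0, 0}, {0, 1}, {0, 2}, {1, 0}, {1, 1}, {1, 2}
};

/* Disjoint numbering: {0,1,2} + {3,4,5}. */
static const proc_t disjoint[6] = {
    {0, 0}, {0, 1}, {0, 2}, {1, 3}, {1, 4}, {1, 5}
};

/* Models a check that is handed only a PE number: the caller can resolve
 * that number in its own partition and has no way to name another one. */
static bool pe_known_to_caller(const proc_t *procs, size_t n,
                               int caller_partition, int pe)
{
    for (size_t i = 0; i < n; i++) {
        if (procs[i].partition == caller_partition && procs[i].pe == pe)
            return true;
    }
    return false;
}
```

In this model, `pe_known_to_caller(overlapping, 6, 0, 1)` is true even if the caller meant the process numbered 1 in the second partition, while `pe_known_to_caller(disjoint, 6, 0, 4)` is false. Neither answer says anything about whether the process in the other partition is actually reachable.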

shmem_addr_accessible

...when an MPI job uses MPMD mode, multiple executable MPI programs may use OpenSHMEM 
routines. In such cases, static memory, such as a C global variable, is symmetric between 
processes running from the same executable file, but is not symmetric between processes 
running from different executable files....

It is unclear how we can specify the SHMEM PE of a process that initializes only MPI.

In a hybrid SHMEM+MPI program, only processes that have initialized SHMEM can have valid PE numbers, and thus can be checked by shmem_addr_accessible and shmem_pe_accessible.

minsii · 2020-02-05T20:47:47Z

@agrippa @manjugv @swpoole @jamesaross @tonycurtis can you please review?

tonycurtis · 2020-02-05T20:50:55Z

Outside of the original SGI environment, shmem_pe_accessible still seems to have potential use e.g. as a simple fault tolerance check. shmem_addr_accessible could be used as an is-it-symmetric? check.

Agree on dropping old text.

agrippa · 2020-02-05T20:56:35Z

I think this text is saying that you might (for example) have a job with different executables in different nodes, but which all call shmem_init(). As a result, they might all receive a valid PE # from SHMEM but programs running different executables will not be able to access each other's symmetric data (e.g. global variables).

If these sorts of use cases are fundamentally invalid today, I think a change is needed but I think we need to replace it with other examples. It is not readily apparent how these routines are useful (particularly shmem_pe_accessible), and so some examples in the notes would be useful.
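As one possible shape for such an example, here is a minimal sketch in C of guarding remote access with both checks. `collect_reachable_pes` is a hypothetical helper, and `demo_pe_ok` / `demo_addr_ok` are stand-ins that mimic the MPMD scenario described above (every PE is valid, but a global variable is symmetric only among PEs 0 to 2, assumed here to run the same executable). The predicate types mirror the OpenSHMEM C prototypes `int shmem_pe_accessible(int pe)` and `int shmem_addr_accessible(const void *addr, int pe)`.

```c
#include <stddef.h>

/* Same shapes as shmem_pe_accessible and shmem_addr_accessible. */
typedef int (*pe_check_fn)(int pe);
typedef int (*addr_check_fn)(const void *addr, int pe);

/* Hypothetical helper: stores in `out` every PE in [0, npes) for which
 * both checks pass and returns how many were stored. `out` must have
 * room for at least `npes` entries. */
static int collect_reachable_pes(const void *addr, int npes,
                                 pe_check_fn pe_ok, addr_check_fn addr_ok,
                                 int *out)
{
    int count = 0;
    for (int pe = 0; pe < npes; pe++) {
        if (pe_ok(pe) && addr_ok(addr, pe))
            out[count++] = pe;
    }
    return count;
}

/* Stand-in: every PE called shmem_init(), so every PE number is valid. */
static int demo_pe_ok(int pe)
{
    (void)pe;
    return 1;
}

/* Stand-in: the global is symmetric only on PEs 0..2, which are assumed
 * to run the same executable as the caller. */
static int demo_addr_ok(const void *addr, int pe)
{
    (void)addr;
    return pe < 3;
}
```

In a real OpenSHMEM program the stand-ins would be replaced by the library routines, for example `collect_reachable_pes(&counter, shmem_n_pes(), shmem_pe_accessible, shmem_addr_accessible, peers)` after `shmem_init()`, where `counter` is a global variable and `peers` is an array with room for `shmem_n_pes()` entries. The result is only a snapshot taken at the time of the call.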

minsii · 2020-02-05T21:00:24Z

@tonycurtis shmem_addr_accessible returns TRUE also for an address that can be accessed by OpenSHMEM routines.

The return value is 1 if the local address addr is also a symmetric address and
the given data object is accessible via OpenSHMEM routines on the specified remote PE;

I actually do not know how the user can use shmem_addr_accessible as shmem_pe_accessible already validates the support of symmetric data objects.

minsii · 2020-02-05T21:06:06Z

@agrippa The #PE across SHMEM partitions might not be unique. E.g., 6 processes are initialized as two SHMEM partitions with #PE {0,1,2} and {0,1,2}. If P0 in the first partition checks the accessibility of a process in the second partition, the result will always be TRUE, which is inaccurate.

To make the example valid, we need a way to get unique #PE across partitions, but SHMEM does not provide such functionality.

agrippa · 2020-02-05T21:07:18Z

@minsii A SHMEM program with non-unique PE IDs is not a compliant SHMEM program, so that isn't a case I'm worried about.

tonycurtis · 2020-02-05T21:08:41Z

Maybe shmem_pe_accessible could turn into a liveness check? Do you know if people use these in the wild?

agrippa · 2020-02-05T21:11:07Z

Maybe we need to bring the usefulness of these APIs up as a discussion item.

minsii · 2020-02-05T21:28:43Z

@agrippa I think we considered the SHMEM initialization in different ways. In my example, the two partitions are initialized as isolated environments, thus it guarantees unique #PE only inside a partition. Nevertheless, I think the notes are already confusing and the examples are unusual. It might be better to replace them with better examples, such as liveness check as suggested by @tonycurtis

manjugv · 2020-02-05T21:30:02Z

We should bring this up at the plenary and see if these interfaces need to be deprecated/removed.

tonycurtis · 2020-02-05T21:33:38Z

On Feb 5, 2020, at 4:28 PM, Min Si wrote: @agrippa I think we considered the SHMEM initialization in different ways. In my example, the two partitions are initialized as isolated environments, thus it guarantees unique #PE only inside a partition. Nevertheless, I think the notes are already confusing and the examples are unusual. It might be better to replace them with better examples, such as liveness check as suggested by @tonycurtis

Of course, if I think a PE is alive “now”, it doesn’t mean it necessarily will be when I later try to send something to it. Bit of a race hazard. Maybe the addr_accessible routine is a better fit as it implies that some communication is imminent, which I guess would be the main reason to check. Tony

jdinan · 2020-02-10T15:42:23Z

Please do not add "Notes: none" sections. Just delete these sections, per openshmem-org#330.

minsii · 2020-02-10T16:51:10Z

@jdinan Thanks, will fix.

I was not able to catch the conclusion at the F2F meeting. Can someone please remind me which of the following options we want to go with for 1.5?

- Deprecate shmem_pe_accessible or shmem_addr_accessible, or both
- Delete the example notes for both routines
- Replace the examples with more common use cases for both routines
- No change

manjugv · 2020-02-11T00:38:30Z

@minsii There was no action item for 1.5. We will address this more thoroughly for 1.6.

naveen-rn · 2024-05-30T17:56:52Z

@manjugv Should we assign this to someone else? It looks like an easy change.

jdinan · 2024-09-26T18:39:24Z

The change to shmem_addr_accessible looks fine. I think there are more fundamental questions about shmem_pe_accessible and perhaps the most sensible course of action would be to deprecate the function.

setup: delete inappropriate uses of hybrid programs

801e098

In a hybrid SHMEM+MPI program, only processes that have initialized SHMEM can have valid PE numbers, and thus can be checked by shmem_addr_accessible and shmem_pe_accessible.

tonycurtis approved these changes Feb 5, 2020

manjugv mentioned this pull request Feb 5, 2020

Clarify notes or remove shmem_pe_accesible and shmem_addr_accessible openshmem-org/specification#360

Open

manjugv added the OpenSHMEM-1.6 label Feb 14, 2020

manjugv self-assigned this Jun 8, 2020

manjugv pushed a commit that referenced this pull request Oct 26, 2020

Merge pull request #17 from nspark/fix/reduce-typo

c6b1204

Fix prototype type typos in deprecated reductions

manjugv pushed a commit that referenced this pull request Oct 26, 2020

Merge pull request #17 from BryantLam/changelog-reorder-shmem_cache

47da29e

Changelog: Reorder removal of SHMEM_CACHE

manjugv pushed a commit that referenced this pull request Oct 26, 2020

Merge pull request #17 from naveen-rn/rm-dt-mo-fig

afb0edc

RM data types from memory ordering figures

manjugv pushed a commit that referenced this pull request Oct 26, 2020

Merge pull request #17 from nspark/fix/non

fddb357

Improve use of "non" vs. "non-"

6 participants
