Date: Wed, 2 Apr 2003 11:16:00 +0200
From: Jesper Larsson Traeff
To: mpi-21@XXXXXXXXXXXXX
Cc: Jesper Larsson Traeff
Subject: Re: Correction to One-sided communications, Section 6.7: Semantics and Correctness
References: <20030401125309.GA21013@XXXXXXXXXXXXXXXXXXXX>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.4i
X-Virus-Scanned: by amavisd-milter (http://amavis.org/
X-Spam-Status: No, hits=-3.0 required=5.0 tests=IN_REP_TO,DEAR_SOMEBODY,GUARANTEE version=2.21
X-Spam-Level:
Sender: owner-mpi-21@XXXXXXXXXXXXX
Precedence: bulk
Reply-To: mpi-21@XXXXXXXXXXXXX

Dear Rajeev and Dick,

I see your points. However, I still think that there's a mistake in the
standard. My reasoning is as follows:

Intuitively, the semantic rules state that the one-sided model and its
different synchronization modes can be used for communication as one would
expect: updates performed in epoch i are properly visible in epoch i+1.
No more than that is said; and to give implementers maximum freedom, only
the *latest* point at which public updates should become visible in private
memory (and vice versa) is defined. The rules also guarantee that the three
synchronization modes (fence, post-wait, lock) can be *mixed freely*.

Now, I think rules 5 and 6 (with respect to post-wait) specify too late a
time for an update for the above to hold:

On Tue, Apr 01, 2003 at 11:12:16AM -0600, Rajeev Thakur wrote:
> Jesper,
> I think the wording is correct the way it is. Rule 5 specifies when a
> local update made by a process to its own window becomes visible to others.
> In the case of post-wait synchronization, it becomes visible at the latest
> when MPI_Win_post is called by that process. MPI_Win_post "exposes" the
> local window to other processes, so it is right to say that the local update
> must have occurred before post is called, and post makes it visible to other
> processes.
>

In post-wait epoch i process A makes private updates; the epoch is ended
with MPI_WIN_WAIT. In epoch i+1 another process accesses memory of
process A, using MPI_WIN_LOCK to open epoch i+1. As per rule 5 (as it is
now) IT IS NOT GUARANTEED THAT THE PRIVATE UPDATES ARE VISIBLE IN THE
PUBLIC WINDOW.

I therefore think that MPI_WIN_WAIT (the one that ends epoch i) should
enforce that the private update becomes publicly visible.

Another problem with rule 5 (as it is now, regarding only MPI_WIN_POST)
is that it somewhat contradicts rule 3, which says that WIN_WAIT completes
updates on the target. The private updates could have been done by
MPI_PUTs, and thus, according to rule 3, should be completed by the WAIT.

> Similarly, Rule 6 specifies when the update made by a remote process becomes
> visible to the local process. In the case of post-wait synchronization,
> clearly, it is MPI_Win_wait that "completes" the operation, and therefore
> only after win_wait is called can one expect the update made by the remote
> process to be visible to the local process.
>

By the same reasoning as above, I think MPI_WIN_WAIT is too late. Imagine
a process doing an update on process A's private window in lock-epoch i.
Epoch i is ended by MPI_WIN_UNLOCK. If process A now opens epoch i+1 (and
exposes its window to itself) with MPI_WIN_POST, rule 6, as it is now, does
not guarantee that the update on A's private window is indeed visible to A
in epoch i+1.

It seems that MPI_WIN_POST must enforce that the public update in epoch i
is privately visible in epoch i+1.

----

The rules, esp. regarding lock, are really admirably cleverly thought out,
and make a certain implementation of lock-unlock synchronization possible
on architectures that are not cache-coherent. I think the same must have
been intended for post-wait synchronization, as I tried to argue above.
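
The two scenarios above can be played through with a small toy model. This
is only a sketch and not MPI code: the class Window, the function names and
the values 42 and 7 are hypothetical, and the model assumes the "laziest"
implementation the rules permit, i.e. one that reconciles the private and
public copy only at the calls named in rules 5 and 6. The "as written" sets
are the calls the rules name now; the "proposed" sets swap post and wait.

```python
# Toy model, NOT real MPI: one window location with a private and a public copy.
# All names here (Window, owner_sync, ...) are hypothetical illustrations.

# Owner calls at which a private update must, at the latest, reach the public copy.
RULE5_AS_WRITTEN = frozenset({"fence", "post", "unlock"})
RULE5_PROPOSED = frozenset({"fence", "wait", "unlock"})

# Owner calls at which a public update must, at the latest, reach the private copy.
RULE6_AS_WRITTEN = frozenset({"fence", "wait", "lock"})
RULE6_PROPOSED = frozenset({"fence", "post", "lock"})


class Window:
    """Laziest permitted implementation: copies are reconciled only when a rule forces it."""

    def __init__(self, rule5, rule6):
        self.private = 0
        self.public = 0
        self.private_dirty = False
        self.public_dirty = False
        self.rule5 = rule5
        self.rule6 = rule6

    def local_store(self, value):
        # Owner writes its own window memory: only the private copy changes.
        self.private = value
        self.private_dirty = True

    def remote_put(self, value):
        # A put from another process: only the public copy changes.
        self.public = value
        self.public_dirty = True

    def owner_sync(self, call):
        # Synchronization call executed by the window owner.
        if self.private_dirty and call in self.rule5:
            self.public = self.private
            self.private_dirty = False
        if self.public_dirty and call in self.rule6:
            self.private = self.public
            self.public_dirty = False


def private_update_then_lock(rule5, rule6):
    """Scenario 1: A updates privately in post-wait epoch i, B reads in lock epoch i+1."""
    win = Window(rule5, rule6)
    win.owner_sync("post")   # A opens exposure epoch i
    win.local_store(42)      # A's private update
    win.owner_sync("wait")   # A closes epoch i
    return win.public        # what B's get inside its lock epoch can see


def remote_put_then_post(rule5, rule6):
    """Scenario 2: B puts in lock epoch i, A opens epoch i+1 with post and reads locally."""
    win = Window(rule5, rule6)
    win.remote_put(7)        # B's put, completed by B's unlock
    win.owner_sync("post")   # A opens epoch i+1
    return win.private       # what A's local load can see


if __name__ == "__main__":
    for label, r5, r6 in (("as written", RULE5_AS_WRITTEN, RULE6_AS_WRITTEN),
                          ("proposed", RULE5_PROPOSED, RULE6_PROPOSED)):
        print(label, private_update_then_lock(r5, r6), remote_put_then_post(r5, r6))
```

In this model the stale values show up only with the "as written" sets; a
real implementation is of course free to make updates visible earlier.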

If I understand the intention of the rules correctly, the subtle point is
that window communication does not immediately fit with point-to-point and
collective communication. In order to access data in a window, a user must
really open an epoch, and do the access there; otherwise it is not
guaranteed that private/public copies are consistent. In particular, ending
a phase in an application where one-sided communication has been done with
a Barrier does not ensure consistency!!! One possible recommendation is to
end such a phase with an MPI_WIN_FENCE(MODE_NO_SUCCEED...) instead of the
barrier.

If I'm right in my interpretation, I think some "advice to users" or a
clarifying example would be in order.

Jesper

> Rajeev
>
> >
> > -----Original Message-----
> > From: owner-mpi-21@mpi-forum.org [mailto:owner-mpi-21@mpi-forum.org]On
> > Behalf Of Jesper Larsson Traeff
> > Sent: Tuesday, April 01, 2003 6:53 AM
> > To: mpi-21@mpi-forum.org
> > Subject: Correction to One-sided communications, Section 6.7: Semantics
> > and Correctness
> >
> >
> >
> > Dear MPI-2 Group,
> >
> > the list has been silent for quite some time. Here's an error (I think)
> > and an issue concerning semantics of one-sided communications, that might
> > reactivate discussions.
> >
> > MPI-2, Section 6.7: Semantics and Correctness of one-sided communications
> >
> > The rules on p.138 explain when operations are to complete at origins and
> > targets, and in particular when updates become visible in public and
> > private copies of windows.
> >
> > To my understanding:
> > Rule 5 should state that an update to a location in a private window copy
> > of a process becomes visible in the public window when that process
> > performs a synchronization operation to end the current exposure epoch,
> > that is, an MPI_WIN_FENCE, MPI_WIN_UNLOCK or MPI_WIN_WAIT (and *not*, as
> > written, MPI_WIN_POST).
> >
> > Rule 6 should state that an update to a public window becomes visible in
> > the private window of the process owning the locations when the window
> > owner opens the next exposure epoch, that is, after the next
> > MPI_WIN_FENCE, MPI_WIN_LOCK or MPI_WIN_POST (and *not*, as written,
> > MPI_WIN_WAIT).
> >
> > Thus, it seems that in rules 5 and 6 the roles of MPI_WIN_POST and
> > MPI_WIN_WAIT were mistakenly swapped. Is that so?
> >
> > The issue here is quite subtle and maybe some "advice to users" should be
> > added. The MPI-2 standard apparently only guarantees e.g. cache coherence
> > (on a window) after opening the next access epoch (Rule 6). Barrier
> > synchronization or explicit synchronization with send/recv's alone does
> > not guarantee this! This can likely confuse users who end, say, a phase in
> > an application using MPI-lock synchronization with an explicit MPI_Barrier,
> > and after the barrier expect all data now to be visible in the private
> > window copies. In some MPI implementations or for some architectures that
> > may not be so!
> >
> > Regards
> >
> > Jesper
> >
> >
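
The barrier pitfall described in the quoted message can be illustrated with
a very small toy model. Again this is a sketch and not MPI code: the
function local_read_after and its parameters are hypothetical, and the model
assumes an implementation that copies the public window copy into the
private one only at the calls that rule 6 names (wait, fence, lock), which
is the latest point the rule allows.

```python
# Toy model, NOT real MPI. Hypothetical names throughout.

# Owner calls that rule 6 (as written) names as the latest point at which
# a remote update to the public copy must be visible in the private copy.
RULE6_CALLS = frozenset({"fence", "wait", "lock"})


def local_read_after(phase_end_call, put_value=7, old_value=0):
    """Value the window owner may read locally after a remote put.

    phase_end_call is the call the owner uses to end the communication
    phase, e.g. "barrier" or "fence". Models the laziest permitted
    implementation, so the result is the worst case, not the only outcome.
    """
    private = old_value
    public = put_value          # the remote put has landed in the public copy
    if phase_end_call in RULE6_CALLS:
        private = public        # synchronization on the window reconciles the copies
    return private              # a barrier is not a window call, so nothing happens


if __name__ == "__main__":
    print("after barrier:", local_read_after("barrier"))
    print("after fence:  ", local_read_after("fence"))
```

On a cache-coherent machine a real implementation would most likely show
the new value after the barrier as well; the model only shows what the
rules allow, not what any particular library does.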