At 08:04 AM 3/17/2004, Bill Pugh wrote:
>OK, a question has come up regarding the semantics of volatile.
>There is a point on which there are two different interpretations.
>As far as the model goes, it can handle either interpretation.
>
>So we'd like to open the question up to discussion and debate by
>the broader JMM community.
>
>For each volatile, there is a total order over all accesses to that
>volatile.
>
>Strong interpretation:
>         There is a happens-before (or release/acquire) relationship from
>         each write to each latter read of that volatile.
>
>Weak interpretation:
>         There is a happens-before (or release/acquire) relationship from
>         each write to each latter read of that volatile _that sees that
>         write_.
>
>Here is an example that shows the difference:
>
>Initially, x = y = v = 0.
>v is a volatile variable.
>
>
>Thread 1:
>r1 = x
>v = 0
>r2 = v
>y = 1
>
>Thread 2:
>r3 = y
>v = 0
>r4 = v
>x = 1
>
>Is the behavior r1 == r2 == 1 possible?
There must be some kind of typo here. v is never assigned any value except 
0, so r2 can never be 1.  Is the intent to ask whether r1==r3==1 is possible?
>Under the weak interpretation, each thread might see its own volatile write,
>in which case the volatile accesses would have no impact. In fact, under
>the weak semantics, the compiler could eliminate the volatile reads and
>transform this to:
>
>Thread 1:
>r1 = x
>v = 0
>r2 = 0
>y = 1
>
>Thread 2:
>r3 = y
>v = 0
>r4 = 0
>x = 1
>
>Under the strong semantics, this would not be possible, since we would be
>guaranteed an hb edge from the first volatile write to both volatile
>reads.
>
>The definition of Lazy Release Consistency uses the weak definition.
>We are investigating what the actual behavior of a DSM like Treadmarks 
>would be.
>
>
>
>Another place where this makes a difference: back in 2000, Rob Strom 
>wanted guidance
>on whether his optimistic readers transformation could be implemented in Java.
>The strong semantics makes this possible (see email below). I'm not sure it
>is possible to do the optimistic reads transformation with the weak semantics.
>
>Thoughts?
>
>         Bill
>
>----
>         From:     pugh@cs.umd.edu
>         Subject:        JavaMemoryModel: Roach motel volatile ordering 
> and the optimistic readers trans
>         Date:   July 6, 2001 3:25:17 PM EDT
>         To:       javamemorymodel@cs.umd.edu
>
>At 1:09 PM +1000 6/29/01, David Holmes wrote:
>>It is not clear to me that volatile and monitors should have the same
>>semantics when it comes to reorderings. The basic "roach motel" approach
>>with synchronized blocks works fine (so we've been persuaded :) ) because of
>>the added mutual exclusion. With volatiles there is no mutual exclusion and
>>I am concerned that these ordering rules may inhibit the use of volatile in
>>implementing various wait-free/non-blocking algorithms.
>>
>>As an example, the optimistic readers transform of Strom and Auerbach (ref
>>ECOOP 2001) requires insertion of a dummy volatile read after a volatile
>>write to prevent subsequent non-volatile writes from being performed prior
>>to the volatile write. If volatiles are to be useful as flags in
>>wait-free/non-blocking algorithms then it seems to me that ordering is
>>critical and so volatile accesses should act as code motion barriers in both
>
>
>The load/acquire and store/release semantics for synchronization memory 
>accesses has a long history in the architecture community. For example, see:
>
>K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J.L. 
>Hennessy. Memory consistency and event ordering in scalable shared-memory 
>multiprocessors. In Proceedings of the 17th Annual International Symposium 
>on Computer Architecture, pages 15--26. IEEE, May 1990.
>
>Making volatile read/write be full two directional memory barriers will 
>likely at least double the cost of volatile reads on IA-64 and Alpha SMPs.
>
>Plus, you don't really need it. Here is an example of Rob's optimistic 
>readers idiom (from his paper):
>
>volatile int vno = 0;
>int a,b;
>synchronized void inc(int x) {
>   vno++;
>   a += x;
>   b += x;
>   vno++;
>   }
>
>/* unsynchronized */
>int prod() {
>   int v1 = vno; /* pre-inspect */
>   int t1 = a;
>   int t2 = b;
>   int v2 = vno; /* post-inspect */
>   if (v1 == v2 && v1%2 == 0)
>     return t1*t2; /* commit */
>   else ... /* abort */
>   }
>
>Rob's example only works if volatiles are 2-way memory barriers.
>Here is how you need to change it to make it work for volatiles that have 
>load.acquire and st.release semantics:
>
>
>volatile int vno = 0;
>volatile boolean force;
>int a,b;
>synchronized void inc(int x) {
>   int oldVersion = vno;
>   vno = oldVersion+1;
>   boolean ignore = force; /* Don't allow accesses to be hoisted */
>   a += x;
>   b += x;
>   vno = oldVersion+2;
>   }
>
>/* unsynchronized */
>int prod() {
>   int v1 = vno; /* pre-inspect */
>   int t1 = a;
>   int t2 = b;
>   force = false; /* force all accesses to be committed */
>   int v2 = vno; /* post-inspect */
>   if (v1 == v2 && v1%2 == 0)
>     return t1*t2; /* commit */
>   else ... /* abort */
>   }
>
>
>If the critical section of call to prod() completes before the critical 
>section of a call to inc(), then
>         the second read of vno in prod() occurs before
>           the first write to vno in inc()
>         therefore the write to force in prod() occurs before
>           the read of force in inc()
>         therefore all the memory accesses in the critical region of prod()
>           are ordered by the happens_before relation to be before the memory
>           accesses in the critical region of inc()
>
>If the critical section of call to inc() completes before the critical 
>section of a call to prod(), then
>         the second write to vno in inc() occurs before the first read of
>           vno in prod()
>         therefore all the memory accesses in the critical region of inc()
>           are ordered by the happens_before relation to be before the memory
>           accesses in the critical region of prod()
>
>Note that no one ever cares about the value stored into the force 
>variable. It is just used for establishing memory ordering.
>
>I am fairly confident that any situation that really depends on 2-way 
>memory barriers can be handled under acquire/release semantics be 
>inserting some additional dummy memory accesses. And this way, you will 
>only incur the costs of the additional memory/compiler barriers when you 
>really need them.
>
>For example, if you need a memory barrier:
>
>private volatile boolean force;
>
>final void memoryBarrier() {
>         force = false;
>         boolean ignore = force;
>         }
>
>Note that this memory barrier would only work between threads using the 
>same force field. If force were thread local, the compiler would be free 
>to optimize  away the memory barrier.
>
>I know that dummy volatile memory accesses are not intuitive. But this 
>isn't something that most people should be trying to do.
>
>         Bill
>
>
>-------------------------------
>JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel
>
>-------------------------------
>JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel
-------------------------------
JavaMemoryModel mailing list - http://www.cs.umd.edu/~pugh/java/memoryModel
This archive was generated by hypermail 2b29 : Thu Oct 13 2005 - 07:00:59 EDT