Rob Pike on Concurrency and Message passing in Newsqueak

Patrick has the details.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

A Concurrent Window System

The paper referenced in the talk (if I remember correctly) is short and worth reading A Concurrent Window System.

interesting comment on the linked page

gvwilson said...

I used Occam (the transputer implementation of CSP) very heavily in the 1980s and early 1990s, and eventually started referring to channels as "the return of the GOTO", since in any moderately complex application, you spent a lot of time wondering, "If I put bytes in *here*, who will they go to?" Addressable actors and/or tuple spaces both felt much more scalable (in the coding sense).

CPS and channel passing concurrency

Yes, but the difference is that you can pass in where to goto. This is like a return address or a continuation. So the real issue was having to write in an explicit continuation passing style.

Communication in the actor

Communication in the actor model (in which actors are directly addressed, as opposed to channel-based communications) also resembles continuation passing style. Check out Carl Hewitt's 1976 paper "Viewing Control Structures as Patterns of Passing Messages" for lots of examples.

However, it seems to me that the criticism of channels in occam is exactly that the programmer cannot be sure "where to goto" -- with directly addressed actors, only the directly designated recipient can receive a message. With channels, if that channel has since been passed on to another component (or, in models that allow it, if multiple components are listening on a single channel), programs risk exactly the "return of the GOTO" effect described.

CSP library approaches

CSP Library implementations are more accessible for most of us - the Communicating Process Architectures 2007 conference included papers on Java, C++ and Python CSP libraries.

CSP for Java programmers, Part 1

CSP for Java programmers, Part 2

(Also iirc JCSP Networking Edition is due to be re-incorporated into GPL'd JCSP.)

Maybe wrapping the library in Scala would provide some fun with notation.

references in semantics?

The biggest question I had after watching this video were the "value semantics" Rob alludes to. Is there really no sharing? Does this mean that even a simple procedure call performs a deep copy of all arguments? I doubt it -- it would be pretty fundamentally at odds with procedural abstraction to incur such costs for a simple function call.

But more significantly for a concurrent language: how do you prevent sharing mutable storage between dynamically created processes in a lexically scoped language? For example:

x := ...
begin prog(){ ... x ... }
x = ...

Does begin cut off lexical scope? If so, that brings up questions of what it does have in scope. If not, then it also brings up even nastier issues: for example, what happens if the inner process refers to a variable in the outer process that has not yet been initialized? I.e., even if the data is not mutable in a single process, another process can try to read from it before its initializer finishes evaluating, which means you still have to implement it with a lock.

To put it another way, concurrency, like other control features, can expose underlying mutation in the model of a language even where there isn't explicit mutation. Another famous example is the exposure of letrec's statefulness by call/cc.

unanswered

The implementation paper discusses this a bit. I've not read it all yet.

There is a copy-on-write mechanism for sharing structures and a "rec" keyword for recursive functions if not structures.

The part I wonder about as well is that Newsqueak has variable assignment. So how does it treat closures used to spawn processes?

I thought Pike's statement that there are no references is curious: he apparently means no mutable cells within structures, but references from variables to values.

a clue

This brief reference manual contains a hint in the last section ("implementation bugs"):

The current implementation does not allow local variables to exist past the lifetime of the prog in which they were created. ... For example,

bad1:=prog() {
    a:int;
    p:=prog(){
        a=1;
    };
};

draws an error. To mitigate the problem somewhat, squint internally rewrites the program

bad2:=prog(){
    a:int;
    begin prog(){
        a=1;
    }();
};

into the form

bad2:=prog(){
    a:int;
    begin prog(a:int){
        a=1;
    }(a);
};

... local progs not begun as processes must access only globals and variables local to themselves.

So at least at the time of the writing of that manual (April 1994), the sequential portion of the language didn't support closures, but spawned functions were lambda-lifted.

Notice that if there are recursive data structures, you could bind a recursive binding to an initialization expression that spawns a process which reads from that recursive binding, and with the closure conversion described above, the unspecified uninitialized value would be sent to the spawned process.

If recursion is only allowed via a limited function declaration syntax, I don't think there's a bug.

I still think this is one of the more challenging aspects of the design of value-oriented, concurrent languages.

http://go/squintingatpowerseries

OK, stupid question. What is the full domain name of the "go" in the doc refs? I know it's not go.com, go.bell-labs.com, go.google.com, or just shorthand for google.com. Great talk BTW!

internal google site

I'm assuming it is an address inside the google firewall.

Yes, but it redirects to...

Yes it is. Fortunately, it redirects to Squinting at Power Series which should be publicly available.

Lots of room for future work...

I was in a corporate lab 15 years ago that built a similar system. We designed very large real-time systems in a language based on CSP concurrency. We then generated real-time code for a variety of hardware systems, from giant servers to smartcards. The interesting part is eliminating as many processes and channels as you can by doing something like CPS analysis. You might be relying on some of our code right now... ;-)