Multiprogramming: from Symmetries to Cases

Dated: 

early 1963

Dijkstra tackled the problem of allowing multiple users share the university's X8 computer and its peripheral hardware devices (EWD 51). To solve this problem, Dijkstra first reformulated it in terms of as many symmetries as he could find, thereby obtaining a more general problem description. Afterwards, he introduced case distinctions in a controlled manner, gradually refining the general problem into an X8-specific problem (by resorting to flip-flops, compact storage techniques, and what we today call caching).

Symmetries

Starting with Dijkstra's generalizations, here are four ways in which he transformed the initial problem into a more symmetric one:

  1. Dijkstra removed the distinction between a user's program and a peripheral hardware device by viewing both as a process (— each of which shares the memory of the X8).
  2. Dijkstra removed the distinction between a non-sequential and a sequential process. A non-sequential process can be redefined in terms of two or more sequential processes (that communicate with each other via the shared memory). Therefore, Dijkstra only considered sequential processes. 
  3. Dijkstra removed the distinction between terminating and cyclic processes. A terminating process can be redefined as a cyclic process. Therefore, Dijkstra only considered cyclic processes. 
  4. The number of processes can fluctuate: processes can suddenly appear and disappear. A fixed number is, after all, a special case of a changing number; whence Dijkstra's preference for the latter. 

Dijkstra subsequently used the word “machine” instead of “sequential cyclic process” in the rest of his exposition. The general problem thus amounted to: allowing a changing number of machines to communicate with each other in an orderly fashion by means of a shared memory.

Afterwards, Dijkstra introduced the notions “semaphore”, “V operation”, “P operation”, “atomic action”, etc. — concepts that I take for granted here — and illustrated how they can be used to synchronize two or more “communicating machines”. (Today we would rather use the words: “communicating processes”.)

Cases

After symmetry and machine independence comes asymmetry. In Dijkstra's words:

Once we take the limited hardware facilities of the university's computer system into account we need to distinguish between `concrete machines' and `abstract machines'. [My paraphrased translation from Dutch, EWD 51]

Using modern terminology:

  • Dijkstra's “concrete machine” corresponds to a hardware task, and
  • Dijkstra's “abstract machine” corresponds to a software task.

Note thus that a “concrete machine” is not a refinement of an “abstract machine”. A “concrete machine” and an “abstract machine” are both refinements of a “machine”. (The concrete machine typically resembled a peripheral hardware device of the X8 computer.)

Dijkstra did not want to distinguish between slow abstract machines and fast abstract machines. Each abstract machine, Dijkstra emphasized, is either working or halting — and that is thus the only distinction we can make about any two abstract machines. (I believe Dijkstra made the same generalization for concrete machines in one of his later writings.)

Dijkstra then focused on the semaphore that stands in between a concrete machine and an abstract machine. The semaphore has an integer value t which we will store in the shared memory at a fixed memory location — a location that is known to both the concrete and the abstract machine. Again, in the interest of generality, Dijkstra noted that t can be positive, zero, or even negative. For as long as there is no practical reason to constrain t's range, Dijkstra stressed, we should keep it as broad as possible.

There is a complication, however. An integer t stored in memory is a passive entity and a semaphore has to be active in that it should inform its pending machines (if any) whenever its integer becomes positive. Therefore, Dijkstra said, we will add a flip-flop to the semaphore. Dijkstra denoted the value of the flip-flop — which can be either 0 or 1 — by T.

Let change_t and change_T denote change of the semaphore's integer t and flip-flop T, respectively. Dijkstra remarked that there is two ways to ensure that change_t and change_T occur in an orderly fashion, so that the semaphore's t and T values remain in sync with each other. One way (A) is to define the combination of change_t and change_T as one atomic action. Another way (B) is to retain change_t and change_T as separate actions but to guarantee that any temporal discrepancy between t and T does not cause any problems.

Whenever the concrete machine modifies the semaphore we will go for (A), Dijkstra remarked. The choice for (A) is natural here because the concrete machine (hardware) will have a higher priority than the abstract machine (software) when demanding control of the shared memory. Whenever the abstract machine modifies the semaphore we will go for (B).

The asymmetry between concrete and abstract machines thus gradually entered the picture, leading Dijkstra to introduce two case distinctions:

  1. The concrete machine invokes the P operation and the abstract machine invokes the V operation.
  2. The concrete machine invokes the V operation and the abstract machine invokes the P operation.

I will briefly discuss the first case distinction.

The P operation, implemented on the concrete machine (hardware), amounts to:

if T=1 then begin t:=t-1; if t≤0 then T:=0 end.

The V operation, implemented on the abstract machine (software), amounts to:

“t:=t+1;
if T=0 then
     begin if 0<t
            then T:=1 end

Dijkstra explained why these code snippets do the job, an explanation that I omit here.

“In practice”, Dijkstra continued, “the X8 forces us to deal with more complexities”. In the interest of reducing the communication between the abstract machine on the X8 and the concrete machine on a peripheral device, we will store the flip-flop T in the peripheral device. This design choice makes the query T=1 and the assignment T:=0 less costly. On the other hand, it makes the query T=0 more costly. Therefore, we will also cache a copy of T in the main memory, denoted by ST.

These considerations led Dijkstra to present refined code snippets of the P and V operations. The P operation on the concrete machine becomes:

if T=1 then begin t:=t-1; if t≤0 then ST:=T:=0 end.

The V operation on the abstract machine becomes:
“t:=t+1;
if ST=0 then
     begin if 0<t then
            begin ST:=1;
                       T:=1
            end
end

Dijkstra took more machine-specific concerns into account: he placed t and ST in the same word in main memory. This storage compaction led to refined code snippets (not shown here).

In summary, then, Dijkstra first reasoned machine independently and as symmetrically as possible. He gradually took machine-specific concerns into account, breaking symmetries only when compelled to do so. At the end of the day, he had a solution that was both simple and practical. Contrast this, then, with my previous observation that Dijkstra was not perceived as an engineer by some of his colleagues.

Tags: