Showing posts with label computer science. Show all posts
Showing posts with label computer science. Show all posts

5 August 2025

Universal Turing Machine

Turing Machine in Bletchley Park (licensed CC BY-NC-ND by Douglas Hoyt)The more I practice, the more I explore extreme situations. I have tried to go faster. Other times I Programmed with Nothing. Maybe it is the difficulty of only using low level constructs that makes these group of coding constraints appealing to me - which makes coding in assembly still interesting to some people, too. Then I tried to create Fizz Buzz using a Turing machine. I guess I lacked the purity of Computer Science university exercises. I will dive into Turing machines from a practical coding perspective using Java here.

Turing Machine
What is a Turing machine (TM)? First the TM is a though experiment. It was not meant to be practical. It is "the thing with the tape where a read-write head is able to read a zero or a one, write a zero or a one, erase the value, and move left or right." It is supposed to be Turing complete, which means it is equivalent to all programming languages today. That was my understanding of a TM when I started. But that understanding was impractical. Is a TM a program or a set of programs or a programming language? How can it be a specific program and Turing complete at the same time? And how could I implement anything like that?

Basic (Java) Building Blocks
First I need the building blocks. There is a set of symbols, maybe only 0 or 1 (a bit) or maybe a character to make things easier. So an empty marker interface Symbol should do. (The interface is optional, I like to make the symbol different from plain Object. I could use a plain char as well.) Then there is the tape which can be read and written to. The Tape class represents the tape as a data structure, such as an array or a list. Each cell of the tape holds a symbol.
class Tape<SYM extends Symbol> {

  private Map<Integer, SYM> cells = new TreeMap<>();
  private SYM defaultSymbol;
  private int headPosition = 0;

  Tape(List<SYM> symbols, SYM defaultValue) {
    for (int i = 0; i < symbols.size(); i++) {
      this.cells.put(i, symbols.get(i));
    }
    this.defaultSymbol = defaultValue;
  }

  SYM read() {
    return cells.getOrDefault(headPosition, defaultSymbol);
  }

  void write(SYM symbol) {
    cells.put(headPosition, symbol);
  }

  void moveHead(Direction direction) {
    switch (direction) {
    case LEFT:
      headPosition--;
      break;
    case NONE:
      break;
    case RIGHT:
      headPosition++;
      break;
    default:
      throw new IllegalArgumentException(direction.name());
    }
  }
}
The tape could be infinite, and I chose a Map for each symbol by its position. The default symbol is used for tape positions that have not been written to. The constructor accepts the initial content of the tape as the starting state of the machine. Now to the Turing machine:
class TuringMachine<SYM extends Symbol> {

  private Tape<SYM> tape;
  private State state;

  TuringMachine(Tape<SYM> tape, State initialState) {
    this.tape = tape;
    this.state = initialState;
  }

  private void setState(State state) {
    this.state = state;
  }

  private SYM read() {
    return tape.read();
  }

  private void write(SYM symbol) {
    tape.write(symbol);
  }

  private void move(Direction direction) {
    tape.moveHead(direction);
  }

  void loop() {
    while (!state.isTerminal()) {
      act();
    }
  }

  // ...
Following the description of a Turing machine, it has a state which can change. It can read the symbol currently present on the tape at the position of its read/write head. It can write a new symbol on the tape at the position of its read/write head, replacing the symbol that was previously there. And it can move its read/write head one position to the left or right along the tape. The machine runs as long as its state is not final. In each step (method act shown below) the machine reads the symbol present in the tape's cell, and based on the current state and the read symbol, it determines the next state and the symbol to write on the tape, then it moves the head left or right by one cell.
class TuringMachine<SYM extends Symbol> {

  // ...

  private void act() {
    SYM symbol = read();

    // transition to nextState determines
    // symbolToWrite and directionToMove

    setState(nextState);
    write(symbolToWrite);
    move(directionToMove);
  }

}
Of course this was not my initial code. I reached it by refactoring, cleaning up and structuring the code again and again. If this is a Turing machine, where is the actual program? The Tape and TuringMachine classes form a Universal Turing Machine (UTM). The UTM can simulate the behaviour of any Turing machine. It takes the description of any arbitrary Turing machine as input and mimics its operation on a given input string. Aha! My infrastructure can run an arbitrary Turing machine which makes each TM an individual program and the UTM the interpreter. Maybe I should have read the whole article on Wikipedia first.

Turing machine (licensed CC BY by Maria Keays)State Transitions
The actual program of a Turing machine is the set of its states and possible state transitions. It is a state machine, i.e. a graph of state nodes, and each edge between two states contains additional data: The Symbol symbolToWrite and the Direction directionToMove. In text, the easiest way to define this state machine is with a transition table of State state, Symbol symbol, State newState, Symbol newSymbol, Direction direction which is a TransitionTableRow in my TransitionTable implementation. (The whole transition logic is too much code to show here, the working solution is on my GitHub.) As I will create many transitions, I add defaults to the tables, e.g. state == null applies to all states, symbol == null applies to all symbols and newState == null or newSymbol == null does not change the current state or symbol. I want to avoid defining more table rows than strictly necessary.

A Simple Example
Now is the perfect time for an example. We want to set all bits of a binary number, e.g. 011010. The symbols are the bits 0 and 1 and a marker for the end of the input,
enum Bit implements Symbol {
  _0, _1, END_OF_TAPE
}
Starting at the left-most / highest bit. The transitions are
  • If the machine is running and the current bit is 0, then keep running, write a 1 and move right.
  • If the machine is running and the current bit is 1, then keep running, write a 1 and move right.
  • If the machine is running and the input has ended, then stop running (halt).
This is a simple machine and there are only two states: running and stopped,
enum S implements State {
  RUNNING, HALT;

  @Override
  public boolean isTerminal() { return this == HALT; }
}
and the whole things runs in this test,
class SimpleExampleTest {

  @Test
  void replaces0With1() {
    S initialState = S.RUNNING;
    var _011010 = Arrays.asList(_0, _1, _1, _0, _1, _0, END_OF_TAPE);
    var tape = new Tape<>(_011010, END_OF_TAPE);

    TransitionTable transitions = new TransitionTable().
        row(S.RUNNING, _0, null, _1, Direction.RIGHT).
        row(S.RUNNING, _1, null, null, Direction.RIGHT).
        row(S.RUNNING, END_OF_TAPE, S.HALT, null, Direction.NONE);

    var machine = new TuringMachine<>(tape, transitions, initialState);

    machine.loop();

    var expected_111111 = Arrays.asList(_1, _1, _1, _1, _1, _1, END_OF_TAPE);
    var actualResult = tape.getCells();
    assertEquals(expected_111111, actualResult);
  }
}
Now I know what a Turing machine is and how it is supposed to work. Using the Constructive Approach I created an UTM (TuringMachine.java) and an instance of a TM (in SimpleExampleTest.java). Now I can play around and see what is possible using only state transitions... I encourage you to do the same, clone my fizzbuzz-turing-machine repository and play around with the code in the universal package. This concludes the first part - the foundation if you like - of me implementing Fizz Buzz using a Turing machine.

21 June 2020

Conway's Squasher Coroutine

Since some time I am researching coroutines. Wikipedia says that a coroutine is a computer program component that generalises subroutines, allowing execution to be suspended and resumed. [...] Subroutines are special cases of coroutines. When subroutines are invoked, execution begins at the start, and once a subroutine exits, it is finished; an instance of a subroutine only returns once, and does not hold state between invocations. By contrast, coroutines can exit at any time, and execution may later return to these points. [...] A coroutine instance holds state, and varies between invocations. Coroutines are useful for many things. For example the Kotlin programming language lists several uses cases including asynchronous computations, generators and implementing state machines in a sequential way. More and more languages provide them.

The Paper
According to Donald Knuth, Melvin E. Conway coined the term coroutine in 1958 when he applied it to the construction of an assembly program. He first published it 1963 in his paper Design of separable transition-diagram compiler. Conway states that the coroutine idea was concurrently developed by him and by Joel Erdwinn. The paper is about implementing a COBOL compiler, the first two pages introduce the coroutine together with its implementation in assembly for the Burroughs model 220, a late vacuum-tube computer.

Code Example
Conway shows an example, the asterisk squasher subroutine, in both subroutine and coroutine form. The mentioned assembly commands are unknown to me and the flowcharts are not 100% readable, so I decided to reimplement the example in modern assembly to see what is going on. The asterisk squasher is a program which reads cards and writes the string of characters it finds, column 1 of card 1 to column 80 of card 1, then column 1 of card 2, and so on, with the following wrinkle: every time there are adjacent asterisks they will be paired off from the left and each "**" will be replaced by the single character "^". This operation is done with the exponentiation operator of FORTRAN and COBOL.

Asterisk Squasher Subroutine
The problem is that out - the output area of the SQUASHER routine - only holds a single byte, which means the routine can only return and subsequentially write a single character. When looking for the second asterisk, the routine reads a second element and thus needs somehow to return it on the next call unless it was a "*". Conway continues that the simple requirement of compressing asterisks requires the introduction of a switch which bifurcates the subroutine and selects the part to be used at each call depending on the situation at the last call. The reason for the switch is that each call of the subroutine must result in output of exactly one character. The second read character is stored in a (module) global variable t2 until the next call. Here is its flowchart:

Figure 1: Asterisk squasher subroutine with environment
I use my trusted Windows assembly setup for Intel IA-32 assembly. The assembly is in NASM format. First define the data:
card_len equ    80

switch: resd    1
%define ON      dword 1
%define OFF     dword 0

i:      resd    1
card:   resb    card_len

t1:     resd    1
t2:     resd    1

out:    resd    1
Let's warm up with the code for RDCRD, which reads 80 characters of a card. The code strictly follows the flowchart above.
RDCRD:
        mov     eax, [i]
        cmp     eax, card_len
        jne     .exit

        mov     [i], dword 0

        ; Omitted using Windows system library functions
        ; _GetStdHandle@4 and _ReadFile@20 to read 80
        ; characters into [card].

.exit:
        ret
The main method is SQUASHER with the switch variable.
SQUASHER:
        mov     eax, [switch]
        cmp     eax, OFF
        je      .off
.on:
        mov     eax, [t2]
        mov     [out], eax

        mov     [switch], OFF
        jmp     .exit

.off:
        call    RDCRD

        mov     esi, [i]
        xor     eax, eax
        mov     al, [card + esi]
        mov     [t1], eax

        inc     esi
        mov     [i], esi

        mov     eax, [t1]               ; redundant, value still in register
        cmp     eax, '*'
        jne     .not_equal_ast

.equal_ast:
        call    RDCRD

        mov     esi, [i]                ; redundant, value still in register
        xor     eax, eax
        mov     al, [card + esi]
        mov     [t2], eax

        inc     esi
        mov     [i], esi

        mov     eax, [t2]               ; redundant, value still in register
        cmp     eax, '*'
        jne     .not_equal_second_ast

.equal_second_ast:
        mov     [t1], dword '^'
        jmp     .not_equal_ast

.not_equal_second_ast:
        mov     [switch], ON
        jmp     .not_equal_ast

.not_equal_ast:                         ; label 1
        mov     eax, [t1]
        mov     [out], eax

.exit:
        ret
There is nothing special happening in the code above. There are the two main branches .on and .off. To run the whole program on a single test card I need a WRITE and a main method. WRITE iterates a card and prints the result of each call to SQUASHER to standard out.
WRITE:
.loop:
        call    SQUASHER

        ; Omitted using Windows system library functions
        ; _GetStdHandle@4 and _WriteFile@20 to write 1
        ; character from [out].

        mov     eax, [i]
        cmp     eax, card_len
        jne     .loop

        ret

; ----------------------------------------
        global  _main
_main:
        ; set up switch
        mov     [switch], OFF

        ; set up global data
        mov     [i], dword card_len

        call    WRITE

        push    0
        call    _ExitProcess@4
Asterisk Squasher as Coroutine
Conway writes further in his paper: The coroutine approach to the same problem accomplishes the switching job implicitly by use of the subroutine calling sequence. When coroutines A and B are connected so that A sends items to B, B runs for a while until it encounters a read command, which means it needs something from A. Then control is transferred to A until it wants to "write," whereupon control is returned to B at the point where it left off. In this situation A is the SQUASHER and B is the WRITE coroutine. The following flowchart shows SQUASHER when both it and the using program (WRITE) are coroutines.

Figure 2: Asterisk squasher as coroutine
Coroutine in Assembly?
While some architectures provide support for coroutines, according to the Art of Assembly Language 80x86 CPUs do not provide a COCALL instruction. Conway describes the coroutine linkage on the Burroughs 220 as follows: The UNCONDITIONAL BRANCH instruction BUN A works by placing its address A into the P-register (the Instruction Pointer). The STORE P instruction STP B places the contents of P plus one into the address part of the contents of location B. The standard subroutine call is
1:      STP EXIT                        ; store 3 into target address of address EXIT
2:      BUN ENTRANCE                    ; jump to address ENTRANCE
3:
where location EXIT contains a BUN instruction whose address will be altered by the STP instruction in the call. (So this is self-modifying code and the return of the subroutine is then EXIT: BUN 3 or something like that.) A pair of subroutines becomes a pair of coroutines by adding to each an isolated BUN instruction which we can call its router, and by changing the addresses of the STP and BUN calls as follows: when coroutine A calls coroutine B the call is
        STP AROUTER
        BUN BROUTER
That seems a small change for the Burroughs but how could I recreate that?

Coroutine in Assembly!
The BUN is a JMP on Intel. Unfortunately not all instructions are one byte, so using the current IP + 1 as done by STP does not work. I tried to follow the original idea with a short macro. When coroutine A calls coroutine B the call is coro A B,
%macro coro 2
        mov     ebx, %%_cnt             ; basically IP + 1
        mov     [route_%1], ebx         ; store it in target of router A
        jmp     %2                      ; jump to router B
%%_cnt: nop
%endmacro

route_SQUASHER: resd 1
route_WRITE:    resd 1

SQUASHER:
        jmp     [route_SQUASHER]

WRITE:
        jmp     [route_WRITE]
This needs storage for the route of each coroutine. The call of the macro does not look so good as it needs the name of the current coroutine as first argument for the macro to use the right route storage. If each coroutine would be its own module, I could drop that and just use [route]. The router variables [route_*] will be initialised at startup. The code of the two routines SQUASHER and WRITE changes slightly, strictly following the second flowchart above.
; router
SQUASHER:
        jmp     [route_SQUASHER]

; coroutine
SQUASHER_CORO:                          ; label 1
        call    RDCRD

        mov     esi, [i]
        xor     eax, eax
        mov     al, [card + esi]
        mov     [t1], eax

        inc     esi
        mov     [i], esi

        mov     eax, [t1]               ; redundant, value still in register
        cmp     eax, '*'
        jne     .not_equal_ast

.equal_ast:
        call    RDCRD

        mov     esi, [i]                ; redundant, value still in register
        xor     eax, eax
        mov     al, [card + esi]
        mov     [t2], eax

        inc     esi
        mov     [i], esi

        mov     eax, [t2]               ; redundant, value still in register
        cmp     eax, '*'
        jne     .not_equal_second_ast

.equal_second_ast:
        mov     [t1], dword '^'
        jmp     .not_equal_ast

.not_equal_second_ast:
        mov     eax, [t1]
        mov     [out], eax
        coro    SQUASHER, WRITE

        mov     eax, [t2]
        mov     [out], eax
        coro    SQUASHER, WRITE

        jmp     SQUASHER_CORO

.not_equal_ast:                         ; label 2
        mov     eax, [t1]
        mov     [out], eax
        coro    SQUASHER, WRITE

        jmp     SQUASHER_CORO

; ----------------------------------------
; router
WRITE:
        jmp     [route_WRITE]

; coroutine
WRITE_CORO:
.loop:
        coro    WRITE, SQUASHER

        ; Omitted using Windows system library functions
        ; _GetStdHandle@4 and _WriteFile@20 to write 1
        ; character from [out].

        mov     eax, [i]
        cmp     eax, card_len
        jne     .loop

        ret

; ----------------------------------------
        global  _main
_main:
        ; set up coroutine routers
        mov     ebx, SQUASHER_CORO
        mov     [route_SQUASHER], ebx

        mov     ebx, WRITE_CORO
        mov     [route_WRITE], ebx

        ; set up global data
        mov     [i], dword card_len

        call    WRITE

        push    0
        call    _ExitProcess@4

Conslusion
I hope that clarifies how the first coroutine presented in 1963 worked. Re-coding the example was a lot of work and in the end it paid off because everything is clear now, even more so after I wrote this blog post about it. This is the golden triangle of learning: theory (e.g. reading about it), practice (e.g. creating it) and finally teaching (documenting and sharing it).

Full Source Code
After I published this article Dmitry Kandalov pushed me to make the code available on GitHub. And then we went to port it to macOS, so he could play with it as well. To compile and run the code you need GNU make, NASM and a linker depending to your platform. Thanks to Samir Talwar's Smoke, it even comes with tests. Give it a try: Conway's Squasher Coroutine @ GitHub.