Reverse Engineering The First Stored Program
On 21^{st} June 1948, the Manchester Baby, or Small-Scale Experimental Machine (SSEM) ran the world’s first electronic stored program. While primarily an experimental testbed for testing the feasability of a new type of memory, it was working proof and validation of a computing principle that underpins all devices today.
But what exactly did this first stored program look like and how did it work? What kind of machine was the Manchester Baby, and how did it influence computer architecture?
Some pictures of a replica of the machine and original location can be found: Visiting The Manchester Baby.
First Stored Program Computer
Other early computers, such as the ENIAC (first operational 1945) can be considered reprogrammable to an extent. As a general purpose machine, it could run different programs in order to perform different tasks. However, getting the computer to perform a different task required physically rewiring in order to create a different program – a tedious task. The instructions of the program were encoded in the different interconnections.
In comparison, the program instructions for the Manchester Baby (1948) are stored in the electronically in the computer’s memory. This memory also serves as storage for the program data. It is for this reason, the Baby is recognised as the fully electronic stored program computer.
This concept was first described by scientist John von Neumann in 1945 in First Draft of a Report on the EDVAC. Today, most computers follow the same principle by storing and executing program instructions from memory.
Manchester Baby Design & Architecture
How data should be stored, and what memory should look like was becoming an issue in early computers. Up until then, early computers were using delay-line memories – data was stored as acoustic waves propagating through a column of mercury. This had several drawbacks, including being expensive, being sensitive to external factors, and not possible to access at arbitrary locations (randomly access).
The Manchester Baby was primarily an experimental machine in order to test a new form of memory in a computer – the Williams / Williams-Kilburn tube. It relied on Cathode Ray Tube (CRT) technology – the same as the tubes used in older TVs. Patterns of static charges could be stored using the display, and later sensed. Three of these tubes were used in the Baby:
- as 32 x 32-bit words of RAM (the store),
- to hold a 32-bit accumulator register,
- to hold the current instruction address and current instruction being executed
A fourth CRT could display the contents of any of these three tubes. The logic was implemented by 550 vacuum tubes. It operated at a clock rate of roughly ~100kHz, while consuming 3500W of power. The Baby had no I/O devices – programs were entered manually, with switches allowing the modification of bits within each 32-bit word.
Values are encoded using two’s complement. Unlike modern computers, bits are described from left-to-right (i.e. so the least-significant bit is on the left) – see opcodes below.
Instruction Set
The Baby had a simple instruction set of 7 instructions. Instructions were encoded in bits 13-15 of a word, with the operand in bits 0-12.
Binary Opcode | Notation | Modern Mneomic | Description |
---|---|---|---|
000 | S, Cl | JMP S | Absolute jump to memory address stored in memory location S |
100 | Add S, Cl | JRP S | Relative jump to current address + value at memory location S |
010 | -S, C | LDN S | Load into the accumulator the negated value stored at S |
110 | c, S | STO S | Store accumulator value at memory location S |
001, 101 | SUB S | SUB S | Subtract from the accumulator the value at memory location S |
011 | Test | CMP | Skip the next instruction if accumulator value is negative |
111 | Stop | STP | Halt the machine |
The First Stored Program: Analysis
The first program to run, the “Kilburn Highest Factor Routine” takes two positive numbers, a
and b
, and attempts to find the highest factor smaller or equal to b
that divides a
. On the original run, the program found the highest factor of 2^{18} (= 262,144), giving the result 131,072.
The program iterates over all numbers from b_{1} (given as input) downwards. With each potential divisor, b_{n}, an inner loop computes the remainder r_{n} by repeatedly subtracting b_{n} from a. The first value b_{n} where r_{n} is 0, indicates b_{n} is a factor of a and the program subsequently halts, and the resulting value of b_{n} can be read from the store. Because it searches from high to low, this first found factor must be the highest factor. Below is the listing and analysis of the ‘amended’ program shown above with both the original notation, and modern assembly-like mnemonics.
Line | Original | Mnemonic | Description |
---|---|---|---|
Program | |||
Initialisation | |||
1 | –24 to C | LDN 24 | Load input: –b_{1} |
2 | c to 26 | STO 26 | Store –b_{1} in S26 |
3 | –S26 to C | LDN S26 | Load b_{1} |
4 | c to 27 | STO 27 | Store b_{1} in S27 |
Calculate Remainder | |||
5 | –23 to C | LDN 23 | Load a |
6 | sub 27 | SUB 27 | Subtract b_{n} |
7 | test | CMP | Skip the next instruction if C is negative, continue from line 9 |
8 | add 20 to Cl | JRP 20 | (if C >= 0), then jump back S20 (= 3) number of instructions to line 6, to repeatedly subtract b_{n} |
9 | sub 26 | SUB 26 | The loop above will 'undershoot' by a single subtraction. Subtracting –b_{n} here is the same as adding b_{n}. The value of C will be 0 if b_{n} divides a, otherwise the positive remainder. |
Test Remainder | |||
10 | c to 25 | STO 25 | Store r_{n} in S25 |
11 | –25 to C | LDN 25 | Load –r_{n} |
12 | test | CMP | Skip the next instruction if C is negative, continue from line 14 |
13 | stop | HLT | (if C >= 0) then halt, the result b_{n} is in S27. Note that because of the way the remainder is calculated, the program will only halt if C is exactly 0, hence b_{n} divides a exactly |
Calculate next Divisor | |||
14 | –26 to C | LDN 26 | Load b_{n} |
15 | sub 21 | SUB 21 | Decrement b_{n} by one to generate b_{n+1} |
16 | c to 27 | STO 27 | Store b_{n+1} in S27 |
17 | –27 to C | LDN 27 | Load –b_{n+1} |
18 | c to 26 | STO 26 | Store –b_{n+1} in S26 |
19 | 22 to Cl | JMP 22 | Jump to line stored in S22 (= 4) to line 5 |
Constants | |||
20 | –3 | Relative jump offset used in inner loop, line 8 | |
21 | 1 | Used in arithmetic, decrement in line 15 | |
22 | 4 | Absolute jump offset used in outermost loop, line 19 | |
Inputs | |||
23 | –a | The negation of a, the number to find the highest factor of. On the first run of the program, this was 2^{18} = 262144, so –262144 was stored here | |
24 | b_{1} | The first value of b to test as a potential factor of a. This was originally (a – 1) = 262143. | |
Output & Intermediate Values | |||
25 | r_{n} | The working remainder is stored temporarily here, in order to negate for the CMP test. On termination, this value will be 0 | |
26 | –b_{n} | The negation of the working divisor currently being tested. Also the final solution negated on termination | |
27 | b_{n} | The divisor currently being tested. Also the final solution on termination | |
Unused | |||
28-32 | Unused |
Some of the choices have interesting motivations. For example, both an absolute and relative jump are used (when two of either would have been also sufficient) in order to have a program utilising every instruction (line 8 and line 19). –a is given as input and used as a workaround for lack of a non-negative load instruction on line 5. Similarly, both b_{n} and –b_{n} are maintained throughout the calculation in part to order to overcome the lack of an addition instruction. The positive b_{n} is used in line 6 in the repeated subtraction, and –b_{n} in order to add b_{n} by subtracting the negation on line 9.
It should be noted this is an amended version of the program dated a month after the original run in July 1948. This has a few small changes and optimisations from the original. How exactly this differs is discussed in The Original Original Program by Geoff Tootill, one of the scientists working on the Baby at the time, and from whose notebook the amended version was also documented in.
Working Backwards: Recreating the Program
Starting from an implementation in a modern higher-level language, could we create an equivalent program? Let’s try explore by working backwards and making modifications in stages. At each step we’ll annotate the statments that correspond with a Baby instruction. We’ll start with a similar algorithm that starts from b
descending, and tests every value to see if the remainder of a
divided by b
is 0.
As we know calculating remainders using a modulo operation directly won’t be possible, we’ll instead calculate the remainder by repeatedly subtracting b
in a loop. We’ll use c
as the accumulator:
int first(int a, int b) {
for (b; b >= 0; b--) {
int c = a;
while (c > 0) {
c -= b; // 6: SUB b
}
if (c == 0) {
return b; // 13: STP
}
}
}
c -= ?
is a useful form, as it corresponds directly to the SUB
instruction. Next, let’s unfold the loops into (guarded) goto statements (it does not matter that the first essentially becomes “do-while”). These will eventually translate to JMP
or JRP
instructions guarded by a CMP
. Additionally, we will decrement b
via the accumulator, as needs to be done on the Baby.
int second(int a, int b) {
int c;
L1:
c = a;
L2:
c -= b; // 6: SUB b
if (c > 0)
goto L2; // 8: JRP L2
if (c == 0)
return b; // 13: STP
c = b;
c -= 1;
b = c;
goto L1; // 19: JMP L1
}
The CMP
instruction only allows us to test if c
is negative – so c <= 0
needs to be changed. In order for the inner loop to work still, we can repeatedly subtract until c < 0
. This will result in subtracting b
one too many times but we can solve this by adding b
after the end of the loop. c
will then contain the remainder.
To test if the remainder is 0, we can store to a temporary location r
and reload the negation with LDN
(corresponds to the statement c = -?
). Any negative value once -r
is loaded in c
indicates b
does not divide a
. The only alternative is if -r = r = 0
, in which case we stop.
int third(int a, int b) {
int c = 0, r = 0;
L1:
c = a;
L2:
c -= b; // 6: SUB b
if (!(c < 0)) // 7: CMP
goto L2; // 8: JRP L2
c += b;
r = c; // 10: STO r
c = -r; // 11: LDN r
if (!(c < 0)) // 12: CMP
return b; // 13: STP
c = b;
c -= 1;
b = c;
goto L1; // 19: JMP L1
}
Statements in the form c = ?
are problematic as the load instruction, LDN
, loads the negation of a store location. If we introduce / or instead take the negation of a as input (na
), we could instead have c = -na
. Additionally, we cannot perform addition(!) so we will maintain a copy of negative b, nb
, so we can add b
to c
by doing instead c -= nb
.
int fourth(int na, int b) {
int c = 0, nb = 0, r = 0, ONE = 1;
c = -b; // 1: LDN b
nb = c; // 2: STO nb
L1:
c = -na; // 5: LDN na
L2:
c -= b; // 6: SUB b
if (!(c < 0)) // 7: CMP
goto L2; // 8: JRP L2
c -= nb; // 9: SUB nb
r = c; // 10: STO r
c = -r; // 11: LDN r
if (!(c < 0)) // 12: CMP
return b; // 13: STP
c = -nb; // 14: LDN nb
c -= ONE; // 15: SUB ONE
b = c; // 16: STO b
c = -b; // 17: LDN nb
nb = c; // 18: STO nb
goto L1; // 19: JMP L1
}
We have ended up with almost the same sequence of instructions as in Kilburn’s amended routine.
Legacy
Ultimately, the Baby was successful in demonstrating the use of Williams tubes as a form of computer memory, as well as the stored program concept. It led to the development of the more sophisticated Manchester Mark 1 (1948-9), and first commercial digital computer, the Ferranti Mark 1 (1951).
A nice downloadable simulator of the machine can be found at https://www.davidsharp.com/baby/.