# Specifying Concurrent Systems with TLA ${ }^{+}$ 

Leslie Lamport<br>Compaq

23 April 1999

Appeared in Calculational System Design. M. Broy and R. Steinbrüggen, editors. IOS Press, Amsterdam, (1999), 183-247.

## Contents

1 Introduction ..... 183
2 A Little Simple Math ..... 184
2.1 Propositional Logic ..... 184
2.2 Sets ..... 186
2.3 Predicate Logic ..... 186
3 Specifying a Simple Clock ..... 188
3.1 Behaviors ..... 188
3.2 An Hour Clock ..... 188
3.3 A Closer Look at the Hour-Clock Specification ..... 190
3.4 The Hour-Clock Specification in TLA ${ }^{+}$ ..... 190
3.5 Another Way to Specify the Hour Clock ..... 192
4 An Asynchronous Interface ..... 193
4.1 The First Specification ..... 194
4.2 Another Specification ..... 196
4.3 Types: A Reminder ..... 198
4.4 Definitions ..... 198
4.5 Comments ..... 199
5 A FIFO ..... 201
5.1 The Inner Specification ..... 202
5.2 Instantiation Examined ..... 203
5.2.1 Parameterized Instantiation ..... 203
5.2.2 Implicit Substitutions ..... 205
5.2.3 Instantiation Without Renaming ..... 205
5.3 Hiding the Queue ..... 205
5.4 A Bounded FIFO ..... 206
5.5 What We're Specifying ..... 207
6 A Caching Memory ..... 208
6.1 The Memory Interface ..... 208
6.2 Functions ..... 211
6.3 A Linearizable Memory Specification ..... 212
6.4 Tuples as Functions ..... 213
6.5 Recursive Function Definitions ..... 213
6.6 A Write-Through Cache ..... 215
6.7 Invariance ..... 219
6.8 Proving Implementation ..... 220
7 Some More Math ..... 222
7.1 Sets ..... 222
7.2 Silliness ..... 223
7.3 Recursive Function Definitions Revisited ..... 224
7.4 Functions versus Operators ..... 225
7.5 Using Functions ..... 227
7.6 Choose ..... 227
8 Liveness and Fairness ..... 228
8.1 Temporal Formulas ..... 228
8.2 Weak Fairness ..... 231
8.3 Liveness for the Memory Specification ..... 233
8.4 Strong Fairness ..... 234
8.5 Liveness for the Write-Through Cache ..... 235
8.6 Quantification ..... 237
8.7 Temporal Logic Examined ..... 238
8.7.1 A Review ..... 238
8.7.2 Machine Closure ..... 238
8.7.3 The Unimportance of Liveness ..... 239
8.7.4 Temporal Logic Considered Confusing ..... 239
9 Writing a Specification-Some Advice ..... 240
References ..... 244

# Specifying Concurrent Systems with TLA ${ }^{+}$ 

Leslie LAMPORT<br>Compaq

## 1 Introduction

Writing a specification for a system helps us understand it. It's a good idea to understand something before building it, so it's a good idea to specify a system before implementing it. Specifications written in an imprecise language like English are usually imprecise. In engineering, imprecision is an invitation to error. Science and engineering have adopted mathematics as a language for writing precise descriptions.

The mathematics written by most mathematicians and scientists is still imprecise. Most mathematics texts are precise in the small, but imprecise in the large. Each equation is a precise assertion, but you have to read the text to understand how the equations relate to one another and what the theorems really mean. Logicians have developed ways of eliminating the words and formalizing mathematics. Most mathematicians and computer scientists think that writing mathematics formally, without words, is tiresome. Once you learn how, I hope you'll find that it's easy to express ordinary mathematics in a precise, completely formal language.

To specify systems with mathematics, we must decide what kind of mathematics to use. We can specify an ordinary sequential program by describing its output as a function of its input. So, sequential programs can be specified in terms of functions. Concurrent systems are usually described in terms of their behaviors-what they do in the course of an execution. In 1977, Amir Pnueli introduced the use of temporal logic for describing such behaviors [12].

Temporal logic is appealing because, in principle, it allows a concurrent system to be described by a single formula. In practice, temporal logic proved to be cumbersome. Pnueli's temporal logic was ideal for describing some properties of systems, but awkward for others. So, it was usually combined with some more traditional way of describing systems.

In the late 1980's, I discovered TLA, the Temporal Logic of Actions. TLA is a simple variant of Pnueli's original logic that makes it practical to write a specification as a single formula. Most of a TLA specification consists of ordinary, nontemporal mathematics. Temporal logic plays a significant role only in describing those properties that it's good at describing. TLA also provides a nice way to formalize the style of reasoning about systems that has proved to be most effective in practice - a style known as assertional reasoning. However, the topic of this chapter is specification, not proof, so I will have little to say about proofs.

TLA provides a mathematical foundation for describing concurrent systems. To write specifications, we need a complete language built atop that foundation. Although mathematicians have developed the science of writing formulas, they haven't turned that science into an engineering discipline. They have developed notations for mathematics in the small, but not for mathematics in the large. The specification of a real system can be dozens or even hundreds of pages long. Mathematicians know how to write 20 -line formulas, not 20 page formulas. So, I had to introduce notations for writing long formulas. I also took from programming languages some ideas for modularizing large specifications.

The language I came up with is called TLA ${ }^{+}$. I refined TLA ${ }^{+}$in the course of writing specifications of disparate systems. But, it has changed little in the last few years. I have found TLA ${ }^{+}$to be quite good for specifying a wide class of systems-from program interfaces
(api's) to distributed systems. It can be used to write a precise, formal description of almost any sort of discrete system. It's especially well suited to describing asynchronous systemsthat is, systems with components that do not operate in strict lock-step.

One advantage of a precise specification language is that it enables us to build tools that can help us write correct specifications. Among the tools currently under development are a parser and a model checker. The parser can catch simple errors in any TLA ${ }^{+}$specification. The model checker can catch many more errors, but it will work on a restricted class of specifications - a class that should include a large fraction of the specifications of interest to industry today. These tools will be described elsewhere; see the TLA Web page for the latest information [7].

A system specification consists of a lot of ordinary mathematics glued together with a little bit of temporal logic. So, most of the work in writing a precise specification consists of expressing ordinary mathematics precisely. That's why most of the details of TLA ${ }^{+}$are concerned with ordinary mathematics.

Unfortunately, the computer science departments in many universities apparently believe that fluency in C++ is more important than a sound education in elementary mathematics. So, some readers may be unfamiliar with the mathematics needed to write specifications. Fortunately, this mathematics is quite simple. If overexposure to $\mathrm{C}++$ hasn't destroyed your ability to think logically, you should have no trouble filling any gaps in your mathematics education. You probably learned arithmetic before being exposed to C++, so I will assume you know about numbers and arithmetic operations on them. ${ }^{2}$ I will try to explain all other mathematical concepts that you need, starting in Section 2 with a review of some elementary math. I hope most readers will find this review completely unnecessary. However, you should at least glance at it, since it introduces some notation that is used later.

After the brief review of simple mathematics in the next section, Sections 3 through 6 describe TLA ${ }^{+}$with a sequence of examples. Section 7 explains some more about the math used in writing specifications. By the time you finish Section 7, you should be able to specify the class of properties known as safety properties. Specifying safety properties requires almost no temporal logic. Temporal logic comes to the fore in Section 8, where it is used to specify the additional class of properties known as liveness properties. In practice, specifications are written to help detect errors - hopefully, before the system is built. Experience indicates that specifying safety properties catches many more errors than specifying liveness properties. So, you may decide not to bother reading Section 8. Finally, Section 9 reviews what we've done and provides some practical advice. At the end of the chapter is some useful reference material: Figures 12 and 13 list all the built-in operators of TLA ${ }^{+}$, Figure 14 lists the user-definable operator symbols, and Figure 15 shows the ASCII representations of all TLA ${ }^{+}$symbols.

## 2 A Little Simple Math

### 2.1 Propositional Logic

Elementary algebra is the mathematics of real numbers and the operators,,$+- *$ (multiplication), and / (division). Propositional logic is the mathematics of the two Boolean values TRUE and FALSE and the five operators whose names (and common pronunciations) are:

```
 conjunction (and) 
\checkmark ~ \text { disjunction (or) } \quad \Rightarrow \text { implication (implies)}
 equivalence (is equivalent to)
```

To learn how to compute with numbers, you had to memorize addition and multiplication tables and algorithms for calculating with multidigit numbers. Propositional logic is much

[^0]simpler, since there are only two values, TRUE and false. To learn how to compute with these values, all you need to know are the following definitions of the five Boolean operators:
\[

$$
\begin{aligned}
& \wedge F \wedge G \text { equals true iff }{ }^{3} \text { both } F \text { and } G \text { equal true. } \\
& \vee F \vee G \text { equals true iff } F \text { or } G \text { equals true (or both do). }{ }^{4} \\
& \neg \neg F \text { equals true iff } F \text { equals False. } \\
& \Rightarrow F \Rightarrow G \text { equals True iff } F \text { equals FalSe or } G \text { equals True (or both). } \\
& \equiv F \equiv G \text { equals True iff } F \text { and } G \text { both equal True or both equal FALSE. }
\end{aligned}
$$
\]

We can also describe these operators by truth tables. This truth table for $F \Rightarrow G$ gives its value for all four combinations of truth values of $F$ and $G$ :

| $F$ | $G$ | $F \Rightarrow G$ |
| :--- | :--- | :--- |
| TRUE | TRUE | TRUE |
| TRUE | FALSE | FALSE |
| FALSE | TRUE | TRUE |
| FALSE | FALSE | TRUE |

People are often confused about why $\Rightarrow$ means implication. In particular, they don't understand why FALSE $\Rightarrow$ TRUE and FALSE $\Rightarrow$ FALSE should equal TRUE. The explanation is simple. We expect that if $n$ is greater than 3 then it should be greater than 1 , so $n>3$ should imply $n>1$. Substituting 4, 2 , and 0 for $n$ in the formula $(n>3) \Rightarrow(n>0)$ explains why we can $\operatorname{read} F \Rightarrow G$ as $F$ implies $G$ or, equivalently, as if $F$ then $G$.

The equivalence operator $\equiv$ is equality for Booleans. We can replace $\equiv$ by $=$, but not vice versa. (We can write FALSE $=\neg$ TRUE, but not $2+2 \equiv 4$.) Writing $\equiv$ instead of $=$ makes it clear that the equal expressions are Booleans. ${ }^{5}$

Formulas of propositional logic are made up of values, operators, variables, and parentheses just like those of algebra. In algebraic formulas, $*$ has higher precedence (binds more tightly) than + , so $x+y * z$ means $x+(y * z)$. Similarly, $\neg$ has higher precedence than $\wedge$ and $\vee$, which have higher precedence than $\Rightarrow$ and $\equiv$, so $\neg F \wedge G \Rightarrow H$ means $((\neg F) \wedge G) \Rightarrow H$. Other mathematical operators like + and $>$ have higher precedence than the operators of propositional logic, so $n>0 \Rightarrow n-1 \geq 0$ means $(n>0) \Rightarrow(n-1 \geq 0)$. Redundant parentheses can't hurt and often make a formula easier to read. If you have any doubt about whether parentheses are needed, use them.

The operators $\wedge$ and $\vee$ are associative, just like + and $*$. Associativity of + means that $x+(y+z)$ equals $(x+y)+z$, so we can write $x+y+z$ without parentheses. Similarly, associativity of $\wedge$ and $\vee$ lets us write $F \wedge G \wedge H$ or $F \vee G \vee H$.

A tautology of propositional logic is a formula like $(F \Rightarrow G) \equiv(\neg F \vee G)$ that is true for all possible truth values of its variables. One can prove all tautologies from a few simple axioms and rules. However, that would be like computing $437+256$ from the axioms of arithmetic. It's much easier to verify that a simple formula is a tautology by writing its truth table - that is, by directly calculating the value of the formula for all possible truth values of its components. The formula is a tautology iff it equals TRUE for all these values. To construct the truth table

[^1]for a formula, we construct the truth table for all its subformulas. For example, the following truth table shows that $(F \Rightarrow G) \equiv(\neg F \vee G)$ is indeed a tautology.

| $F$ | $G$ |  | $F \Rightarrow G \Rightarrow$ |  | $\neg F$ |  | $\neg F \vee G$ |  | $(F \Rightarrow G) \equiv \neg F \vee G$ |
| :--- | :--- | :--- | :--- | :--- | :--- | :---: | :---: | :---: | :---: |
| TRUE | TRUE | TRUE | FALSE | TRUE | TRUE |  |  |  |  |
| TRUE | FALSE | FALSE | FALSE | FALSE | TRUE |  |  |  |  |
| FALSE | TRUE | TRUE | TRUE | TRUE | TRUE |  |  |  |  |
| FALSE | FALSE | TRUE | TRUE | TRUE | TRUE |  |  |  |  |

Propositional logic is the basis of all mathematical reasoning. It should be as familiar to you as simple algebra. Checking that $\neg(F \vee G) \equiv \neg F \wedge \neg G$ is a tautology should be as easy as checking that $2 *(x+3 * y)$ equals $2 * x+6 * y$.

Although propositional logic is simple, complex propositional formulas can get confusing. You may find yourself trying to simplify some formula and not being sure if the simplified version means the same thing as the original. There exist a number of programs for verifying propositional logic tautologies; some of them can be found and used on the World Wide Web.

### 2.2 Sets

Set theory is the foundation of ordinary mathematics. A set is often described as a collection of elements, but saying that a set is a collection doesn't explain very much. The concept of set is so fundamental that we don't try to define it. We take as undefined concepts the notion of a set and the relation $\in$, where $x \in S$ means that $x$ is an element of $S$. We often say is in instead of is an element of.

A set can have a finite or infinite number of elements. The set of all natural numbers ( 0 , 1,2 , etc.) is an infinite set. The set of all natural numbers less than 3 is finite, and contains the three elements 0,1 , and 2 . We can write this set $\{0,1,2\}$.

A set is completely determined by its elements. Two sets are equal iff they have the same elements. Thus, $\{0,1,2\}$ and $\{2,1,0\}$ and $\{0,0,1,2,2\}$ are all the same set-the unique set containing the three elements 0,1 , and 2 . The empty set, which we write $\}$, is the unique set that has no elements.

The most common operations on sets are:
$\cap$ intersection $\cup$ union $\subseteq$ subset $\backslash$ set difference
Here are their definitions and examples of their use:
$S \cap T$ The set of elements in both $S$ and $T . \quad\{1,-1 / 2,3\} \cap\{1,2,3,5,7\}=\{1,3\}$
$S \cup T$ The set of elements in $S$ or $T$ (or both). $\{1,-1 / 2\} \cup\{1,5,7\}=\{1,-1 / 2,5,7\}$
$S \subseteq T$ True iff every element of $S$ is an element of $T . \quad\{1,3\} \subseteq\{3,2,1\}$
$S \backslash T$ The set of elements in $S$ that are not in $T . \quad\{1,-1 / 2,3\} \backslash\{1,5,7\}=\{-1 / 2,3\}$
This is all you need to know about sets before we start looking at how to specify systems. We'll return to set theory in Section 7.1.

### 2.3 Predicate Logic

Once we have sets, it's natural to say that some formula is true for all the elements of a set, or for some of the elements of a set. Predicate logic extends propositional logic with the two quantifiers:

```
\forall universal quantification (for all)
\exists existential quantification (there exists)
```

The formula $\forall x \in S: F$ asserts that formula $F$ is true for every element $x$ in the set $S$. For example, $\forall n \in N a t: n+1>n$ asserts that the formula $n+1>n$ is true for all elements $n$ of the set Nat of natural numbers. This formula happens to be true.

The formula $\exists x \in S: F$ asserts that formula $F$ is true for at least one element $x$ in $S$. For example, $\exists n \in N a t: n^{2}=2$ asserts that there exists a natural number $n$ whose square equals 2. This formula happens to be false.

Formula $F$ is true for all $x \in S$ iff there is no $x \in S$ for which $F$ is false, which is true iff there is no $x \in S$ for which $\neg F$ is true. Hence, the formula
(1) $(\exists x \in S: F) \equiv \neg(\forall x \in S: \neg F)$
is a tautology of predicate logic. ${ }^{6}$
Since there exists no element in the empty set, the formula $\exists x \in\}: F$ is false for every formula $F$. By (1), this implies that $\forall x \in\}: F$ must be true for every $F$.

The quantification in the formulas $\forall x \in S: F$ and $\exists x \in S: F$ is said to be bounded, since these formulas make an assertion only about elements in the set $S$. There is also unbounded quantification. The formula $\forall x: F$ asserts that $F$ is true for all values $x$, and $\exists x: F$ asserts that $F$ is true for at least one value of $x$-a value that is not constrained to be in any particular set. Bounded and unbounded quantification are related by the following tautologies:

$$
\begin{aligned}
& (\forall x \in S: F) \equiv(\forall x:(x \in S) \Rightarrow F) \\
& (\exists x \in S: F) \equiv(\exists x:(x \in S) \wedge F)
\end{aligned}
$$

The analog of (1) for unbounded quantifiers is also a tautology:

$$
(\exists x: F) \equiv \neg(\forall x: \neg F)
$$

Whenever possible, it is better to use bounded than unbounded quantification in a specification. This makes the specification easier for both people and tools to understand.

Logicians have rules for proving such predicate-logic tautologies, but you shouldn't need them. You should become familiar enough with predicate logic that simple tautologies are obvious. It can help to think of $\forall x \in S: F$ as the conjunction of the formulas obtained by substituting all possible elements of $S$ for $x$ in $F$. The associativity and commutativity of conjunction then lead to the tautology:

$$
(\forall x \in S: F) \wedge(\forall x \in S: G) \equiv(\forall x \in S: F \wedge G)
$$

Similarly, you can think of $\exists x \in S: F$ as the disjunction of formulas, so associativity and commutativity of disjunction imply:

$$
(\exists x \in S: F) \vee(\exists x \in S: G) \equiv(\exists x \in S: F \vee G)
$$

for any set $S$ and formulas $F$ and $G$.
Mathematicians use some obvious abbreviations for nested quantifiers. For example:

```
\forallx\inS,y\inT:F means }\forallx\inS:(\forally\inT:F
\existsw,x,y,z\inS:F means \existsw\inS:(\existsx\inS:(\existsy\inS:(\existsz\inS:F)))
```

In the expression $\exists x \in S: F$, logicians say that $x$ is a bound variable and that occurrences of $x$ in $F$ are bound. For example, $n$ is a bound variable in the formula $\exists n \in N a t: n+1>n$, and the two occurrences of $n$ in the subexpression $n+1>n$ are bound. A variable $x$ that's not bound is said to be free, and occurrences of $x$ that are not bound are called free occurrences.

[^2]This terminology is rather misleading. A bound variable doesn't really occur in a formula because replacing it by some new variable doesn't change the formula. The two formulas

$$
\exists n \in N a t: n+1>n \quad \exists x \in N a t: x+1>x
$$

are completely equivalent. Calling $n$ a variable of the first formula is a bit like calling $a$ a variable of that formula because it appears in the name Nat. Although misleading, this terminology is common and often convenient.

## 3 Specifying a Simple Clock

### 3.1 Behaviors

Before we try to specify a system, let's look at how scientists do it. For centuries, they have described a system with equations that determine how its state evolves with time, where the state consists of the values of variables. For example, the state of the system comprising the earth and the moon might be described by the values of the four variables $e_{-} p o s, m_{-} p o s$, $e_{-}$vel, and m_vel, representing the positions and velocities of the two bodies. These values are elements in a 3 -dimensional space. The earth-moon system is described by equations expressing the variables' values as functions of time and of certain constants-namely, their masses and initial positions and velocities.

A behavior of the earth-moon system consists of a function $F$ from time to states, $F(t)$ representing the state of the system at time $t$. A computer system differs from the systems traditionally studied by scientists because we can pretend that its state changes in discrete steps. So, we represent the execution of a system as a sequence of states. Formally, we define a behavior to be a sequence of states, where a state is an assignment of values to variables. We specify a system by specifying a set of possible behaviors - the ones representing a correct execution of the system.

### 3.2 An Hour Clock

Let's start with a very trivial system - a digital clock that displays only the hour. To make the system completely trivial, we ignore the relation between the display and the actual time. The hour clock is then just a device whose display cycles through the values 1 through 12. Let the variable $h r$ represent the clock's display. A typical behavior of the clock is the sequence

$$
\begin{equation*}
[h r=11] \rightarrow[h r=12] \rightarrow[h r=1] \rightarrow[h r=2] \rightarrow \cdots \tag{2}
\end{equation*}
$$

of states, where $[h r=11]$ is a state in which the variable $h r$ has the value 11. A pair of successive states, such as $[h r=1] \rightarrow[h r=2]$, is called a step.

To specify the hour clock, we describe all its possible behaviors. We write an initial predicate that specifies the possible initial values of $h r$, and a next-state relation that specifies how the value of $h r$ can change in any step.

We don't want to specify exactly what the display reads initially; any hour will do. So, we want the initial predicate to assert that $h r$ can have any value from 1 through 12 . Let's call the initial predicate HCini. We might informally define HCini by:

$$
\text { HCini } \triangleq h r \in\{1, \ldots, 12\}
$$

Later, we'll see how to write this definition formally, without the "..." that stands for the informal and so on.

The next-state relation HCnxt is a formula expressing the relation between the values of $h r$ in the old (first) state and new (second) state of a step. We let $h r$ represent the value of $h r$ in the old state and $h r^{\prime}$ represent its value in the new state. (The ' in $h r^{\prime}$ is read prime.) We
want the next-state relation to assert that $h r^{\prime}$ equals $h r+1$ except if $h r$ equals 12 , in which case $h r^{\prime}$ should equal 1. Using an IF/THEN/ELSE construct with the obvious meaning, we can define $H C n x t$ to be the next-state relation by writing:

$$
\text { HCnxt } \triangleq h r^{\prime}=\mathrm{IF} h r \neq 12 \text { THEN } h r+1 \text { ELSE } 1
$$

$H C n x t$ is an ordinary mathematical formula, except that it contains primed as well as unprimed variables. Such a formula is called an action. An action is true or false of a step. A step that satisfies the action HCnxt is called an HCnxt step.

When an HCnxt step occurs, we sometimes say that HCnxt is executed. However, it would be a mistake to take this terminology seriously. An action is a formula, and formulas aren't executed.

We want our specification to be a single formula, not the pair of formulas HCini and $H C n x t$. We write this formula with the temporal-logic operator $\square$ (pronounced box). The temporal formula $\square F$ asserts that formula $F$ is always true. In particular, $\square H C n x t$ is the assertion that HCnxt is true for every step in the behavior. A formula without a $\square$ is an assertion about the beginning of the behavior. So, HCini $\wedge \square H C n x t$ is true of a behavior iff the initial state satisfies HCini and every step satisfies HCnxt. This formula describes all behaviors like the one in (2); it seems to be the specification we're looking for.

If we considered the clock only in isolation and never tried to relate it to another system, then this would be a fine specification. However, suppose the clock is part of a larger system - for example, the hour display of a weather station that displays the current hour and temperature. The state of the station is described by two variables: $h r$, representing the hour display, and tmp, representing the temperature display. Consider this behavior of the weather station:

$$
\begin{aligned}
& {\left[\begin{array}{l}
h r=11 \\
\operatorname{tmp}=23.5
\end{array}\right] \rightarrow\left[\begin{array}{l}
h r=12 \\
t m p=23.5
\end{array}\right] \rightarrow\left[\begin{array}{l}
h r=12 \\
t m p=23.4
\end{array}\right] \rightarrow\left[\begin{array}{l}
h r=12 \\
t m p=23.3
\end{array}\right] \rightarrow} \\
& {\left[\begin{array}{l}
h r=1 \\
t m p=23.3
\end{array}\right] \rightarrow \cdots}
\end{aligned}
$$

In the second and third steps, tmp changes but $h r$ remains the same. These steps are not allowed by $\square H C n x t$, which asserts that every step must increment $h r$. The formula $H C i n i \wedge$ $\square H C n x t$ does not describe the hour clock in the weather station.

A formula that describes any hour clock must allow steps that leave $h r$ unchanged-in other words, $h r^{\prime}=h r$ steps. These are called stuttering steps of the clock. A specification of the hour clock should allow both HCnxt steps and stuttering steps. So, a step should be allowed iff it is either an HCnxt step or a stuttering step-that is, iff it is a step satisfying $H C n x t \vee\left(h r^{\prime}=h r\right)$. This suggests that we adopt HCini $\wedge \square\left(H C n x t \vee\left(h r^{\prime}=h r\right)\right)$ as our specification. In TLA, we let $[H C n x t]_{h r}$ stand for $H C n x t \vee\left(h r^{\prime}=h r\right)$, so we can write the formula more compactly as $H C i n i \wedge \square[H C n x t]_{h r}$.

The formula $H C i n i \wedge \square[H C n x t]_{h r}$ does allow stuttering steps. In fact, it allows the behavior

$$
[h r=11] \rightarrow[h r=12] \rightarrow[h r=12] \rightarrow[h r=12] \rightarrow[h r=12] \rightarrow \cdots
$$

that ends with an infinite sequence of stuttering steps. This behavior describes a clock whose display attains the value 12 and then keeps that value forever-in other words, a clock that stops at 12. In a like manner, we can represent a terminating execution of any system by an infinite behavior that ends with a sequence of nothing but stuttering steps. We have no need of finite behaviors (finite sequences of states), so we consider only infinite ones.

It's natural to require that a clock does not stop, so our specification should assert that there are infinitely many nonstuttering steps. Section 8 explains how to express this requirement. For now, we content ourselves with clocks that may stop, and we take as our specification of an hour clock the formula $H C$ defined by

$$
H C \triangleq H C i n i \wedge \square[H C n x t]_{h r}
$$

### 3.3 A Closer Look at the Hour-Clock Specification

A state is an assignment of values to variables, but what variables? The answer is simple: all variables. In the behavior $(2),[h r=1]$ represents some particular state that assigns the value 1 to $h r$. It might assign the value 23 to the variable $t m p$ and the value $\sqrt{-17}$ to the variable $m \_p o s$. We can think of a state as representing a potential state of the entire universe. A state that assigns 1 to $h r$ and a particular point in 3-space to $m_{-} p o s$ describes a state of the universe in which the hour clock reads 1 and the moon is in a particular place. A state that assigns $\sqrt{-2}$ to $h r$ doesn't correspond to any state of the universe that we recognize, because the hour-clock can't display the value $\sqrt{-2}$. It might represent the state of the universe after a bomb fell on the clock, making its display purely imaginary.

A behavior is an infinite sequence of states-for example:

$$
\begin{equation*}
[h r=11] \rightarrow[h r=77.2] \rightarrow[h r=78.2] \rightarrow[h r=\sqrt{-2}] \rightarrow \cdots \tag{3}
\end{equation*}
$$

A behavior describes a potential history of the universe. The behavior (3) doesn't correspond to a history that we understand, because we don't know how the clock's display can change from 11 to 77.2 . Whatever kind of history it represents is not one in which the clock is doing what it's supposed to.

Formula $H C$ is a temporal formula. A temporal formula is an assertion about behaviors. We say that a behavior satisfies $H C$ iff $H C$ is a true assertion about the behavior. Behavior (2) satisfies formula $H C$. Behavior (3) does not, because $H C$ asserts that every step satisfies $H C n x t$, and the first and third steps of (3) don't. (The second step, $[h r=77.2] \rightarrow[h r=78.2]$, does satisfy $H C n x t$.) We regard formula $H C$ to be the specification of an hour clock because it is satisfied by exactly those behaviors that represent histories of the universe in which the clock functions properly.

If the clock is behaving properly, then its display should be an integer from 1 through 12. So, $h r$ should be an integer from 1 through 12 in every state of any behavior satisfying the clock's specification, $H C$. Formula $H C i n i$ asserts that $h r$ is an integer from 1 through 12, and $\square H C i n i$ asserts that $H C i n i$ is always true. So, $\square H C i n i$ should be true for any behavior satisfying $H C$. Another way of saying this is that $H C$ implies $\square H C i n i$, for any behavior.
 satisfied by every behavior is called a theorem, so $H C \Rightarrow \square H C i n i$ should be a theorem. It's easy to see that it is: $H C$ implies that $H C i n i$ is true initially (in the first state of the behavior), and $\square[H C n x t]_{h r}$ implies that each step either advances $h r$ to its proper next value or else leaves $h r$ unchanged. We can formalize this reasoning using the proof rules of TLA, but I'm not going to delve into proofs and proof rules.

### 3.4 The Hour-Clock Specification in TLA ${ }^{+}$

Figure 1 shows how the hour clock specification can be written in TLA ${ }^{+}$. There are two versions: the ASCII version on the bottom is the actual TLA+ specification, the way you type it; the top version is typeset the way a "pretty-printer" might display it. Before trying to understand the specification, observe the relation between the two syntaxes:

- Reserved words that appear in small upper-case letters (like EXTENDS) are written in ASCII with ordinary upper-case letters.
- When possible, symbols are represented pictorially in ASCII-for example, $\square$ is typed as [] and $\neq$ as \#. (You can also type $\neq$ as $/=$.)
- When there is no good ASCII representation, $\mathrm{T}_{\mathrm{E}} \mathrm{X}$ notation [5] is used-for example, $\in$ is typed as \in.

```
                                    module HourClock
    Extends Naturals
    variable hr
    HCini \triangleq hr ( (1.. 12)
    HCnxt}\triangleqh\mp@subsup{r}{}{\prime}= IF hr\not=12 THEN hr +1 ELSE 1/
    HC\triangleq HCini}\wedge\square[HCnxt] hr
    THEOREM HC = \squareHCini
```

```
----------------------- MODULE HourClock ----------------------------
```

----------------------- MODULE HourClock ----------------------------
EXTENDS Naturals
EXTENDS Naturals
VARIABLE hr
VARIABLE hr
HCini == hr \in (1 . . 12)
HCini == hr \in (1 . . 12)
HCnxt == hr' = IF hr \# 12 THEN hr + 1 ELSE 1
HCnxt == hr' = IF hr \# 12 THEN hr + 1 ELSE 1
HC == HCini /\ [][HCnxt]_hr
HC == HCini /\ [][HCnxt]_hr
-----------------------------------------------------------------------
-----------------------------------------------------------------------
THEOREM HC => [] HCini
THEOREM HC => [] HCini
====================================================================

```
====================================================================
```

Figure 1: The hour clock specification-typeset and ASCII versions.

A complete list of symbols and their ASCII equivalents appears at the end of this chapter in Figure 15. I will usually show the typeset version of a specification.

Now let's look at what the specification says. It starts with

```
\squareMODULE HourClock
```

which begins a module named HourClock. TLA ${ }^{+}$specifications are partitioned into modules; the hour clock's specification consists of this single module.

Arithmetic operators like + are not built into $\mathrm{TLA}^{+}$, but are themselves defined in modules. (You might want to write a specification in which + means addition of matrices rather than numbers.) The usual operators on natural numbers are defined in the Naturals module. Their definitions are incorporated into module HourClock by the statement

## EXTENDS Naturals

Every symbol that appears in a formula must either be a built-in operator of TLA ${ }^{+}$, or else it must be declared or defined. The statement

## VARIABLE $h r$

declares $h r$ to be a variable.
To define $H C$ ini, we need to express the set $\{1, \ldots, 12\}$ formally, without the ellipsis "...". We can write this set out completely as

$$
\{1,2,3,4,5,6,7,8,9,10,11,12\}
$$

but that's tiresome. Instead, we use the operator .. defined in the Naturals modules to write this set as $1 \ldots 12$. In general $i \ldots j$ is the set of integers from $i$ through $j$, for any integers $i$ and $j$. (It equals the empty set if $j<i$.) It's now obvious how to write the definition of HCini. The definitions of HCnxt and $H C$ are written just as before. (The ordinary mathematical operators of logic and set theory, like $\wedge$ and $\in$, and the IF construct are built into TLA ${ }^{+}$.)

The line
can appear anywhere between statements; it's purely cosmetic and has no meaning. Following it is the statement

$$
\text { THEOREM } H C \Rightarrow \square H C i n i
$$

of the theorem that was discussed above. This statement asserts that the formula $H C \Rightarrow$ $\square H C i n i$ is true in the context of the statement. More precisely, it asserts that the formula follows logically from the definitions in this module, the definitions in the Naturals module, and the rules of TLA ${ }^{+}$. If the formula were not true, then the module would be incorrect.

The module is terminated by the symbol

The specification of the hour clock is the definition of $H C$, including the definitions of the formulas HCnxt and HCini and of the operators .. and + that appear in the definition of HC. Formally, nothing in the module tells us that HC rather than HCini is the clock's specification. TLA ${ }^{+}$is a language for writing mathematics - in particular, for writing mathematical definitions and theorems. What those definitions represent, and what significance we attach to those theorems, lies outside the scope of mathematics and therefore outside the scope of TLA ${ }^{+}$. Engineering requires not just the ability to use mathematics, but the ability to understand what, if anything, the mathematics tells us about an actual system.

### 3.5 Another Way to Specify the Hour Clock

The Naturals module also defines the modulus operator, which we write \%. The formula $i \% n$, which mathematicians write $i \bmod n$, is the remainder when $i$ is divided by $n$. More formally, $i \% n$ is the natural number less than $n$ satisfying $i=q * n+(i \% n)$ for some natural number $q$. Let's express this condition mathematically. The Naturals module defines Nat to be the set of natural numbers, and the assertion that there exists a $q$ in the set Nat satisfying a formula $F$ is written $\exists q \in N a t: F$. Thus, if $i$ and $n$ are elements of Nat and $n>0$, then $i \% n$ is the unique number satisfying

$$
(i \% n \in 0 \ldots(n-1)) \wedge(\exists q \in N a t: i=q * n+(i \% n))
$$

We can use \% to simplify our hour-clock specification a bit. Observing that ( $11 \% 12$ ) +1 equals 12 and $(12 \% 12)+1$ equals 1 , we can define a different next-state action $H C n x t 2$ and a different formula $H C 2$ to be the clock specification:

$$
H C n x t 2 \triangleq h r^{\prime}=(h r \% 12)+1 \quad H C 2 \triangleq H C i n i \wedge \square[H C n x t 2]_{h r}
$$

Actions HCnxt and HCnxt2 are not equivalent. The step $[h r=24] \rightarrow[h r=25]$ satisfies HCnxt but not HCnxt2, while the step $[h r=24] \rightarrow[h r=1]$ satisfies HCnxt2 but not HCnxt. However, any step starting in a state with $h r$ in $1 . .12$ satisfies HCnxt iff it satisfies $H C n x t 2$. It's therefore not hard to deduce that any behavior starting in a state satisfying HCini satisfies $\square[H C n x t]_{h r}$ iff it satisfies $\square[H C n x t 2]_{h r}$. Hence, formulas $H C$ and $H C 2$ are equivalent. It doesn't matter which of them we take to be the specification of an hour clock.

Mathematics provides infinitely many ways of expressing the same thing. The expressions $6+6,3 * 4$, and $141-129$ all have the same meaning; they are just different ways of writing the number 12. We could replace either instance of the number 12 in module HourClock by any of these expressions without changing the meaning of any of the module's formulas.

When writing a specification, you will often be faced with a choice of how to express something. When that happens, you should first make sure that the choices yield equivalent specifications. If they do, then you can choose the one that you feel makes the specification easiest to understand. If they don't, then you must decide which one you mean.

## 4 An Asynchronous Interface

We now specify an interface for transmitting data between asynchronous devices. A sender and a receiver are connected as shown here:


Data is sent on val, and the $r d y$ and $a c k$ lines are used for synchronization. The sender must wait for an acknowledgement (an $A c k$ ) for one data item before it can send the next. The interface uses the standard two-phase handshake protocol, described by the following sample behavior.

$$
\begin{aligned}
& {\left[\begin{array}{l}
v a l=26 \\
r d y=0 \\
\text { ack }=0
\end{array}\right] \xrightarrow{\text { Send } 37}\left[\begin{array}{l}
v a l=37 \\
r d y=1 \\
a c k=0
\end{array}\right] \xrightarrow{\text { Ack }}\left[\begin{array}{l}
v a l=37 \\
r d y=1 \\
\text { ack }=1
\end{array}\right] \xrightarrow{\text { Send } 4}\left[\begin{array}{l}
v a l=4 \\
r d y=0 \\
a c k=1
\end{array}\right] \xrightarrow{\text { Ack }}} \\
& {\left[\begin{array}{l}
\text { val }=4 \\
r d y=0 \\
a c k=0
\end{array}\right] \xrightarrow{\text { Send } 19}\left[\begin{array}{l}
v a l=19 \\
r d y=1 \\
a c k=0
\end{array}\right] \xrightarrow{\text { Ack }} \ldots}
\end{aligned}
$$

(It doesn't matter what value val has in the initial state.)
It's easy to see from this sample behavior what the set of all possible behaviors should be - once we decide what the data values are that can be sent. But, before writing the TLA ${ }^{+}$ specification that describes these behaviors, let's look at what I've just done.

In writing this behavior, I made the decision that val and $r d y$ should change in a single step. The values of the variables val and rdy represent voltages on some set of wires in the physical device. Voltages on different wires don't change at precisely the same instant. I decided to ignore this aspect of the physical system and pretend that the values of val and $r d y$ represented by those voltages change instantaneously. This simplifies the specification, but at the price of ignoring what may be an important detail of the system. In an actual implementation of the protocol, the voltage on the $r d y$ line shouldn't change until the voltages on the val lines have stabilized; but you won't learn that from my specification. Had I wanted the specification to convey this requirement, I would have written a behavior in which the value of $v a l$ and the value of $r d y$ change in separate steps.

A specification is an abstraction. It describes some aspects of the system and ignores others. We want the specification to be as simple as possible, so we want to ignore as many details as we can. But, whenever we omit some aspect of the system from the specification, we admit a potential source of error. With my specification, we can verify the correctness of a system that uses this interface, and the system could still fail because the implementor didn't know that the val line should stabilize before the $r d y$ line is changed.

The hardest part of writing a specification is choosing the proper abstraction. I can teach you about TLA ${ }^{+}$, so expressing an abstract view of a system as a TLA ${ }^{+}$specification becomes a straightforward task. But, I don't know how to teach you about abstraction. A good engineer knows how to abstract the essence of a system and suppress the unimportant details when specifying and designing it. The art of abstraction is learned only through experience.

When writing a specification, you must first choose the abstraction. In a TLA ${ }^{+}$specification, this means choosing (i) the variables that represent the system's state and (ii) the granularity of the steps that change those variables' values. Should the rdy and ack lines be represented as separate variables or as a single variable? Should val and $r d y$ change in one
step, two steps, or an arbitrary number of steps? To help make these choices, I recommend that you start by writing the first few steps of one or two sample behaviors, just as I did at the beginning of this section. Section 9 has more to say about these choices.

### 4.1 The First Specification

Now let's specify the interface with a module AsynchInterface. The variables $r d y$ and ack can assume the values 0 and 1, which are natural numbers, so our module ExTENDS the Naturals module. We next decide what the possible values of val should be - that is, what data values may be sent. We could write a specification that places no restriction on the data values. The specification could allow the sender to first send 37 , then send $\sqrt{-15}$, and then send Nat (the entire set of natural numbers). However, any real device can send only a restricted set of values. We could pick some specific set-for example, 32 -bit numbers. However, the protocol is the same regardless of whether it's used to send 32 -bit numbers or 128 -bit numbers. So, we compromise between the two extremes of allowing anything to be sent and allowing only 32-bit numbers to be sent by assuming only that there is some set Data of data values that may be sent. The constant Data is a parameter of the specification. It's declared by the statement

## CONSTANT Data

Our three variables are declared by

## VARIABLES val, rdy, ack

The keywords variable and variables are synonymous, as are constant and constants.
The variable $r d y$ can assume any value - for example, $-1 / 2$. That is, there exist states that assign the value $-1 / 2$ to $r d y$. When discussing the specification, we usually say that $r d y$ can assume only the values 0 and 1 . What we really mean is that the value of $r d y$ equals 0 or 1 in every state of any behavior satisfying the specification. But, a reader of the specification shouldn't have to understand the complete specification to figure this out. We can make the specification easier to understand by telling the reader what values the variables can assume in a behavior that satisfies the specification. We could do this with comments, but I prefer to use a definition like this one:

$$
\text { TypeInvariant } \triangleq(\text { val } \in \text { Data }) \wedge(r d y \in\{0,1\}) \wedge(\text { ack } \in\{0,1\})
$$

I call the set $\{0,1\}$ the type of rdy, and I call TypeInvariant a type invariant. Let's define type and some other terms more precisely:

- A state function is an ordinary expression (one with no prime or $\square$ ) that can contain variables and constants.
- A state predicate is a Boolean-valued state function.
- An invariant Inv of a specification Spec is a state predicate such that Spec $\Rightarrow \square I n v$ is a theorem.
- A variable $v$ has type $T$ in a specification Spec iff $v \in T$ is an invariant of Spec.

We can make the definition of TypeInvariant easier to read by writing it as follows.

$$
\begin{aligned}
\text { TypeInvariant } \triangleq & \wedge \text { val } \in \text { Data } \\
& \wedge r d y \in\{0,1\} \\
& \wedge a c k \in\{0,1\}
\end{aligned}
$$

Each conjunct begins with a $\wedge$ and must lie completely to the right of that $\wedge$. (The conjunct may occupy multiple lines). We use a similar notation for disjunctions. When using this bulleted-list notation, the $\wedge$ 's or $\vee$ 's must line up precisely (even in the ASCII input). Because the indentation is significant, we can eliminate parentheses, making this notation especially useful when conjunctions and disjunctions are nested.

The initial predicate is straightforward. Initially, val can equal any element of Data. We can start with $r d y$ and ack either both 0 or both 1 .

$$
\begin{aligned}
\text { Init } \triangleq & \wedge v a l \in D a t a \\
& \wedge r d y \in\{0,1\} \\
& \wedge a c k=r d y
\end{aligned}
$$

Now for the next-state action Next. A step of the protocol either sends a value or receives a value. We define separately the two actions Send and Rcv that describe the sending and receiving of a value. A Next step (one satisfying action Next) is either a Send step or a Rcv step, so it is a Send $\vee R c v$ step. Therefore, Next is defined to equal Send $\vee R c v$. Let's now define Send and Rcv.

We say that action Send is enabled in a state from which it is possible to take a Send step. From the sample behavior above, we see that Send is enabled iff $r d y$ equals act. Usually, the first question we ask about an action is, when is it enabled? So, the definition of an action usually begins with its enabling condition. The first conjunct in the definition of Send is therefore $r d y=a c k$. The next conjuncts tell us what the new values of the variables val, $r d y$, and ack are. The new value val' of val can be any element of Data-that is, any value satisfying val ${ }^{\prime} \in$ Data. The value of $r d y$ changes from 0 to 1 or from 1 to 0 , so $r d y^{\prime}$ equals $1-r d y$ (because $1=1-0$ and $0=1-1$ ). The value of $a c k$ is left unchanged. TLA ${ }^{+}$defines unchanged $v$ to mean that the expression $v$ has the same value in the old and new states. More precisely, unchanged $v$ equals $v^{\prime}=v$, where $v^{\prime}$ is the expression obtained from $v$ by priming all variables. So, we define Send by:

$$
\begin{aligned}
\text { Send } \triangleq & \wedge r d y=a c k \\
& \wedge v a l^{\prime} \in D a t a \\
& \wedge r d y^{\prime}=1-r d y \\
& \wedge \text { UNCHANGED } a c k
\end{aligned}
$$

(I could have written $a c k^{\prime}=a c k$ instead of UNChANGED $a c k$, but I prefer the UNCHANGED construct.)

A $R c v$ step is enabled iff $r d y$ is different from ack; it complements the value of ack and leaves $r d y$ and ack unchanged. Both $r d y$ and ack are left unchanged iff the pair of values $r d y$, $a c k$ is left unchanged. TLA ${ }^{+}$uses angle brackets 〈 and 〉 to enclose ordered tuples, so Rcv asserts that $\langle r d y, a c k\rangle$ is left unchanged. (Angle brackets are typed in ASCII as << and >>.) The definition of $R c v$ is therefore:

$$
\begin{aligned}
R c v \triangleq & \wedge r d y \neq a c k \\
& \wedge a c k^{\prime}=1-a c k \\
& \wedge \text { UNCHANGED }\langle v a l, r d y\rangle
\end{aligned}
$$

As in our clock example, the complete specification Spec should allow stuttering steps-in this case, ones that leave all three variables unchanged. So, Spec allows steps that leave $\langle v a l, r d y, a c k\rangle$ unchanged. Its definition is

$$
\text { Spec } \triangleq \text { Init } \wedge \square[N e x t]_{\langle v a l, r d y, a c k\rangle}
$$

Module AsynchInterface also asserts the invariance of TypeInvariant. It appears in full in Figure 2.
constant Data

$$
\text { VARIABLES } v a l, r d y, a c k
$$

$$
\text { TypeInvariant } \triangleq \wedge v a l \in \text { Data }
$$

$$
\wedge r d y \in\{0,1\}
$$

$$
\wedge \text { ack } \in\{0,1\}
$$

$$
\text { Init } \triangleq \wedge \text { val } \in \text { Data }
$$

$$
\wedge r d y \in\{0,1\}
$$

$$
\wedge a c k=r d y
$$

$$
\text { Send } \triangleq \wedge r d y=a c k
$$

$$
\wedge v a l^{\prime} \in D a t a
$$

$$
\wedge r d y^{\prime}=1-r d y
$$

$$
\wedge \text { UNCHANGED } a c k
$$

$R c v \triangleq \wedge r d y \neq a c k$
$\wedge a c k^{\prime}=1-a c k$
$\wedge$ UNCHANGED $\langle v a l, r d y\rangle$
Next $\triangleq$ Send $\vee R c v$
Spec $\triangleq$ Init $\wedge \square[N e x t]_{\langle v a l, r d y, a c k\rangle}$
THEOREM Spec $\Rightarrow \square$ TypeInvariant

Figure 2: The First Specification of an Asynchronous Interface

### 4.2 Another Specification

Module AsynchInterface is a fine description of the interface and its handshake protocol. However, it's not easy to use it to help specify a system that uses the interface. Let's rewrite the interface specification in a form that makes it more convenient to use as part of a larger specification.

The first problem with the original specification is that it uses three variables to describe a single interface. A system might use several different instances of the interface. To avoid a proliferation of variables, we replace the three variables val, rdy, ack with a single variable chan (short for channel). A mathematician would do this by letting the value of chan be an ordered triple - for example, a state $[$ chan $=\langle-1 / 2,0,1\rangle$ ] might replace the state with $v a l=-1 / 2, r d y=0$, and $a c k=1$. But programmers have learned that using tuples like this leads to mistakes; it's easy to forget if the ack line is represented by the second or third component. TLA ${ }^{+}$therefore provides records in addition to more conventional mathematical notation.

Let's represent the state of the channel as a record with val, $r d y$, and $a c k$ fields. If $r$ is such a record, then r.val is its val field. The type invariant asserts that the value of chan is an element of the set of all such records $r$ in which r.val is an element of the set Data and $r . r d y$ and $r$.ack are elements of the set $\{0,1\}$. This set of records is written:

$$
[\text { val }: \text { Data, rdy }:\{0,1\}, \text { ack }:\{0,1\}]
$$

The components of a record are not ordered, so it doesn't matter in what order we write them. This same set of records can also be written as:

$$
[\text { ack }:\{0,1\}, \text { val : Data, rdy }:\{0,1\}]
$$

Initially, chan can equal any element of this set whose $a c k$ and $r d y$ fields are equal, so the initial predicate is the conjunction of the type invariant and the condition chan.ack $=$ chan.rdy.

A system that uses the interface may perform an operation that sends some data value $d$ and performs some other changes that depend on the value $d$. We'd like to represent such an operation as an action that is the conjunction of two separate actions: one that describes the sending of $d$ and the other that describes the other changes. Thus, instead of defining an action Send that sends some unspecified data value, we define the action Send (d) that sends data value $d$. The next-state action is satisfied by a $\operatorname{Send}(d)$ step, for some $d$ in Data, or a $R c v$ step. (The value received by a $R c v$ step equals chan.val.) Saying that a step is a Send (d) step for some $d$ in Data means that there exists a $d$ in Data such that the step satisfies $\operatorname{Send}(d)$-in other words, that the step is an $\exists d \in \operatorname{Data}: \operatorname{Send}(d)$ step. So we define

$$
N e x t \triangleq(\exists d \in \operatorname{Data}: \operatorname{Send}(d)) \vee R c v
$$

The Send (d) action asserts that chan ${ }^{\prime}$ equals the record $r$ such that:

$$
r . v a l=d \quad r . r d y=1-\text { chan.rdy } \quad r . a c k=\text { chan.ack }
$$

This record is written in TLA ${ }^{+}$as:

$$
[v a l \mapsto d, \quad r d y \mapsto 1-\text { chan.rdy, ack } \mapsto \text { chan.ack }]
$$

(The symbol $\mapsto$ is typed in ASCII as I->.) The fields of records are not ordered, so this record can just as well be written:

$$
[\text { ack } \mapsto \text { chan.ack, } \quad \text { val } \mapsto d, \quad r d y \mapsto 1-\text { chan.rdy }]
$$

The enabling condition of $\operatorname{Send}(d)$ is that the $r d y$ and $a c k$ lines are equal, so we can define:

$$
\begin{aligned}
\operatorname{Send}(d) \triangleq & \wedge \text { chan.rdy }=\text { chan.ack } \\
& \wedge \text { chan }{ }^{\prime}=[\text { val } \mapsto d, \text { rdy } \mapsto 1-\text { chan.rdy, ack } \mapsto \text { chan.ack }]
\end{aligned}
$$

This is a perfectly good definition of $\operatorname{Send}(d)$. However, I prefer a slightly different one. We can describe the value of chan' by saying that it is the same as the value of chan except that its val component equals $d$ and its $r d y$ component equals $1-c h a n . r d y$. In TLA ${ }^{+}$, we can write this value as

$$
[\text { chan EXCEPT }!. v a l=d,!. r d y=1-\text { chan.rdy }]
$$

Think of the! as standing for chan, the record being modified by the EXCEPT expression. In the value replacing !.rdy, the symbol @ stands for chan.rdy, so we can write this expression as:

$$
[\text { chan } \operatorname{EXCEPT} \text { !.val }=d,!. r d y=1-@]
$$

In general, for any record $r$, the expression

$$
\left[r \operatorname{EXCEPT}!\cdot c_{1}=e_{1}, \ldots,!\cdot c_{n}=e_{n}\right]
$$

is the record obtained from $r$ by replacing $r . c_{i}$ with $e_{i}$, for each $i$ in $1 \ldots n$. An @ in the expression $e_{i}$ stands for $r . c_{i}$. Using this notation, we define:

$$
\begin{aligned}
\operatorname{Send}(d) \triangleq & \wedge \text { chan.rdy }=\text { chan.ack } \\
& \wedge \text { chan }{ }^{\prime}=[\text { chan } \operatorname{EXCEPT}!. v a l=d,!. r d y=1-@]
\end{aligned}
$$

The definition of $R c v$ is straightforward. A value can be received when chan.rdy $\neq$ chan.ack, and receiving the value complements chan.ack:

$$
\begin{aligned}
R c v \triangleq & \wedge \text { chan.rdy } \neq \text { chan.ack } \\
& \wedge \text { chan }^{\prime}=[\text { chan EXCEPT }!. a c k=1-@]
\end{aligned}
$$

| EXTENDS Naturals <br> CONSTANT Data <br> VARIABLE chan <br> TypeInvariant $\triangleq$ chan $\in[$ val $:$ Data, rdy $:\{0,1\}$, ack $:\{0,1\}]$ |  |  |
| :---: | :---: | :---: |
|  |  |  |
|  |  |  |
|  |  |  |
| $\begin{aligned} \text { Init } \triangleq & \wedge \text { TypeInvariant } \\ & \wedge \text { chan.ack }=\text { chan.rdy } \\ \operatorname{Send}(d) \triangleq & \wedge \text { chan.rdy }=\text { chan.ack } \\ & \wedge \text { chan }{ }^{\prime}=[\text { chan } \text { EXCEPT }!. v a l=d,!. r d y=1-@] \\ \operatorname{Rcv} \triangleq & \wedge \text { chan.rdy } \neq \text { chan.ack } \\ & \wedge \text { chan }{ }^{\prime}=[\text { chan } \text { EXCEPT }!. a c k=1-@] \end{aligned}$ |  |  |
|  |  |  |
|  |  |  |
| $\begin{aligned} & \text { Next } \triangleq(\exists d \in \text { Data }: \operatorname{Send}(d)) \vee R c v \\ & \text { Spec } \triangleq \text { Init } \wedge \square[\text { Next }]_{\text {chan }} \end{aligned}$ |  |  |
|  |  |  |
| THEOREM Spec $\Rightarrow \square$ TypeInvariant |  |  |

Figure 3: Our second specification of an asynchronous interface.

The complete specification appears in Figure 3.
We have now written two different specifications of the asynchronous interface. They are two different mathematical representations of the same physical system. In module AsynchInterface, we represented the system with the three variables val, rdy, and ack. In module Channel, we used a single variable chan. Since these two representations are at the same level of abstraction, they should, in some sense, be equivalent. Section 6.8 explains one sense in which they're equivalent.

### 4.3 Types: A Reminder

As defined in Section 4.1, a variable $v$ has type $T$ in specification Spec iff $v \in T$ is an invariant of Spec. Thus, $h r$ has type $1 \ldots 12$ in the specification $H C$ of the hour clock. This assertion does not mean that the variable $h r$ can assume only values in the set $1 \ldots 12$. A state is an arbitrary assignment of values to variables, so there exist states in which the value of $h r$ is $\sqrt{-2}$. The assertion does mean that, in every behavior satisfying formula $H C$, the value of $h r$ is an element of $1 \ldots 12$.

If you are used to types in programming languages, it may seem strange that $\mathrm{TLA}^{+}$allows a variable to assume any value. Why not restrict our states to ones in which variables have the values of the right type? In other words, why not add a formal type system to TLA ${ }^{+}$? A complete answer would take us too far afield. The question is addressed further in Section 7.2. For now, remember that $\mathrm{TLA}^{+}$is an untyped language. Type correctness is just a name for a certain invariance property. Assigning the name TypeInvariant to a formula gives it no special status.

### 4.4 Definitions

Let's examine what a definition means. If $I d$ is a simple identifier like Init or Spec, then the definition $I d \triangleq \exp$ defines $I d$ to be synonymous with the expression exp. Replacing $I d$ by exp, or vice-versa, in any expression $e$ does not change the meaning of $e$. This replacement must
be done after the expression is parsed, not in the "raw input". For example, the definition $x \triangleq a+b$ makes $x * c$ equal to $(a+b) * c$, not to $a+b * c$, which equals $a+(b * c)$.

The definition of Send has the form $I d(p) \triangleq \exp$, where $I d$ and $p$ are identifiers. For any expression $e$, this defines $I d(e)$ to be the expression obtained by substituting $e$ for $p$ in exp. For example, the definition of Send in the Channel module defines $\operatorname{Send}(-5)$ to equal
$\wedge$ chan.rdy $=$ chan.ack
$\wedge$ chan $=[$ chan EXCEPT $!. v a l=-5,!. r d y=1-@]$
$\operatorname{Send}(e)$ is an expression, for any expression $e$. Thus, $\operatorname{Send}(-5) \wedge($ chan.ack $=1)$ is a formula. The identifier Send by itself is not an expression, and Send $\wedge($ chan.ack $=1)$ is not a grammatically well-formed string. It's nonsyntactic nonsense, like $a * b+$.

We say that Send is an operator that takes a single argument. In the obvious way, we can define operators that take more than one argument, the general form being:

$$
\begin{equation*}
\operatorname{Id}\left(p_{1}, \ldots, p_{n}\right) \triangleq \exp \tag{4}
\end{equation*}
$$

where the $p_{i}$ are distinct identifiers and exp is an expression. We can consider defined identifiers like Init and Spec to be operators that take no argument, but we generally use operator to mean an operator that takes one or more arguments.

I will use the term symbol to mean an identifier like Send or an operator symbol like + . Every symbol that is used in a specification must either be a built-in operator of TLA ${ }^{+}$ (like $\in$ ) or it must be declared or defined. Every symbol declaration or definition has a scope within which the symbol may be used. The scope of a variable or constant declaration, and of a definition, is the rest of the module. Thus, we can use Init in any expression that follows its definition in module Channel. The statement Extends Naturals extends the scope of symbols like + defined in the Naturals module to the Channel module.

The operator definition (4) implicitly includes a declarations of the identifiers $p_{1}, \ldots, p_{n}$ whose scope is the expression exp. An expression of the form

$$
\exists v \in S: \exp
$$

has a declaration of $v$ whose scope is the expression $\exp$. Thus the identifier $v$ has a meaning within the expression exp (but not within the expression $S$ ).

A symbol cannot be declared or defined if it already has a meaning. The expression

$$
\left(\exists v \in S: \exp _{1}\right) \wedge\left(\exists v \in T: \exp _{2}\right)
$$

is all right, because neither declaration of $v$ lies within the scope of the other. Similarly, the two declarations of the symbol $d$ in the Channel module (in the definition of Send and in the expression $\exists d$ in the definition of $N e x t$ ) have disjoint scopes. However, the expression

$$
\left(\exists v \in S:\left(\exp _{1} \wedge \exists v \in T: \exp _{2}\right)\right)
$$

is illegal because the declaration of $v$ in the second $\exists v$ lies inside the scope of the its declaration in the first $\exists v$. Although conventional mathematics and programming languages allow such redeclarations, TLA ${ }^{+}$forbids them because they can lead to confusion and errors.

### 4.5 Comments

Even simple specifications like the ones in modules AsynchInterface and Channel can be hard to understand from the mathematics alone. That's why I began with an intuitive explanation of the interface. That explanation made it easier for you to understand formula Spec in the module, which is the actual specification. Every specification should be accompanied by an informal prose explanation. The explanation may be in an accompanying document, or it may be included as comments in the specification.


This module specifies a digital clock that displays the current hour. It ignores real time, not specifying when the display can change.

## EXTENDS Naturals

VARIABLE $h r \quad$ Variable $h r$ represents the display.
HCini $\triangleq h r \in(1 \ldots 12) \quad$ Initially, $h r$ can have any value from 1 through 12.
$H C n x t$ This is a weird place for a comment. $\triangleq$
The value of $h r$ cycles from 1 through 12.
$h r^{\prime}=\mathrm{IF} h r \neq 12$ THEN $h r+1$ ELSE 1
$H C \triangleq H C i n i \wedge \square[H C n x t]_{h r}$
The complete spec. It permits the clock to stop.

THEOREM $H C \Rightarrow \square H C$ ini $\quad$ Type-correctness of the spec.


Figure 4: The hour clock specification with comments.

Figure 4 shows how the hour clock's specification in module HourClock might be explained by comments. In the typeset version, comments are distinguished from the specification itself by the use of a different font. As shown in the figure, TLA ${ }^{+}$provides two way of writing comments in the ASCII version. A comments may appear anywhere enclosed between ( $*$ and *). The text of the comment itself may not contain $*$ ), so these comments can't be nested. An end-of-line comment is preceded by $\backslash *$.

A comment almost always appears on a line by itself or at the end of a line. I put a comment between HCnxt and $\triangleq$ just to show that it can be done.

To save space, I will write few comments in the example specifications. But specifications should have lots of comments. Even if there is an accompanying document describing the system, comments are needed to help the reader understand how the specification formalizes that description.

Comments can help solve a problem posed by the logical structure of a specification. A symbol has to be declared or defined before it can be used. In module Channel, the definition of Spec has to follow the definition of Next, which has to follow the definitions of Send and Rcv. But it's usually easiest to understand a top-down description of a system. We would probably first want to read the declarations of Data and chan, then the definition of Spec, then the definitions of Init and Next, and then the definitions of Send and Rcv. In other words, we want to read the specification more or less from bottom to top. This is easy enough to do for a module as short as Channel; it's inconvenient for longer specifications. We can use comments to guide the reader through a longer specification. For example, we could precede the definition of Send in the Channel module with the comment:

Actions Send and Rcv below are the disjuncts of the next-state action Next.
The module structure also allows us to choose the order in which a specification is read. For example, we can rewrite the hour-clock specification by splitting the HourClock module into three separate modules:

HCVar A module that declares the variable $h r$.
HCActions A module that Extends modules Naturals and HCVar and defines HCini and HCnxt.

HCSpec A module that ExTENDS module HCActions, defines formula $H C$, and asserts the type-correctness theorem.

The Extends relation implies a logical ordering of the modules: HCVar precedes HCActions, which precedes HCSpec. But the modules don't have to be read in that order. The reader can be told to read HCVar first, then HCSpec, and finally HCActions. The instance construct introduced below in Section 5 provides another tool for modularizing specifications.

Splitting a tiny specification like HourClock in this way would be ludicrous. But the proper splitting of modules can help make a large specification easier to read. When writing a specification, you should decide in what order it should be read. You can then design the module structure to permit reading it in that order, when each individual module is read from beginning to end. Finally, you should ensure that the comments within each module make sense when the different modules are read in the appropriate order.

## 5 A FIFO

Our next example is a FIFO buffer, a device with which a sender process transmits a sequence of values to a receiver. The sender and receiver use two channels, in and out, to communicate with the buffer:


Values are sent over in and out using the asynchronous protocol specified by the Channel module of Figure 3. The system's specification will allow behaviors with four kinds of nonstuttering steps: Send and Rcv actions on both the in channel and the out channel.

### 5.1 The Inner Specification

The specification of the FIFO first extends modules Naturals and Sequences. The Sequences module defines operations on finite sequences. We represent a finite sequence as a tuple, so the sequence of three numbers $3,2,1$ is the triple $\langle 3,2,1\rangle$. The sequences module defines the following operators on sequences.
$\operatorname{Seq}(S) \quad$ The set of all sequences of elements of the set $S$. For example, $\langle 3,7\rangle$ is an element of $S e q(N a t)$.

Head(s) The first element of sequence $s$. For example, Head $(\langle 3,7\rangle)$ equals 3.
$\operatorname{Tail}(s)$ The tail of sequence $s$ (all but the head of $s$ ). For example, $\operatorname{Tail}(\langle 3,7\rangle)$ equals $\langle 7\rangle$.
$\operatorname{Append}(s, e)$ The sequence obtained by appending element $e$ to the tail of sequence $s$. For example, Append $(\langle 3,7\rangle, 3)$ equals $\langle 3,7,3\rangle$.
$s \circ t \quad$ The sequence obtained by concatenating the sequences $s$ and $t$. For example, $\langle 3,7\rangle \circ\langle 3\rangle$ equals $\langle 3,7,3\rangle$. (We type $\circ$ in ASCII as $\backslash o$.)
$\operatorname{Len}(s) \quad$ The length of sequence $s$. For example, $\operatorname{Len}(\langle 3,7\rangle)$ equals 2.
The FIFO's specification continues by declaring the constant Message, which represents the set of all messages that can be sent. ${ }^{7}$ It then declares the variables. There are three variables: in and out, representing the channels, and a third variable $q$ that represents the queue of buffered messages. The value of $q$ is the sequence of messages that have been sent by the sender but not yet received by the receiver. (Section 5.3 has more to say about this additional variable $q$.)

We want to use the definitions in the Channel module to specify operations on the channels in and out. This requires two instances of that module - one in which the variable chan of the Channel module is replaced with the variable in of our current module, and the other in which chan is replaced with out. In both instances, the constant Data of the Channel module is replaced with Message. We obtain the first of these instances with the statement:

$$
\text { InChan } \triangleq \text { instance Channel with Data } \leftarrow \text { Message, chan } \leftarrow \text { in }
$$

For every symbol $\sigma$ defined in module Channel, this defines InChan! $\sigma$ to have the same meaning in the current module as $\sigma$ had in module Channel, except with Message substituted for

[^3]Data and in substituted for chan. For example, this statement defines InChan! TypeInvariant to equal

$$
\text { in } \in[\text { val }: \text { Message, rdy }:\{0,1\} \text {, ack }:\{0,1\}]
$$

We introduce our second instance of the Channel module with the analogous statement:

$$
\text { OutChan } \triangleq \text { instance Channel With Data } \leftarrow \text { Message, chan } \leftarrow \text { out }
$$

The initial states of the in and out channels are specified by InChan! Init and OutChan! Init. Initially, no messages have been sent or received, so $q$ should equal the empty sequence. The empty sequence is the zero-tuple (there's only one, and it's written $\rangle$ ), so we define the initial predicate to be:

$$
\begin{aligned}
\text { Init } \triangleq & \wedge \text { InChan!Init } \\
& \wedge \text { OutChan!Init } \\
& \wedge q=\langle \rangle
\end{aligned}
$$

We next define the type invariant. The type invariants for in and out come from the Channel module, and the type of $q$ is the set of finite sequences of messages. The type invariant for the FIFO specification is therefore:

$$
\begin{aligned}
\text { TypeInvariant } \triangleq & \wedge \text { InChan! TypeInvariant } \\
& \wedge \text { OutChan! TypeInvariant } \\
& \wedge q \in \text { Seq }(\text { Message })
\end{aligned}
$$

The four kinds of nonstuttering steps allowed by the next-state action are described by four actions:
$S S e n d(m s g)$ The sender sends message $m s g$ on the in channel.
BufRcv The buffer receives the message from the in channel and appends it to the tail of $q$.

BufSend The buffer removes the message from the head of $q$ and sends it on channel out.

RRcv The receiver receives the message from the out channel.
The definitions of these actions, along with the rest of the specification, are in module FIFO of Figure 5.

### 5.2 Instantiation Examined

### 5.2.1 Parameterized Instantiation

The FIFO specification uses two instances of module Channel-one with in substituted for chan and the other with out substituted for chan. We could instead use a single parametrized instance by putting the following statement in module InnerFIFO:

$$
\text { Chan }(\text { ch }) \triangleq \text { instance Channel with Data } \leftarrow \text { Message, chan } \leftarrow c h
$$

For any symbol $\sigma$ defined in module Channel and any expression exp, this defines Chan (exp)! $\sigma$ to equal formula $\sigma$ with Message substituted for Data and exp substituted for chan. The Rcv action on channel in could then be written Chan(in)!Rcv, and the Send (msg) action on channel out could be written Chan(out)! Send (msg).

```
    ExTENDS Naturals,Sequences
    CONSTANT Message
    VARIABLES in,out, q
    InChan }\triangleq INSTANCE Channel WITH Data \leftarrow Message, chan \leftarrow in
OutChan \triangleq INSTANCE Channel WITH Data \leftarrow Message, chan }\leftarrow\mathrm{ out
    Init }\triangleq\wedge InChan!Init
            ^OutChan!Init
            \wedge q= <\rangle
    TypeInvariant \triangleq ^ InChan!TypeInvariant
                                    ^OutChan!TypeInvariant
                                    \wedge q}\in\operatorname{Seq(Message)
    SSend (msg) \triangleq ^ InChan!Send(msg) Send msg on channel in.
                            ^ UNCHANGED \langleout, q\rangle
    BufRcv \triangleq ^ InChan!Rcv Receive message from channel in
        \wedge q}=\operatorname{Append}(q,in.val)\quad\mathrm{ and append it to tail of q.
        ^ UNCHANGED out
    BufSend \triangleq Eq|\> Enabled only if q is nonempty.
        ^OutChan!Send(Head (q)) Send Head(q) on channel out
        \wedge q ^ { \prime } = \operatorname { T a i l } ( q ) \quad \text { and remove it from } q \text { .}
        ^ UNCHANGED in
    RRcv}\triangleq^\mathrm{ OutChan!Rcv Receive message from channel out.
        ^ UNCHANGED \langlein, q\rangle
    Next\triangleq \\existsmsg}\in\mathrm{ Message : SSend (msg)
        \vee BufRcv
        \vee BufSend
        \vee RRcv
    Spec}\triangleq\mp@code{Init }\wedge\square[Next\mp@subsup{]}{\langlein,out,q\rangle}{
    THEOREM Spec }=>\square\mathrm{ TypeInvariant
```

Figure 5: The specification of a FIFO, with the internal variable $q$ visible.

### 5.2.2 Implicit Substitutions

The use of Message as the name for the set of transmitted values in the FIFO specification is a bit strange, since we had just used the name Data for the analogous set in the asynchronous channel specifications. Suppose we had used Data in place of Message as the constant parameter of module InnerFIFO. The first instantiation statement would then have been

$$
\text { InChan } \triangleq \text { instance Channel with Data } \leftarrow \text { Data, chan } \leftarrow \text { in }
$$

The substitution Data $\leftarrow$ Data indicates that the constant parameter Data of the instantiated module Channel is replaced with the expression Data of the current module. TLA ${ }^{+}$allows us to drop any substitution of the form $\sigma \leftarrow \sigma$, for a symbol $\sigma$. So, the statement above can be written as

## InChan $\triangleq$ INSTANCE Channel WITH chan $\leftarrow$ in

We know there is an implied Data $\leftarrow$ Data substitution because an instance statement must have a substitution for every parameter (CONSTANT or Variable) of the instantiated module. If some parameter $p$ has no explicit substitution, then there is an implicit substitution $p \leftarrow p$. This means that the Instance statement must lie within the scope of a declaration or definition of the symbol $p$.

It is quite common to instantiate a module with this kind of implicit substitution. Often, every parameter has an implicit substitution, in which case the list of explicit substitutions is empty. The with is then omitted.

### 5.2.3 Instantiation Without Renaming

So far, all the instantiations we've used have been with renaming. For example, the first instantiation of module Channel renames the defined symbol Send as InChan!Send. This kind of renaming is necessary if we are using multiple instances of the module, or a single parameterized instance. The two instances InChan! Init and OutChan! Init of Init in module InnerFIFO are different formulas, so they need different names.

Sometimes we need only a single instance of a module. For example, suppose we are specifying a system with only a single asynchronous channel. We then need only one instance of Channel, so we don't have to rename the instantiated symbols. In that case, we can write something like
instance Channel with Data $\leftarrow D$, chan $\leftarrow x$
This instantiates Channel with no renaming, but with substitution. Thus, it defines Rcv to be the formula of the same name from the Channel module, except with $D$ substituted for Data and $x$ substituted for chan. Of course, the expressions substituted for an instantiated module's parameters must be defined. So, this instance statement must be within the scope of the definitions or declarations of $D$ and $x$.

### 5.3 Hiding the Queue

Module InnerFIFO of Figure 5 defines Spec to be Init $\wedge \square[$ Next $]$... , the sort of formula we've become accustomed to as a system specification. However, formula Spec describes the value of variable $q$, as well as of the variables in and out. My picture of the FIFO system shows only channels in and out; it doesn't show anything inside the boxes. A specification of the FIFO should describe only the values sent and received on the channels. The variable $q$, which represents what's going on inside the box labeled Buffer, is used to specify what values are sent and received. In the final specification, it should be hidden.

In TLA, we hide a variable with the existential quantifier $\exists$ of temporal logic. The formula $\exists x: F$ is true of a behavior iff there exists some sequence of values - one in each state of the behavior - that can be assigned to the variable $x$ that will make formula $F$ true. (The meaning of $\boldsymbol{\exists}$ is defined more precisely in Section 8.6.)

The obvious thing to do now is to define the FIFO specification to be the formula $\exists q$ : Spec. However, we can't put this definition in module InnerFIFO because $q$ is already declared there, and a formula $\exists q: \ldots$ would redeclare it. Instead, we use a new module with a parametrized instantiation of the InnerFIFO module (see Section 5.2.1 above):

| $\square$ Constant Message |
| :--- |
| VARIABLES in, out |
| Inner $(q) \triangleq$ INSTANCE $\operatorname{InnerFIFO}$ |
| Spec $\triangleq \exists q: \operatorname{Inner}(q)!$ Spec |

Observe that the instance statement is an abbreviation for

$$
\begin{aligned}
\operatorname{Inner}(q) \triangleq & \text { instance } \operatorname{InnerFIFO} \\
& \quad \text { WITH } q \leftarrow q, \text { in } \leftarrow \text { in, out } \leftarrow \text { out }, \text { Message } \leftarrow \text { Message }
\end{aligned}
$$

The variable parameter $q$ of module InnerFIFO is instantiated with the parameter $q$ of the definition of Inner. The other parameters of the InnerFIFO module are instantiated with the parameters of module FIFO.

### 5.4 A Bounded FIFO

We have specified an unbounded FIFO-a buffer that can hold an unbounded number of messages. Any real system has a finite amount of resources, so it can contain only a bounded number of in-transit messages. In many situations, we wish to abstract away the bound on resources and describe a system in terms of unbounded FIFOs. In other situations, we may care about that bound. We then want to strengthen our specification by placing a bound $N$ on the number of outstanding messages.

A specification of a bounded FIFO differs from our specification of the unbounded FIFO only in that action BufRcv should be enabled only when there are fewer than $N$ messages in the buffer - that is, only when $\operatorname{Len}(q)$ is less than $N$. It would be easy to write a complete new specification of a bounded FIFO by copying module InnerFIFO and just adding the conjunct $\operatorname{Len}(q)<N$ to the definition of BufRcv. But let's use module InnerFIFO as it is, rather than copying it.

The next-state action BNext for the bounded FIFO is the same as the FIFO's next-state action Next except that it allows a BufRcv step only if $\operatorname{Len}(q)$ is less than $N$. In other words, BNext should allow a step only if (i) it's a Next step and (ii) if it's a BufRcv step, then $\operatorname{Len}(q)<N$ is true in the first state. In other words, BNext should equal

$$
\text { Next } \wedge(\text { BufRcv } \Rightarrow(\operatorname{Len}(q)<N))
$$

Module BoundedFIFO in Figure 6 contains the specification. It introduces the new constant parameter $N$. It also contains the statement

$$
\operatorname{ASSUME}(N \in N a t) \wedge(N>0)
$$

which asserts the assumption that $N$ is a positive natural number. An assumption has no effect on any definitions made in the module. However, it may be taken as a hypothesis when proving any theorems asserted in the module. In other words, a module asserts that

```
EXTENDS Naturals
VARIABLES in, out
CONSTANT Message, \(N\)
ASSUME \((N \in N a t) \wedge(N>0)\)
\(\operatorname{Inner}(q) \triangleq\) INSTANCE InnerFIFO
\(\operatorname{BNext}(q) \triangleq \wedge \operatorname{Inner}(q)!N e x t\)
    \(\wedge \operatorname{Inner}(q)!B u f R c v \Rightarrow(\operatorname{Len}(q)<N)\)
Spec \(\triangleq \exists q: \operatorname{Inner}(q)!\operatorname{Init} \wedge \square[B N e x t(q)]_{\langle i n, \text { out }, q\rangle}\)
```

Figure 6: Specification of a FIFO buffer of length $N$.
its assumptions imply its theorems. It's a good idea to assert this kind of simple assumption about constants.

An assume statement should only be used to assert assumptions about constants. The formula being assumed should not contain any variables. It might be tempting to assert type declarations as assumptions - for example, to add to module InnerFIFO the assumption $q \in \operatorname{Seq}($ Message $)$. However, that would be wrong because it asserts that, in any state, $q$ is a sequence of messages. As we observed in Section 4.3, a state is a completely arbitrary assignment of values to variables, so there are states in which $q$ has the value $\sqrt{-17}$. Assuming that such a state doesn't exist would contradict the laws of TLA.

You may wonder why module BoundedFIFO asserts that $N$ is a positive natural, but doesn't assert that Message is a set. Similarly, why didn't we have to specify that the constant parameter Data in our asynchronous interface specifications is a set? The answer is that, in TLA ${ }^{+}$, every value is a set. ${ }^{8}$ A value like the number 3, which we don't think of as a set, is formally a set. We just don't know what its elements are. The formula $2 \in 3$ is a perfectly reasonable one, but TLA ${ }^{+}$does not specify whether it's true or false. So, we don't have to assert that Message is a set because we know that it is one.

Although Message is automatically a set, it isn't necessarily a finite set. For example, Message could be instantiated with the set Nat of natural numbers. If you want to assume that a constant parameter is a finite set, then you need to state this as an assumption. (Writing a formula that asserts that a set is finite is a nice exercise in TLA ${ }^{+}$.) However, most specifications make perfect sense for infinite sets of messages or processors, so there is no reason to require these sets to be finite.

### 5.5 What We're Specifying

I wrote above, at the beginning of this section, that we were going to specify a FIFO buffer. Formula Spec of the FIFO module actually specifies a set of behaviors, each representing a sequence of sending and receiving operations on the channels in and out. The sending operations on in are performed by the sender, and the receiving operations on out are performed by the receiver. The sender and receiver are not part of the FIFO buffer; they form its environment.

Our specification describes a system consisting of the FIFO buffer and its environment. The behaviors satisfying formula Spec of module FIFO represent those histories of the universe in which both the system and its environment behave correctly. It's often helpful in

[^4]understanding a specification to indicate explicitly which steps are system steps and which are environment steps. We can do this by defining the next-state action to be
$$
\text { Next } \triangleq \text { SysNext } \vee \text { EnvNext }
$$
where SysNext describes system steps and EnvNext describes environment steps. For the FIFO, we have
\[

$$
\begin{aligned}
& \text { SySNext } \triangleq \text { BufRcv } \vee \text { BufSend } \\
& \text { EnvNext } \triangleq(\exists \text { msg } \in \text { Message }: \text { SSend }(m s g)) \vee R R c v
\end{aligned}
$$
\]

While suggestive, this way of defining the next-state action has no formal significance. The specification Spec equals Init $\wedge \square[N e x t]$... changing the way we structure the definition of Next doesn't change its meaning. If a behavior fails to satisfy Spec, nothing tells us if the system or its environment is to blame.

A formula like Spec, which describes the correct behavior of both the system and its environment, is called a closed-system or complete-system specification. An open-system specification is one that describes only the correct behavior of the system. A behavior satisfies an open-system specification if it represents a history in which either the system operates correctly, or it failed to operate correctly only because its environment did something wrong. How to write open-system specifications is described elsewhere [1].

Open-system specifications are philosophically more satisfying. However, closed-system specifications are a little bit easier to write, and the mathematics underlying them is simpler. So, we almost always write closed-system specifications. It's usually quite easy to turn a closed-system specification into an open-system specification. But in practice, there's little reason to do so.

## 6 A Caching Memory

A memory system consists of a set of processors connected to a memory by some abstract interface, which we label memInt.


In this section we specify what the memory is supposed to do, then we specify a particular implementation of the memory using caches. We begin by specifying the memory interface, which is common to both specifications.

### 6.1 The Memory Interface

The asynchronous interface described in Section 4 uses a handshake protocol. Receipt of a data value must be acknowledged before the next data value can be sent. In the memory interface, we abstract away this kind of detail and represent both the sending of a data value and its receipt as a single step. We call it a Send step if a processor is sending the value to the memory; it's a Reply step if the memory is sending to a processor. Processors do not send values to one another, and the memory sends to only one processor at a time.

We represent the state of the memory interface by the value of the variable memInt. A Send step changes memInt in some way, but we don't want to specify exactly how. The way to leave something unspecified in a specification is to make it a parameter. For example, in the
bounded FIFO of Section 5.4, we left the size of the buffer unspecified by making it a parameter $N$. We'd therefore like to declare a parameter Send so that $\operatorname{Send}(p, d)$ describes how memInt is changed by a step that represents processor $p$ sending data value $d$ to the memory. However, TLA ${ }^{+}$provides only CONSTANT and VARIABLE parameters, not action parameters. ${ }^{9}$ So, we declare Send to be a constant operator and write Send ( $p, d$, memInt, memInt ${ }^{\prime}$ ) instead of $\operatorname{Send}(p, d)$.

In TLA ${ }^{+}$, we declare Send to be a constant operator that takes four arguments by writing

$$
\text { Constant } \operatorname{Send}\left(\__{-},,_{-},{ }_{-}\right)
$$

This means that $\operatorname{Send}(p, d, m i O l d, m i N e w)$ is an expression, for any expressions $p, d$, miOld, and miNew, but it says nothing about what the value of that expression is. We want it to be a Boolean value that is true iff a step in which memInt equals miOld in the first state and miNew in the second state represents the sending by $p$ of value $d$ to the memory. ${ }^{10}$ We can assert that the value is a Boolean by the assumption:

$$
\text { ASSUME } \forall p, d, \text { miOld, miNew }: \operatorname{Send}(p, d, \text { miOld, miNew }) \in \text { BOOLEAN }
$$

This asserts that the formula

$$
\operatorname{Send}(p, d, \text { miOld }, \text { miNew }) \in \text { BOOLEAN }
$$

is true for all values of $p, d$, miOld, and miNew. The built-in symbol boolean denotes the set \{true, false\}, whose elements are the two Boolean values true and false.

This ASSUME statement asserts formally that the value of

$$
\operatorname{Send}(p, d, m i O l d, m i N e w)
$$

is a Boolean. But the only way to assert formally what that value signifies would be to say what it actually equals - that is, to define Send rather than making it a parameter. We don't want to do that, so we just state informally what the value means. This statement is part of the intrinsically informal description of the relation between our mathematical abstraction and a physical memory system.

To allow the reader to understand the specification, we have to describe informally what Send means. The ASSUME statement asserting that $\operatorname{Send}(\ldots)$ is a Boolean is then superfluous as an explanation. However, it could help tools understand the specification, so it's a good idea to include it anyway.

A specification that uses the memory interface can use the operators Send and Reply to specify how the variable memInt changes. The specification must also describe memInt's initial value. We therefore declare a constant parameter InitMemInt that is the set of possible initial values of memInt.

We also introduce three constant parameters that are needed to describe the interface:
Proc The set of processor identifiers. (We usually shorten processor identifier to processor when referring to an element of Proc.)

Adr The set of memory addresses.
Val The set of possible memory values that can be assigned to an address.

[^5]VARIABLE memInt
CONSTANTS $\operatorname{Send}\left(\mathbf{\_}^{\mathbf{-}}, \mathbf{-}, \mathbf{-}, \mathbf{-}\right), \mathrm{A} \operatorname{Send}(p, d$, memInt, memInt') step represents processor $p$ sending value $d$ to the memory.
 memory sending value $d$ to processor $p$.
InitMemInt, The set of possible initial values of memInt.
Proc, $\quad$ The set of of processor identifiers.
$A d r, \quad$ The set of memory addresses.
Val The set memory values.

| ASSUME $\forall p, d, m i O l d, m i N e w:$ | $\wedge \operatorname{Send}(p, d, m i O l d, m i N e w) \in$ BOOLEAN |
| ---: | :--- |
|  | $\wedge \operatorname{Reply}(p, d, m i O l d, m i N e w) \in$ BOOLEAN |


| $M R e q \triangleq$ | $[o p:\{" \mathrm{Rd} "\}, a d r: A d r] \cup[o p:\{$ "Wr" $\}, a d r: A d r, v a l: V a l]$ |
| ---: | :--- |
|  | The set of all requests; a read specifies an address, a write specifies an address and |
|  | a value. |

Figure 7: The Specification of a Memory Interface

Finally, we define the values that the processors and memory send to one another over the interface. A processor sends a request to the memory. We represent a request as a record with an op field that specifies the type of request and additional fields that specify its arguments. Our simple memory allows just read and write requests. A read request has op field "Rd" and an $a d r$ field specifying the address to be read. The set of all read requests is therefore the set

$$
[o p:\{" \mathrm{Rd} "\}, a d r: A d r]
$$

of all records whose op field equals "Rd" (is an element of the set $\{$ " $R d$ " $\}$ whose only element is the string " Rd ") and whose $a d r$ field is an element of $A d r$. A write request must specify the address to be written and the value to write. It is represented by a record with op field equal to "Wr", and with $a d r$ and val fields specifying the address and value. We define MReq, the set of all requests, to equal the union of these two sets. (Set operations, including union, are described in Section 2.2.)

The memory responds to a read request with the memory value it read. It must also respond to a write request; and it seems nice to let the response be different from the response to any read request. We therefore require the memory to respond to a write request by returning a value No Val that is different from any memory value. We could declare NoVal to be a constant parameter and add the assumption NoVal $\notin$ Val. (The symbol $\notin$ is typed in ASCII as \notin.) But it's best, when possible, to avoid introducing parameters. Instead, we define NoVal by:

$$
\text { NoVal } \triangleq \text { CHOOSE } v: v \notin V a l
$$

The expression Choose $x: F$ equals an arbitrarily chosen value $x$ that satisfies the formula $F$. (If no such $x$ exists, the expression has a completely arbitrary value.) This statement defines NoVal to be some value that is not an element of Val. We have no idea what the value of NoVal is; we just know what it isn't—namely, that it isn't an element of Val. The choose operator is discussed in Section 7.6.

The complete memory interface specification is module MemoryInterface in Figure 7.

### 6.2 Functions

A memory assigns values to addresses. The state of the memory is therefore an assignment of elements of Val (memory values) to elements of $A d r$ (memory addresses). In a programming language, such an assignment is called an array of type Val indexed by $A d r$. In mathematics, it's called a function from $A d r$ to Val. Before writing the memory specification, let's look at the mathematics of functions, and how it is described in TLA ${ }^{+}$.

A function $f$ has a domain, written domain $f$, and it assigns to each element $x$ of its domain the value $f[x]$. (Mathematicians write this as $f(x)$, but TLA ${ }^{+}$uses the array notation of programming languages, with square brackets.) Two functions $f$ and $g$ are equal iff they have the same domain and $f[x]=g[x]$ for all $x$ in their domain.

The range of a function $f$ is the set of all elements of the form $f[x]$ with $x$ in domain $f$. For any sets $S$ and $T$, the set of all functions whose domain equals $S$ and whose range is any subset of $T$ is written $[S \rightarrow T$ ].

Ordinary mathematics does not have a convenient notation for writing an expression whose value is a function. TLA ${ }^{+}$defines $[x \in S \mapsto e]$ to be the function $f$ with domain $S$ such that $f[x]=e$ for every $x \in S .{ }^{11}$ For example,

$$
\text { succ } \triangleq[n \in N a t \mapsto n+1]
$$

defines succ to be the successor function on the natural numbers - the function with domain Nat such that $\operatorname{succ}[n]=n+1$ for all $n \in$ Nat.

A record is a function whose domain is a finite set of strings. For example, a record with val, ack, and $r d y$ fields is a function whose domain is the set $\{$ "val", "ack", "rdy"\} consisting of the three strings "val", "ack", and "rdy". The expression r.ack, the ack field of a record $r$, is an abbreviation for $r$ ["ack"]. The record

$$
[v a l \mapsto 42, \text { ack } \mapsto 1, r d y \mapsto 0]
$$

can be written

$$
\begin{aligned}
& {[i \in\{\text { "val", "ack", "rdy" }\} \mapsto} \\
& \text { IF } i=\text { "val" THEN } 42 \text { ELSE IF } i=\text { "ack" THEN } 1 \text { ELSE } 0]
\end{aligned}
$$

The EXCEPT construct for records, explained in Section 4.2, is a special case of a general EXCEPT construct for functions, where !.c is an abbreviation for ! ["c"]. For any function $f$, the expression $[f$ EXCEPT $![c]=e]$ is the function $\hat{f}$ that is the same as $f$ except with $\hat{f}[c]=e$. This function can also be written:

$$
[x \in \operatorname{DOMAIN} f \mapsto \operatorname{IF} \quad x=c \text { THEN } e \text { ELSE } f[x]]
$$

assuming that the symbol $x$ does not occur in any of the expressions $f, c$, and $e$. For example, [succ EXCEPT ! [42] $=86$ ] is the function $g$ that is the same as succ except that $g[42$ ] equals 86 instead of 43.

As in the EXCEPT construct for records, the expression $e$ in
$[f$ EXCEPT $![c]=e]$
can contain the symbol @, where it means $f[c]$. For example,

$$
[\text { succ } \operatorname{EXCEPT}![42]=2 * @]=[\text { succ } \operatorname{EXCEPT}![42]=2 * \operatorname{succ}[42]]
$$

[^6]In general,

$$
\left[f \operatorname{EXCEPT}!\left[c_{1}\right]=e_{1}, \ldots,!\left[c_{n}\right]=e_{n}\right]
$$

is the function $\hat{f}$ that is the same as $f$ except with $\hat{f}\left[c_{i}\right]=e_{i}$ for each $i$. More precisely, this expression equals

$$
\left[\ldots\left[\left[f \text { EXCEPT }!\left[c_{1}\right]=e_{1}\right] \operatorname{EXCEPT}!\left[c_{2}\right]=e_{2}\right] \ldots \operatorname{EXCEPT}!\left[c_{n}\right]=e_{n}\right]
$$

Functions correspond to the arrays of programming languages. The domain of a function corresponds to the index set of an array. The function $[f$ EXCEPT $![c]=e]$ corresponds to the array obtained from $f$ by assigning $e$ to $f[c]$. A function whose range is a set of functions corresponds to an array of arrays. TLA ${ }^{+}$defines $[f$ EXCEPT $![c][d]=e]$ to be the function corresponding to the array obtained by assigning $e$ to $f[c][d]$. It can be written as

$$
[f \text { EXCEPT }![c]=[@ \operatorname{EXCEPT}![d]=e]]
$$

The generalization to $\left[f\right.$ EXCEPT $\left.!\left[c_{1}\right] \ldots\left[c_{n}\right]=e\right]$ for any $n$ should be obvious. Since a record is a function, this notation can be used for records as well. TLA ${ }^{+}$uniformly maintains the notation that $\sigma . c$ is an abbreviation for $\sigma[$ " $c$ "]. For example, this implies:

$$
[f \text { EXCEPT }![c] . d=e]=[f \text { EXCEPT }![c]=[@ \operatorname{EXCEPT}!. d=e]]
$$

The TLA ${ }^{+}$definition of records as functions makes it possible to manipulate them in ways that have no counterparts in programming languages. For example, we can define an operator $R$ such that $R(r, s)$ is the record obtained from $r$ by replacing the value of each field $c$ that is also a field of the record $s$ with s.c. In other words, for every field $c$ of $r$, if $c$ is a field of $s$ then $R(r, s) . c=s . c$; otherwise $R(r, s) . c=r . c$. The definition is:

$$
R(r, s) \triangleq[c \in \text { DOMAIN } r \mapsto \operatorname{IF} c \in \text { DOMAIN } s \text { THEN } s[c] \text { ELSE } r[c]]
$$

All the TLA ${ }^{+}$function constructs have generalizations for functions with multiple arguments. We describe only functions of a single argument here.

### 6.3 A Linearizable Memory Specification

We specify a very simple memory system in which a processor $p$ issues a memory request and then waits for a response before issuing the next request. In our specification, the request is executed by accessing (reading or modifying) a variable mem, which represents the current state of the memory. Because the memory can receive requests from other processors before responding to processor $p$, it matters when mem is accessed. We let the access of mem occur any time between the request and the response. This specifies what is called a linearizable memory [4]. (The more commonly specified sequentially consistent memory [9] allows the access of mem to occur at any time.)

In addition to mem, the specification has the internal variables $c t l$ and $b u f$, where $c t l[p]$ describes the status of processor $p$ 's request and buf $[p]$ contains either the request or the response. Consider the request req that equals

$$
[o p \mapsto " \mathrm{Wr} ", a d r \mapsto a, v a l \mapsto v]
$$

It is a request to write $v$ to memory address $a$, and it generates the response NoVal. The processing of this request is represented by the following three steps:

$$
\begin{aligned}
& {\left[\begin{array}{ll}
\operatorname{ctl}[p] & =\text { "rdy" } \\
b u f[p] & =\cdots \\
\operatorname{mem}[a] & =\cdots
\end{array}\right] \stackrel{\operatorname{Req}(p)}{\longrightarrow}\left[\begin{array}{ll}
\operatorname{ctl}[p] & =\text { "busy" } \\
b u f[p] & =r e q \\
\operatorname{mem}[a] & =\cdots
\end{array}\right] \xrightarrow{\operatorname{Do(p)}}\left[\begin{array}{ll}
c t l[p] & =\text { "done" } \\
\operatorname{buf}[p] & =\text { NoVal } \\
\operatorname{mem}[a] & =v
\end{array}\right]} \\
& \xrightarrow{R s p(p)}\left[\begin{array}{ll}
\operatorname{ctl}[p] & =\text { "rdy" } \\
b u f[p] & =\text { NoVal } \\
\operatorname{mem}[a] & =v
\end{array}\right]
\end{aligned}
$$

A $\operatorname{Req}(p)$ step represents the issuing of a request by processor $p$. It is enabled when $\operatorname{ctl}[p]=$ "rdy"; it sets $c t l[p]$ to "busy" and sets buf $[p]$ to the request. A $D o(p)$ step represents the memory access; it is enabled when $\operatorname{ctl}[p]=$ "busy" and it sets $\operatorname{ctl}[p]$ to "done" and buf $[p]$ to the response. A $R s p(p)$ step represents the memory's response to $p$; it is enabled when $\operatorname{ctl}[p]=$ "done" and it sets $\operatorname{ctl}[p]$ to "rdy".

Writing the specification is a straightforward exercise in representing these changes to the variables in TLA ${ }^{+}$notation. The internal specification, with mem, ctl, and buf visible (free variables), appears in module InternalMemory in Figure 8. The memory specification, which hides the three internal variables, is module Memory in Figure 9.

### 6.4 Tuples as Functions

Before writing our caching memory specification, let's take a closer look at tuples. Recall that $\langle a, b, c\rangle$ is the 3 -tuple with components $a, b$, and $c$. In TLA ${ }^{+}$, this 3 -tuple is actually the function with domain $\{1,2,3\}$ that maps 1 to $a, 2$ to $b$, and 3 to $c$. Thus, $\langle a, b, c\rangle[2]$ equals $b$.

TLA ${ }^{+}$provides the Cartesian product operator $\times$of ordinary mathematics, where $A \times B \times C$ is the set of all 3-tuples $\langle a, b, c\rangle$ such that $a \in A, b \in B$, and $c \in C$. Note that $A \times B \times C$ is different from $A \times(B \times C)$, which is the set of pairs $\langle a, p\rangle$ with $a$ in $A$ and $p$ in the set of pairs $B \times C$.

The Sequences module defines finite sequences to be tuples. Hence, a sequence of length $n$ is a function with domain $1 \ldots n$. In fact, $s$ is a sequence iff it equals $[i \in 1 \ldots \operatorname{Len}(s) \mapsto s[i]]$. Below are a few operator definitions from the Sequences module. (The meanings of the operators are described in Section 5.1.)

$$
\begin{aligned}
\operatorname{Head}(s) \triangleq & s[1] \\
\operatorname{Tail}(s) \triangleq & \triangleq[i \in 1 \ldots(\operatorname{Len}(s)-1) \mapsto s[i+1]] \\
s \circ t & \triangleq \quad[i \in 1 \ldots(\operatorname{Len}(s)+\operatorname{Len}(t)) \mapsto \\
& \quad \operatorname{IF} \quad i \leq \operatorname{Len}(s) \text { THEN } s[i] \operatorname{ELSE} t[i-\operatorname{Len}(s)]]
\end{aligned}
$$

### 6.5 Recursive Function Definitions

We need one more tool to write the caching memory specification: recursive function definitions. Recursively defined functions are familiar to programmers. The classic example is the factorial function, which I'll call fact. It's usually defined by writing:

$$
\text { fact }[n]=\mathrm{IF} n=0 \text { THEN } 1 \text { ELSE } n * \text { fact }[n-1]
$$

for all $n \in$ Nat. The TLA ${ }^{+}$notation for writing functions suggests trying to define fact by

$$
\text { fact } \triangleq[n \in N a t \mapsto \operatorname{IF} n=0 \text { THEN } 1 \text { ELSE } n * \text { fact }[n-1]]
$$

This definition is illegal because the occurrence of fact to the right of the $\triangleq$ is undefined-fact is defined only after its definition.
$\mathrm{TLA}^{+}$does allow the apparent circularity of recursive function definitions. We can define the factorial function fact by:

$$
\text { fact }[n \in N a t] \triangleq \text { IF } n=0 \text { THEN } 1 \text { ELSE } n * \text { fact }[n-1]
$$

In general, a definition of the form $f[x \in S] \triangleq e$ can be used to define recursively a function $f$ with domain $S$. Section 7.3 explains exactly what such a definition means. For now, we will just write recursive definitions without worrying about their meaning.
extends MemoryInterface
variables mem, ctl, buf

IInit $\triangleq$ The initial predicate
$\wedge$ mem $\in[$ Adr $\rightarrow$ Val $]$
$\wedge c t l=[p \in \operatorname{Proc} \mapsto$ "rdy" $]$
$\wedge$ buf $=[p \in$ Proc $\mapsto$ NoVal $]$
$\wedge$ memInt $\in$ InitMemInt

Initially, memory locations have any legal values, each processor is ready to issue requests,
each buf $[p]$ is arbitrarily initialized to NoVal, and memInt is any element of InitMemInt.

TypeInvariant $\triangleq$ The type-correctness invariant.
$\wedge$ mem $\in[$ Adr $\rightarrow$ Val $] \quad m e m$ is a function from Adr to Val.
$\wedge c t l \in[\operatorname{Proc} \rightarrow\{$ "rdy", "busy", "done" $\}] \quad$ ctl $[p]$ equals "rdy", "busy", or "done".
$\wedge b u f \in[\operatorname{Proc} \rightarrow M R e q \cup \operatorname{Val} \cup\{\operatorname{NoVal}\}] \quad$ buf $[p]$ is a request or a response.
$\operatorname{Req}(p) \triangleq \quad$ Processor $p$ issues a request.
$\wedge c t l[p]=$ "rdy" Enabled iff $p$ is ready to issue a request.
$\wedge \exists r e q \in M R e q: \quad$ For some request req:
$\wedge \operatorname{Send}\left(p\right.$, req, memInt, memInt $\left.{ }^{\prime}\right) \quad$ Send req on the interface.
$\wedge b u f^{\prime}=[b u f$ EXCePt $![p]=r e q] \quad$ Set $b u f[p]$ to the request.
$\wedge c t l^{\prime}=[c t l \operatorname{EXCEPT}![p]=$ "busy" $] \quad$ Set $c t l[p]$ to "busy".
$\wedge$ UNCHANGED mem
$D o(p) \triangleq \quad$ Perform $p$ 's request to memory.
$\wedge c t l[p]=$ "busy" $\quad$ Enabled iff $p$ 's request is pending.
$\wedge m e m^{\prime}=\mathrm{IF} \quad$ buf $[p] . o p=" \mathrm{Wr}$ "
THEN [mem EXCEPT Write to memory on a "Wr" request.
$![b u f[p] . a d r]=b u f[p] . v a l]$
ELSE mem Leave mem unchanged on a "Rd" request.
$\wedge b u f^{\prime}=[b u f$ EXCEPT $![p]=\operatorname{IF}$ buf $[p]$. op $=$ "Wr" Set buf $[p]$ to the response:
THEN NoVal NoVal for a write;
ELSE mem $[b u f[p] . a d r]]$ the memory value for a read.
$\wedge c t l^{\prime}=[c t l \operatorname{EXCEPT}![p]=$ "done" $]$
Set $\operatorname{ctl}[p]$ to "done".
$\wedge$ UNCHANGED memInt
$R s p(p) \triangleq$ Return the response to $p$ 's request.
$\wedge c t l[p]=$ "done" Enabled iff request is done but response not sent.
$\wedge \operatorname{Reply}(p, \operatorname{buf}[p]$, memInt, memInt $) \quad$ Send the response on the interface.
$\wedge c t l^{\prime}=[c t l$ EXCEPT ! $[p]=$ "rdy" $] \quad$ Set $c t l[p]$ to "rdy".
$\wedge$ UnChanged $\langle m e m$, buf $\rangle$
INext $\triangleq \exists p \in \operatorname{Proc}: \operatorname{Req}(p) \vee \operatorname{Do}(p) \vee \operatorname{Rsp}(p) \quad$ The next-state action.
ISpec $\triangleq$ Innit $\wedge \square[I N e x t]_{\langle m e m I n t, \text { mem, ctl, buf }\rangle} \quad$ The specification.
THEOREM ISpec $\Rightarrow \square$ TypeInvariant

Figure 8: The internal memory specification.

Extends MemoryInterface
Inner $($ mem $, c t l, b u f) \triangleq$ INSTANCE InternalMemory
Spec $\triangleq \exists$ mem, ctl, buf : Inner (mem, ctl, buf)!ISpec

Figure 9: The memory specification.

### 6.6 A Write-Through Cache

We now specify a simple write-through cache that implements the memory specification. The system is described by the following picture:


Each processor $p$ communicates with a local controller, which maintains three state components: buf $[p]$, ctl $[p]$, and cache $[p]$. The value of cache $[p]$ represents the processor's cache; $b u f[p]$ and $c t l[p]$ play the same role as in the internal memory specification of module InternalMemory. (However, as we will see below, ctl $[p]$ can assume an additional value "waiting".) These local controllers communicate with the main memory mem, ${ }^{12}$ and with one another, over a bus. Requests from the processors to the main memory are in the queue mem $Q$ of maximum length QLen.

A write request by processor $p$ is performed by the action $\operatorname{DoWr}(p)$. This is a writethrough cache, meaning that every write request updates main memory. So, the DoWr $(p)$ action writes the value into cache $[p]$ and adds the write request to the tail of mem $Q$. It also updates cache $[q]$ for any other processor $q$ that has a copy of the address in its cache. When the request reaches the head of $m e m Q$, the action $M e m Q W r$ stores the value in $m e m$.

A read request by processor $p$ is performed by the action $\operatorname{DoRd}(p)$, which obtains the value from the cache. If the value is not in the cache, the action $R d M i s s(p)$ adds the request to the tail of $m e m Q$ and sets $\operatorname{ctl}[p]$ to "waiting". When the enqueued request reaches the head of $\operatorname{mem} Q$, the action $\operatorname{Mem} Q R d$ reads the value and puts it in cache $[p]$, enabling the $D o R d(p)$ action.

We might expect the $\operatorname{MemQRd}$ action to read the value from mem. However, this could cause an error if there is a write to that address enqueued in $m e m Q$ behind the read request. In that case, reading the value from memory could lead to two processors having different values for the address in their caches: the one that issued the read request, and the one that issued the write request that followed the read in $m e m Q$. So, the MemQRd action must read

[^7]the value from the last write to that address in $m e m Q$, if there is such a write; otherwise, it reads the value from mem.

Eviction of an address from processor $p$ 's cache is represented by a separate $\operatorname{Evict}(p)$ action. Since all cached values have been written to memory, eviction does nothing but remove the address from the cache. There is no reason to evict an address until the space is needed, so this action would be executed only when a request for an uncached address is received from $p$ and $p$ 's cache is full. But that's a performance optimization; it doesn't affect the correctness of the algorithm. We allow a cached address to be evicted from $p$ 's cache at any time except if the address was just put there by a MemQRd action for the current request. This is the case when $c t l[p]$ equals "waiting" and buf $[p]$.adr equals the cached address.

The actions $\operatorname{WReq}(p)$ and $\operatorname{WRsp}(p)$, which represent processor $p$ issuing a request and the memory issuing a reply to $p$, are the same as the corresponding actions of the memory specification, except that they also leave the new variables cache and mem $Q$ unchanged.

To specify all these actions, we must decide how the processor caches and the queue of requests to memory are represented by the variables $m e m Q$ and cache. We let mem $Q$ be a sequence of pairs of the form $\langle p$, req $\rangle$, where req is a request and $p$ is the processor that issued it. For any memory address $a$, we let cache $[p][a]$ be the value in $p$ 's cache for address $a$ (the "copy" of $a$ in $p$ 's cache). If $p$ 's cache does not have a copy of $a$, we let cache $[p][a]$ equal NoVal.

The specification appears in module WriteThroughCache in Figures 10 and 11. I'll now go through this specification, explaining some of the finer points and some notation that we haven't encountered before.

The extends, declaration statements, and AsSume are familiar. We can re-use some of the definitions from the InternalMemory module, so an instance statement instantiates a copy of that module. (The parameters of module InternalMemory are instantiated by the parameters of the same name in module WriteThroughCache.)

The initial predicate Init contains the conjunct $M$ !IInit, which asserts that mem, ctl, and buf have the same initial values as in the internal memory specification. The write-through cache allows $\operatorname{ctl}[p]$ to have the value "waiting" that it didn't in the internal memory specification, so we can't re-use the internal memory's type invariant M! TypeInvariant. Formula TypeInvariant therefore explicitly describes the types of mem, ctl, and buf. The type of $m e m Q$ is the set of sequences of $\langle$ processor, request〉 pairs.

The module next defines the predicate Coherence, which asserts the basic cache coherence property of the write-through cache: for any processors $p$ and $q$ and any address $a$, if $p$ and $q$ each has a copy of address $a$ in its cache, then those copies are equal. Note the trick of writing $x \notin\{y, z\}$ instead of the equivalent but longer formula $(x \neq y) \wedge(x \neq z)$.

The state function vmem is used in defining action $\operatorname{MemQRd}$ below. It is defined to equal the value that the main memory mem will have after all the write operations currently in $m e m Q$ have been performed. Recall that the value read by MemQRd must be the most recent one written to that address - a value that may still be in $m e m Q$. That value is the one in vmem. The definition of vmem introduces the TLA ${ }^{+}$LET/IN construct. A LET clause consists of a sequence of definitions, whose scope extends until the end of the in clause. In the definition of vmem, the LET clause defines the function $f$ so that $f[i]$ is the value mem will have after the first $i$ operations in $m e m Q$ have been performed. Note that $m e m Q[i][2]$ is the second component (the request) of $m e m Q[i]$, the $i$ th element in the sequence $m e m Q$.

The actions $\operatorname{Req}(p)$ and $\operatorname{Rsp}(p)$, which represent a processor sending a request and receiving a reply, are essentially the same as the corresponding actions in module InternalMemory. The only difference is that they must specify that the variables cache and mem $Q$, not present in module InternalMemory, are left unchanged.

In the definition of $R d M i s s$, the expression $\operatorname{Append}(\operatorname{mem} Q,\langle p, b u f[p]\rangle)$ is the sequence obtained by appending the element $\langle p, b u f[p]\rangle$ to the end of $m e m Q$.
extends Naturals, Sequences, MemoryInterface
variables mem, ctl, buf, cache, memQ
constant QLen
ASSUME $(Q L e n \in N a t) \wedge(Q L e n>0)$
$M \triangleq$ Instance InternalMemory
Init $\triangleq \quad$ The initial predicate
$\wedge M$ !IInit mem, buf, and $c t l$ are initialized as in the internal memory spec.
$\wedge$ cache $=\quad$ All caches are initially empty (cache $[p][a]=$ No Val for all $p, a)$. $[p \in$ Proc $\mapsto[a \in A d r \mapsto$ NoVal $]]$
$\wedge m e m Q=\langle \rangle \quad$ The queue $m e m Q$ is initially empty.
TypeInvariant $\triangleq$ The type invariant.
$\wedge$ mem $\in[$ Adr $\rightarrow$ Val $]$
$\wedge$ ctl $\in[$ Proc $\rightarrow$ \{"rdy", "busy", "waiting", "done" $\}]$
$\wedge$ buf $\in[$ Proc $\rightarrow$ MReq $\cup \operatorname{Val} \cup\{$ NoVal $\}]$
$\wedge$ cache $\in[$ Proc $\rightarrow[$ Adr $\rightarrow \operatorname{Val} \cup\{$ NoVal $\}]]$
$\wedge m e m Q \in \operatorname{Seq}(\operatorname{Proc} \times M R e q) \quad m e m Q$ is a sequence of $\langle$ proc., request〉pairs.
Coherence $\triangleq$ Asserts that if two processors' caches both have copies of an address, then those
$\forall p, q \in \operatorname{Proc}, a \in A d r: \quad$ copies have equal values.
(NoVal $\notin\{$ cache $[p][a]$, cache $[q][a]\}) \Rightarrow($ cache $[p][a]=$ cache $[q][a])$
$v m e m \triangleq$ The value $m e m$ will have after all the writes in $m e m Q$ are performed.
LET $f[i \in 0 \ldots \operatorname{Len}(m e m Q)] \triangleq$ The value $m e m$ will have after the first IF $i=0$ THEN mem $\quad i$ writes in mem $Q$ are performed. ELSE IF $m e m Q[i][2] . o p=" R d "$ THEN $f[i-1]$
ELSE $[f[i-1] \operatorname{except}![m e m Q[i][2] . a d r]=m e m Q[i][2]$. val $]$
IN $\quad f[\operatorname{Len}(m e m Q)]$
$\operatorname{Req}(p) \triangleq \quad$ Processor $p$ issues a request.
$M!\operatorname{Req}(p) \wedge$ UNCHANGED $\langle$ cache,$m e m Q\rangle$
$R s p(p) \triangleq \quad$ The system issues a response to processor $p$.
$M!R s p(p) \wedge$ UNCHANGED $\langle$ cache, mem $Q\rangle$
$\operatorname{RdMiss}(p) \triangleq$ Enqueue a request to write value from memory to $p$ 's cache.
$\wedge(c t l[p]=$ "busy" $) \wedge(b u f[p] . o p=$ "Rd" $) \quad$ Enabled on a read request when
$\wedge$ cache $[p][b u f[p] . a d r]=$ No Val the address is not in $p$ 's cache
$\wedge \operatorname{Len}($ mem $Q)<Q L e n$
$\wedge m e m Q^{\prime}=\operatorname{Append}(\operatorname{mem} Q,\langle p$, buf $[p]\rangle)$ and $m e m Q$ is not full.
$\wedge c t l^{\prime}=[c t l \operatorname{EXCEPT}![p]=$ "waiting"]
Append $\langle p$, request $\rangle$ to $m e m Q$.
$\wedge$ UnCHANGED $\langle$ memInt, mem, buf, cache〉
$\operatorname{DoRd}(p) \triangleq$ Perform a read by $p$ of a value in its cache.
$\wedge(c t l[p] \in\{$ "busy", "waiting" $\}) \wedge(b u f[p] . o p=$ "Rd" $) \quad$ Enabled if a read
$\wedge$ cache $[p][b u f[p] . a d r] \neq$ NoVal
$\wedge b u f^{\prime}=[b u f$ EXCEPT $![p]=$ cache $[p][b u f[p] . a d r]]$
$\wedge c t l^{\prime}=[c t l \operatorname{EXCEPT}![p]=$ "done" $]$ request is pending and
$\wedge$ UNCHANGED $\langle$ memInt, mem, cache, mem $Q\rangle$ addess is cache.
Get result from cache.
Set $c t l[p]$ to "done".

Figure 10: The write-through cache specification (beginning).
$\operatorname{DoWr}(p) \triangleq$ Write to $p$＇s cache，update other caches，and enqueue memory update．
LET $r \triangleq b u f[p]$ Processor $p$＇s request．
IN $\wedge(c t l[p]=$＂busy＂$) \wedge(r . o p=$＂Wr＂）Enabled if write request pending
$\wedge \operatorname{Len}(m e m Q)<Q L e n \quad$ and $m e m Q$ is not full．
$\wedge$ cache ${ }^{\prime}=$ Update $p$＇s cache and any other cache that has a copy．
$[q \in \operatorname{Proc} \mapsto \mathrm{IF}(p=q) \vee($ cache $[q][r . a d r] \neq$ NoVal $)$ THEN $[$ cache $[q]$ EXCEPT $![r . a d r]=r . v a l]$ ELSE cache $[q]]$
$\wedge m e m Q^{\prime}=\operatorname{Append}(\operatorname{mem} Q,\langle p, r\rangle) \quad$ Enqueue write at tail of memQ．
$\wedge b u f^{\prime}=[b u f$ EXCEPT $![p]=$ NoVal $] \quad$ Generate response.
$\wedge c t l^{\prime}=[c t l \operatorname{EXCEPT}![p]=$＂done＂$] \quad$ Set $c t l$ to indicate request is done．
$\wedge$ UNCHANGED $\langle$ memInt，mem〉
$M e m Q W r \triangleq$ Perform write at head of mem $Q$ to memory．
LET $r \triangleq \operatorname{Head}(\operatorname{mem} Q)[2] \quad$ The request at the head of memQ．
IN $\wedge(\operatorname{mem} Q \neq\langle \rangle) \wedge(r . o p=" W r ") \quad$ Enabled if $\operatorname{Head}(\operatorname{mem} Q)$ a write．
$\wedge \mathrm{mem}^{\prime}=[$ mem EXCEPT ！$[r . a d r]=r . v a l] \quad$ Perform the write to memory．
$\wedge m e m Q^{\prime}=\operatorname{Tail}(m e m Q) \quad$ Remove the write from mem $Q$ ．
$\wedge$ UNCHANGED 〈memInt，mem，buf，ctl，cache〉
MemQRd $\triangleq$ Perform an enqueued read to memory．
LET $p \triangleq \operatorname{Head}(\operatorname{mem} Q)[1] \quad$ The requesting processor．
$r \triangleq \operatorname{Head}(\operatorname{mem} Q)[2] \quad$ The request at the head of $\operatorname{mem} Q$ ．
IN $\wedge(\operatorname{mem} Q \neq\langle \rangle) \wedge(r . o p=$＂Rd＂$)$ Enabled if $\operatorname{Head}(\operatorname{mem} Q)$ is a read．
$\wedge m e m Q^{\prime}=\operatorname{Tail}(m e m Q) \quad$ Remove the head of $m e m Q$.
$\wedge$ cache $^{\prime}=\quad$ Put value from memory or memQ $[$ cache EXCEPT $![p][r . a d r]=v m e m[r . a d r]] \quad$ in $p$＇s cache．
$\wedge$ UnCHANGED $\langle$ memInt，mem，buf，ctl $\rangle$
$\operatorname{Evict}(p, a) \triangleq$ Remove address $a$ from $p$＇s cache．
$\wedge(c t l[p]=$＂waiting＂$) \Rightarrow(b u f[p] . a d r \neq a) \quad$ Can＇t evict $a$ if it was just read
$\wedge$ cache $^{\prime}=[$ cache EXCEPT $![p][a]=$ NoVal $] \quad$ into cache from memory．
$\wedge$ UnCHANGED $\langle$ memInt，mem，buf，ctl，mem $Q\rangle$
$\begin{aligned} \text { Next } \triangleq \vee \exists p \in \operatorname{Proc}: & \vee \operatorname{Req}(p) \vee \operatorname{Rsp}(p) \\ & \vee \operatorname{RdMiss}(p) \vee \operatorname{DoRd}(p) \vee \operatorname{DoWr}(p) \\ & \vee \exists a \in \operatorname{Adr}: \operatorname{Evict}(p, a) \\ \vee \operatorname{MemQWr} \vee & \operatorname{MemQRd}\end{aligned}$
Spec $\triangleq$ Init $\wedge \square[N e x t]_{\langle\text {memInt，mem，buf，ctl，cache，memQ }\rangle}$
THEOREM Spec $\Rightarrow \square($ TypeInvariant $\wedge$ Coherence）
$L M \triangleq$ instance Memory The memory spec．with internal variables hidden．
THEOREM Spec $\Rightarrow L M!$ Spec $\quad$ Formula Spec implements the memory spec．
Figure 11：The write－through cache specification（end）．

The $\operatorname{DoRd}(p)$ action represents the performing of the read from $p$ 's cache. If $\operatorname{ctl}[p]=$ "busy", then the address was originally in the cache. If $c t l[p]=$ "waiting", then the address was just read into the cache from memory.

The $\operatorname{DoWr}(p)$ action writes the value to $p$ 's cache and updates the value in any other caches that have copies. It also enqueues a write request in $m e m Q$. In an implementation, the request is put on the bus, which transmits it to the other caches and to the $m e m Q$ queue. In our high-level view of the system, we represent all this as a single step.

In the definition of $D o W r$, the LET clause defines $r$ to equal buf $[p]$ within the in clause. Observe that the definition of $r$ contains the parameter $p$ of the definition of DoWr. Hence, we could not move the definition of $r$ outside the definition of DoWr.

A definition in a LET is just like an ordinary definition in a module; in particular, it can have parameters. These local definitions can be used to shorten an expression by replacing common subexpressions with an operator. In the definition of DoWr , I replaced five instances of $b u f[p]$ by the single symbol $r$. This was a silly thing to do, because it makes almost no difference in the length of the definition and it requires the reader to remember the definition of the new symbol $r$. But using a Let to eliminate common subexpressions can often greatly shorten and simplify an expression.

A Let can also be used to make an expression easier to read, even if the operators it defines appear only once in the IN expression. We write a specification with a sequence of definitions, instead of just defining a single monolithic formula, because a formula is easier to understand when presented in smaller chunks. The LET construct allows the process of splitting a formula into smaller parts to be done hierarchically. A LET can appear as a subexpression of an in expression. Nested lets are common in large, complicated specifications.

Returning to module WriteThroughCache, we next come to the two actions MemQWr and $M e m Q R d$. They represent the processing of the request at the head of the $m e m Q$ queue$M e m Q W r$ for a write request, and $M e m Q R d$ for a read request. These actions also use a LET to make local definitions. Here, the definitions of $p$ and $r$ could be moved before the definition of $\operatorname{MemQWr}$. In fact, we could save space by replacing the two local definitions of $r$ with one global (within the module) definition. However, making the definition of $r$ global in this way would be somewhat distracting, since $r$ is used only in the definitions of $M e m Q W r$ and $M e m Q R d$. It might be better instead to combine these two actions into one. Whether you put a definition into a LET or make it more global should depend on what makes the specification easier to read. Writing specifications is a craft whose mastery requires talent and hard work.

The $\operatorname{Evict}(p, a)$ action represents the operation of removing address $a$ from processor $p$ 's cache. As explained above, we allow an address to be evicted at any time - unless the address was just written to satisfy a pending read request, which is the case iff $\operatorname{ctl}[p]=$ "waiting" and $b u f[p] . a d r=a$. Note the use of the "double subscript" in the EXCEPT expression of the action's second conjunct. This conjunct "assigns NoVal to cache $[p][a]$ ". If address $a$ is not in $p$ 's cache, then cache $[p][a]$ already equals $\operatorname{NoVal}$ and an $\operatorname{Evict}(p, a)$ step is a stuttering step.

The definitions of the next-state action Next and of the complete specification Spec are straightforward. The module closes with two theorems that are discussed below.

### 6.7 Invariance

Module WriteThroughCache contains the theorem

$$
\text { THEOREM } \text { Spec } \Rightarrow \square(\text { TypeInvariant } \wedge \text { Coherence })
$$

which asserts that TypeInvariant $\wedge$ Coherence is an invariant of Spec. A state predicate $P \wedge Q$ is always true iff both $P$ and $Q$ are always true, so $\square(P \wedge Q)$ is equivalent to $\square P \wedge \square Q$. This implies that the theorem above is equivalent to the two theorems:

THEOREM Spec $\Rightarrow \square$ TypeInvariant
THEOREM $S p e c \Rightarrow \square$ Coherence

The first theorem is the usual type-invariance assertion. The second, which asserts that Coherence is an invariant of Spec, expresses an important property of the algorithm.

Although TypeInvariant and Coherence are both invariants of the temporal formula Spec, they differ in a fundamental way. If $s$ is any state satisfying TypeInvariant, then any state $t$ such that $s \rightarrow t$ is a Next step also satisfies TypeInvariant. This property is expressed by:

## THEOREM TypeInvariant $\wedge$ Next $\Rightarrow$ TypeInvariant ${ }^{\prime}$

(Recall that TypeInvariant ${ }^{\prime}$ is the formula obtained by priming all the variables in formula TypeInvariant.) In general, when $P \wedge N \Rightarrow P^{\prime}$ holds, we say that predicate $P$ is an invariant of action $N .{ }^{13}$ Predicate TypeInvariant is an invariant of Spec because it is an invariant of Next and it is implied by the initial predicate Init.

Predicate Coherence is not an invariant of the next-state action Next. Suppose $s$ is a state in which

- $\operatorname{cache}[p 1][a]=1$
- cache $[q][b]=$ NoVal, for all $\langle q, b\rangle$ different from $\langle p 1, a\rangle$
- $\operatorname{mem}[a]=2$
- mem $Q$ contains the single element $\langle p 2,[o p \mapsto " R d ", a d r \mapsto a]\rangle$
for two different processors $p 1$ and $p 2$ and some address $a$. Then Coherence is true in state $s$. Let $t$ be the state obtained from $s$ by taking a MemQRd step. In state $t$, we have cache $[p 2][a]=2$ and cache $[p 1][a]=1$, so Coherence is false. Hence Coherence is not an invariant of the next-state action.

Coherence is an invariant of formula Spec because states like $s$ cannot occur in a behavior satisfying Spec. Proving its invariance is not so easy. We must find a predicate Inv that is an invariant of Next such that Inv implies Coherence and is implied by the initial predicate Init.

Important properties of a specification can often be expressed as invariants. Proving that a state predicate $P$ is an invariant of a specification means proving a formula of the form

$$
\text { Init } \wedge \square[N e x t]_{v} \Rightarrow \square P
$$

This is done by finding an appropriate state predicate Inv and proving

$$
\text { Init } \Rightarrow \text { Inv, } \quad \text { Inv } \wedge[N e x t]_{v} \Rightarrow I n v^{\prime}, \quad \text { Inv } \Rightarrow P
$$

Since our subject is specification, not proof, I won't discuss how to find Inv.

### 6.8 Proving Implementation

Module WriteThroughCache ends with the theorem

$$
\text { THEOREM } S p e c \Rightarrow L M!\text { Spec }
$$

where LM!Spec is formula Spec of module Memory. By definition of this formula, we can restate the theorem as

$$
\text { THEOREM Spec } \Rightarrow \exists \text { mem, ctl, buf : LM ! Inner (mem, ctl, buf)!ISpec }
$$

where LM!Inner(mem, ctl,buf)!ISpec is formula ISpec of the InnerMemory module. The rules of logic tell us that to prove such a theorem, we must find "witnesses" for the quantified

[^8]variables mem, ctl, and buf. These witness are state functions (ordinary expressions with no primes), which I'll call omem, octl, and obuf, that satisfy:
\[

$$
\begin{equation*}
\text { Spec } \Rightarrow \text { LM!Inner }(\text { omem, octl, obuf })!\text { ISpec } \tag{5}
\end{equation*}
$$

\]

The tuple $\langle$ omem, octl, obuf $\rangle$ of witness functions is called a refinement mapping, and we describe (5) as the assertion that Spec implements formula ISpec (of module InnerMemory) under this refinement mapping. Intuitively, this means Spec implies that the value of the tuple of state functions $\langle$ memInt, omem, octl, obuf $\rangle$ changes the way ISpec asserts that the tuple of variables $\langle$ memInt, mem, ctl, buf $\rangle$ should change.

I will now briefly describe how we prove (5); for details, see [8]. Let me first introduce a bit of non-TLA ${ }^{+}$notation. For any formula $F$ of module InnerMemory, let $\bar{F}$ equal $L M!$ Inner (omem, octl, obuf)! $F$, which is formula $F$ with omem, octl, and obuf substituted for mem, ctl, and buf. In particular, $\overline{m e m}, \overline{c t l}$, and $\overline{b u f}$ equal omem, octl, and obuf, respectively.

Replacing Spec and ISpec by their definitions transforms (5) to

$$
\begin{aligned}
& \text { Init } \wedge \square[\text { Next }]_{\langle\text {memInt }, \text { mem, buf }, \text { ctll, cache, mem } Q\rangle} \\
& \left.\quad \Rightarrow \overline{\text { Init }} \wedge \square[\overline{\text { INext }}]_{\langle\text {memInt },}, \overline{m e m}, \overline{c l l}, \overline{b u f}\right\rangle
\end{aligned}
$$

This is proved by finding an invariant Inv of Spec such that

$$
\begin{aligned}
& \wedge \text { Init } \Rightarrow \overline{\text { Innit }} \\
& \wedge \text { Inv } \wedge \text { Next } \Rightarrow \vee \overline{\text { INext }} \\
& \vee \text { UNCHANGED }\langle\text { memInt, } \overline{m e m}, \overline{c t l}, \overline{b u f}\rangle
\end{aligned}
$$

The second conjunct is called step simulation. It asserts that a Next step starting in a state satisfying the invariant $I n v$ is either an $\overline{I N e x t}$ step-a step that changes the 4 -tuple $\langle$ memInt, omem, octl, obuf $\rangle$ the way an INext step changes $\langle$ memInt, mem, ctl, buf $\rangle$-or else it leaves that 4 -tuple unchanged.

The mathematics of an implementation proof is simple, so the proof is straightforward-in theory. For specifications of real systems, such proofs can be quite difficult. Going from the theory to practice requires turning the mathematics of proofs into an engineering disciplinea subject that deserves a book to itself. However, when writing specifications, it helps to understand refinement mappings and step simulation.

We now return to the question posed in Section 4.2: what is the relation between the specifications of the asynchronous interface in modules AsynchInterface and Channel? Recall that module AsynchInterface describes the interface in terms of the three variables val, rdy, and ack, while module Channel describes it with a single variable chan whose value is a record with val, rdy, and ack components. In what sense are those two specifications of the interface equivalent?

One answer that now suggests itself is that each of the specifications should implement the other under a refinement mapping. We expect formula Spec of module Channel to imply the formula obtained from Spec of module AsynchInterface by substituting for its variables $v a l, r d y$, and ack the val, rdy, and ack components of the variable chan of module Channel. This assertion is expressed precisely by the theorem in the following module.

| EXTENDS Channel |
| :--- |
| AInt (val, rdy, ack $) \triangleq$ InsTANCE AsynchInterface ChannelImplAsynch |
| THEOREM Spec $\Rightarrow$ AInt(chan.val, chan.rdy, chan.ack)!Spec |

In this example, the refinement mapping substitutes $\langle$ chan.val, chan.rdy, chan.ack $\rangle$ for the tuple $\langle v a l, r d y, a c k\rangle$ of variables in the formula Spec of module AsynchInterface.

Similarly, formula Spec of module AsynchInterface implies formula Spec of module Channel with chan replaced by the record-valued expression:

$$
[v a l \mapsto v a l, r d y \mapsto r d y, a c k \mapsto a c k]
$$

(The first val in val $\mapsto$ val is the field name in the record constructor, while the second val is the variable of module AsynchInterface.)

## 7 Some More Math

Our mathematics is built on a small, simple collection of concepts. You've already seen most of what's needed to describe almost any kind of mathematics. All you lack are a handful of operators on sets that are described below in Section 7.1. After learning about them, you will be able to define all the data structures and operations that occur in specifications.

While our mathematics is simple, its foundations are nonobvious-for example, the meanings of recursive function definitions and the CHOOSE operator are subtle. This section discusses some of those foundations. Understanding them will help you use mathematics more effectively.

### 7.1 Sets

The simple operations on sets described in Section 2.2 are all you'll need for writing most system specifications. However, you may occasionally have to use more sophisticated operatorsespecially if you need to define data structures beyond tuples, records, and simple functions.

Two powerful operators of set theory are the unary operators UNION and SUBSET, defined as follows. ${ }^{14}$

Union $S$ The union of the elements of $S$. In other words, a value $e$ is an element of union $S$ iff it is an element of an element of $S$. For example:

$$
\text { UNION }\{\{1,2\},\{2,3\},\{3,4\}\}=\{1,2,3,4\}
$$

SUBSET $S$ The set of all subsets of $S$. In other words, $T \in \operatorname{SUBSET} S$ iff $T \subseteq S$. For example:

$$
\operatorname{SubSET}\{1,2\}=\{\{ \},\{1\},\{2\},\{1,2\}\}
$$

Mathematicians often describe a set as "the set of all ... such that ...". TLA ${ }^{+}$has two constructs that formalize such a description:
$\{x \in S: P\}$ The subset of $S$ consisting of all elements $x$ satisfying property $P$. For example, the set of odd natural numbers can be written $\{n \in N a t: n \% 2=1\}$. (The modulus operator $\%$ is described in Section 3.5.) The identifier $x$ is bound in $P$; it may not occur in $S$.
$\{e: x \in S\}$ The set of elements of the form $e$, for all $x$ in the set $S$. For example, $\{2 * n+1: n \in N a t\}$ is the set of all odd natural numbers. The identifier $x$ is bound in $e$; it may not occur in $S$.

[^9]The construct $\{e: x \in S\}$ has the same generalizations as $\exists x \in S: F$. For example,

$$
\{e: x \in S, y \in T\}
$$

is the set of all elements of the form $e$, for $x$ in $S$ and $y$ in $T$. In the construct $\{x \in S: P\}$, we can let $x$ be a tuple. For example, $\{\langle y, z\rangle \in S: P\}$ is the set of all pairs $\langle y, z\rangle$ in the set $S$ that satisfy $P$.

All the set operators we've seen so far are built-in operators of TLA ${ }^{+}$. There is also a FiniteSets module that defines two operators:
$\operatorname{Cardinality}(S)$ The number of elements in set $S$, if $S$ is a finite set.
IsFiniteSet ( $S$ ) True iff $S$ is a finite set.
Careless reasoning about sets can lead to problems. The classic example of this is Russell's paradox:

Let $\mathcal{R}$ be the set of all sets $S$ such that $S \notin S$. The definition of $\mathcal{R}$ implies that $\mathcal{R}$ is an element of $\mathcal{R}$ iff $\mathcal{R}$ is not an element of $\mathcal{R}$ is true.

Obviously, $\mathcal{R}$ can't both be and not be an element of $\mathcal{R}$. The source of the paradox is that $\mathcal{R}$ isn't a set. There's no way to write it in TLA ${ }^{+}$. Intuitively, $\mathcal{R}$ is too big to be a set. A collection $\mathcal{C}$ is too big to be a set if it is as big as the collection of all sets - meaning that we can assign to every set a different element of $\mathcal{C}$. More precisely, $\mathcal{C}$ is too big to be a set if we can define an operator SMap such that:

- $\operatorname{SMap}(S)$ is in $\mathcal{C}$, for any set $S$.
- If $S$ and $T$ are two different sets, then $\operatorname{SMap}(S) \neq \operatorname{SMap}(T)$.

For example, the collection of all sequences of length 2 is too big to be a set; we can define the operator SMap by

$$
S M a p(S) \triangleq\langle 1, S\rangle
$$

### 7.2 Silliness

Most modern programming languages introduce some form of type checking to prevent you from writing silly expressions like $3 /$ "abc". TLA ${ }^{+}$is based on the usual formalization of mathematics, which doesn't have types. In an untyped formalism, every syntactically wellformed expression has a meaning - even a silly expression like $3 /$ "abc". Mathematically, the expression $3 /$ "abc" is no sillier than the expression $3 / 0$, and mathematicians implicitly write that silly expression all the time. For example, consider the valid formula

$$
\forall x \in \text { Real }:(x \neq 0) \Rightarrow(x *(3 / x)=3)
$$

where Real is the set of all real numbers. This asserts that $(x \neq 0) \Rightarrow(x *(3 / x)=3)$ is true for all real numbers $x$. Substituting 0 for $x$ yields the valid formula $(0 \neq 0) \Rightarrow(0 *(3 / 0)=3)$ that contains the silly expression $3 / 0$. It's valid because $0 \neq 0$ equals False, and false $\Rightarrow P$ is true for any formula $P$.

A correct formula can contain silly expressions. However, the validity of a correct formula cannot depend on the meaning of a silly expression. If an expression is silly, then its meaning is probably unspecified. The definitions of $0,3, /$, and $*$ (which are in the standard module Reals) don't specify the value of $0 *(3 / 0)$, so there's no way of knowing whether that value equals 3 .

No sensible syntactic rules can prevent you from writing $3 / 0$ without also preventing you from writing perfectly reasonable expressions. The typing rules of programming languages introduce complexity and limitations on what you can write that don't exist in ordinary mathematics. In a well-designed programming language, the costs of types are balanced by benefits: types allow a compiler to produce more efficient code, and type checking catches errors. For programming languages, the benefits seem to outweigh the costs. For writing specifications, I have found that the costs outweigh the benefits.

If you're used to the constraints of programming languages, it may be a while before you start taking advantage of the freedom afforded by mathematics. At first, you won't think of defining anything like the operator $R$ defined in Section 6.2 , which couldn't be written in a typed programming language.

Mathematically, what a specification means is equivalent to what the true assertions about it are. Assertions are expressed with Boolean operators, so we need to pay specific attention to silly expressions written with those operators - expressions like $5 \wedge$ "abc". What does such an expression mean? There are several ways of answering this question, but there are two principle ones that I call the conservative and liberal views.

In the conservative view, the value of an expression like $2 \wedge$ "abc" is completely unspecified. It could equal $\sqrt{2}$. It need not equal "abc" $\wedge 2$. Hence, the ordinary laws of logic, such as the commutativity of $\wedge$, are valid only for Boolean values.

In the liberal interpretation, the value of $2 \wedge$ "abc" is a Boolean. It equals either TRUE or false, but we don't know which. However, all the ordinary laws of logic, such as the commutativity of $\wedge$, are valid. Hence, $2 \wedge$ "abc" equals "abc" $\wedge 2$. More precisely, any tautology of propositional or predicate logic, such as

$$
(\forall x: p) \equiv \neg(\exists x: \neg p)
$$

is valid, even for nonBoolean values. ${ }^{15}$
The semantics of TLA ${ }^{+}$assert that the rules of the conservative view are valid. The liberal view is neither required nor forbidden. You should write specifications that make sense under the conservative view. However, you (and the implementer of a tool) are free to use the liberal interpretation if you wish.

### 7.3 Recursive Function Definitions Revisited

Section 6.5 introduced recursive function definitions. Let's now examine what such definitions mean mathematically. Mathematicians usually define the factorial function fact by writing:

$$
\text { fact }[n]=\text { IF } n=0 \text { THEN } 1 \text { ELSE } n * \text { fact }[n-1], \text { for all } n \in N a t
$$

This definition can be justified by proving that it defines a unique function fact with domain Nat. In other words, fact is the unique value satisfying:
(6) $\quad$ fact $=[n \in N a t \mapsto$ IF $n=0$ THEN 1 ELSE $n * f a c t[n-1]]$

The choose operator, introduced in Section 6.1, allows us to express "the value satisfying property $P$ " as Choose $x: P$. We can therefore define fact as follows to be the value satisfying (6):
(7) fact $\triangleq$ Choose fact : fact $=[n \in N a t \mapsto \mathrm{IF} n=0$ THEN 1 ELSE $n *$ fact $[n-1]]$
(Since the symbol fact is not yet defined in the expression to the right of the $\triangleq$, we can use it as the bound identifier in the choose expression.) The TLA ${ }^{+}$definition

$$
\text { fact }[n \in N a t] \triangleq \text { IF } n=0 \text { THEN } 1 \text { ELSE } n * \text { fact }[n-1]
$$

[^10]is simply an abbreviation for (7). In general, $f[x \in S] \triangleq e$ is an abbreviation for:
\[

$$
\begin{equation*}
f \triangleq \text { choose } f: f=[x \in S \mapsto e] \tag{8}
\end{equation*}
$$

\]

TLA ${ }^{+}$allows you to write silly definitions. For example, you can write

$$
\begin{equation*}
\operatorname{circ}[n \in N a t] \triangleq \text { Choose } y: y \neq \operatorname{circ}[n] \tag{9}
\end{equation*}
$$

This appears to define $\operatorname{circ}$ to be a function such that $\operatorname{circ}[n] \neq \operatorname{circ}[n]$ for any natural number $n$. There obviously is no such function, so circ can't be defined to equal it. A recursive function definition doesn't necessarily define a function. If there is no $f$ that equals $[x \in S \mapsto e]$, then (8) defines $f$ to be some unspecified value. Thus, the nonsensical definition (9) defines circ to be some unknown value.

If we want to reason about a function $f$ defined by $f[x \in S] \triangleq e$, we need to prove that there exists an $f$ that equals $[x \in S \mapsto e]$. The existence of $f$ is obvious if $f$ does not occur in $e$. If it does, so this is a recursive definition, then there is something to prove. Since I'm not discussing proofs, I won't describe how to prove it. Intuitively, you have to check that, as in the case of the factorial function, the definition uniquely determines the value of $f[x]$ for every $x$ in $S$.

Recursion is a common programming technique because programs must compute values using a small repertoire of simple elementary operations. It's not used so often in mathematical definitions, where we needn't worry about how to compute the value and can use the powerful operators of logic and set theory. For example, the operators Head, Tail, and $\circ$ are defined in Section 6.4 without recursion, even though computer scientists usually define them recursively. Still, there are some things that are best defined inductively, using a recursive function definition.

### 7.4 Functions versus Operators

Consider these definitions, which we've seen before

$$
\begin{aligned}
& \operatorname{Tail}(s) \triangleq[i \in 1 \ldots(\operatorname{Len}(s)-1) \mapsto s[i+1]] \\
& \operatorname{fact}[n \in \operatorname{Nat}] \triangleq \mathrm{IF} n=0 \text { THEN } 1 \text { ELSE } n * \text { fact }[n-1]
\end{aligned}
$$

They define two very different kinds of objects: fact is a function, and Tail is an operator. Functions and operators differ in a few basic ways.

Their most obvious difference is that a function like fact by itself is a complete expression that denotes a value, but an operator like Tail is not. Both fact $[n] \in S$ and fact $\in S$ are syntactically correct expressions. But, while $\operatorname{Tail}(n) \in S$ is syntactically correct, Tail $\in S$ is not. It is gibberish-a meaningless string of symbols, like $x+=0$.

Their second difference is more profound. The definition of Tail defines Tail(s) for all values of $s$. For example, it defines $\operatorname{Tail}(1 / 2)$ to equal

$$
\begin{equation*}
[i \in 1 \ldots(\operatorname{Len}(1 / 2)-1) \mapsto(1 / 2)[i+1]] \tag{10}
\end{equation*}
$$

We have no idea what this expression means, because we don't know what $\operatorname{Len}(1 / 2)$ or $(1 / 2)[i+1]$ mean. But, whatever (10) means, it equals Tail $(1 / 2)$.

The definition of fact defines fact $[n]$ only for $n \in$ Nat. It tells us nothing about the value of fact $[1 / 2]$. The expression fact [1/2] is syntactically well-formed, so it denotes some value. But the definition of fact tells us nothing about what that value is.

Unlike an operator, a function must have a domain, which is a set. We cannot define a function Tail so that Tail $[s]$ is the tail of any nonempty sequence $s$; the domain of such a function would have to include all nonempty sequences, and the collection of all such sequences is too big to be a set. (The operator SMap defined by $\operatorname{SMap}(S) \triangleq\langle S\rangle$ maps every set to a different nonempty sequence.) Hence, we can't define Tail to be a function.

Unlike a function, an operator cannot be defined recursively. However, we can usually transform an illegal recursive operator definition into a nonrecursive one using a recursive function definition. For example, let's try to define the Cardinality operator on the set of finite sets. (Recall that the cardinality of a finite set $S$ is the number of elements in $S$.) The collection of all finite sets is too big to be a set. (The operator $\operatorname{SMap}(S) \triangleq\{S\}$ maps every set $S$ to a different set $\{S\}$ of cardinality 1.) The Cardinality operator has a simple intuitive definition:

- $\operatorname{Cardinality~}(\})=0$.
- If $S$ is a nonempty finite set, then

$$
\operatorname{Cardinality}(S)=1+\operatorname{Cardinality}(S \backslash\{x\})
$$

where $x$ is an arbitrary element of $S$. (The set $S \backslash\{x\}$ contains all the elements of $S$ except $x$.)

Using the choose operator to describe an arbitrary element of $S$, we can write this as the more formal-looking, but still illegal, definition:

$$
\begin{aligned}
& \operatorname{Cardinality}(S) \triangleq \operatorname{IF} S=\{ \} \text { THEN } 0 \\
& \text { ELSE } 1+\operatorname{Cardinality}(S \backslash\{\text { Choose } x: x \in S\})
\end{aligned}
$$

This definition is illegal because it's circular-only in a recursive function definition can the symbol being defined appear to the right of the $\triangleq$.

To turn this into a legal definition, observe that, for a given set $S$, we can define a function $C S$ such that $C S[T]$ equals the cardinality of $T$ for every subset $T$ of $S$. The definition is

$$
\begin{aligned}
C S[T \in \operatorname{SUBSEt} S] \triangleq \operatorname{IF} T=\{ \} & \begin{array}{l}
\text { THEN } 0 \\
\\
\quad \operatorname{ELSE} 1+C S[T \backslash\{\operatorname{ChOOSE} x: x \in T\}]
\end{array}
\end{aligned}
$$

Since $S$ is a subset of itself, this defines $C S[S]$ to equal $\operatorname{Cardinality}(S)$, if $S$ is a finite set. (We don't know or care what $C S[S]$ equals if $S$ is not finite.) So, we can define the Cardinality operator by:

$$
\begin{aligned}
& \operatorname{Cardinality}(S) \triangleq \operatorname{LET} \operatorname{CS}[T \in \operatorname{SUBSET} S] \triangleq \\
& \text { IF } T=\{ \} \text { THEN } 0 \\
& \text { ELSE } 1+C S[T \backslash\{\text { choose } x: x \in T\}] \\
& \text { IN } \quad C S[S]
\end{aligned}
$$

Operators also differ from functions in that an operator can take an operator as an argument. For example, we can define an operator IsPartialOrder so that IsPartialOrder $(R, S)$ equals true iff the operator $R$ defines an irreflexive partial order ${ }^{16}$ on $S$. The definition is

$$
\begin{aligned}
\operatorname{IsPartialOrder}(R(-,-), S) \triangleq & \wedge \forall x, y, z \in S: R(x, y) \wedge R(y, z) \Rightarrow R(x, z) \\
& \wedge \forall x \in S: \neg R(x, x)
\end{aligned}
$$

We could also use an infix-operator symbol like $\prec$ instead of $R$ as the parameter of the definition, writing:

$$
\begin{aligned}
\text { IsPartialOrder }\left(\_\prec-, S\right) \triangleq & \wedge \forall x, y, z \in S:(x \prec y) \wedge(y \prec z) \Rightarrow(x \prec z) \\
& \wedge \forall x \in S: \neg(x \prec x)
\end{aligned}
$$

[^11]The first argument of IsPartialOrder is an operator that takes two arguments; its second argument is an expression. Since $>$ is an operator that takes two arguments, the expression IsPartialOrder $(>, N a t)$ is syntactically correct. In fact, it's valid-if $>$ is defined to be the usual operator on numbers. The expression $\operatorname{IsPartialOrder}(+, 3)$ is also syntactically correct, but it's silly and we have no idea whether or not it's valid.

The last difference between operators and functions has nothing to do with mathematics and is an idiosyncrasy of $\mathrm{TLA}^{+}$: the language doesn't permit you to define infix functions. So, if we want to define /, we have no choice but to make it an operator.

When defining an object $V$, you may have to decide whether to make $V$ an operator or a function. The differences between operators and functions will often determine the decision. For example, if a variable may have $V$ as its value, then $V$ must be a function. Thus, in the memory specification of Section 6.3 , we had to represent the state of the memory by a function rather than an operator, since the variable mem couldn't equal an operator. If these differences don't determine whether to use an operator or a function, then it's a matter of taste. I usually prefer operators.

### 7.5 Using Functions

Consider the following two formulas:

$$
\begin{align*}
& f^{\prime}=[i \in N a t \mapsto i+1]  \tag{11}\\
& \forall i \in N a t: f^{\prime}[i]=i+1 \tag{12}
\end{align*}
$$

These formulas both imply that $f^{\prime}[i]=i+1$ for every natural number $i$, but they are not equivalent. Formula (11) uniquely determines $f^{\prime}$, asserting that it's a function with domain Nat. Formula (12) is satisfied by lots of different values of $f^{\prime}$-for example, by the function

$$
[i \in \text { Real } \mapsto \mathrm{IF} \quad i \in N a t \text { THEN } i+1 \text { ELSE } \sqrt{i}]
$$

In fact, from (12), we can't even deduce that $f^{\prime}$ is a function. Formula (11) implies formula (12), but not vice-versa.

When writing specifications, we almost always want to specify the new value of a variable $f$ rather than the new values of $f[i]$ for all $i$ in some set. We therefore usually write (11) rather than (12),

### 7.6 Choose

The ChOOSE operator ${ }^{17}$ was introduced in the memory interface of Section 6.1 in the simple idiom choose $v: v \notin S$, which is an expression whose value is not an element of $S$. In Section 7.3 above, we saw that it is a powerful tool that can be used in rather subtle ways.

The most common use for the choose operator is to "name" a uniquely specified value. For example, one possible definition of division on the set Real of real numbers is:

$$
r / s \triangleq \text { Choose } v \in \operatorname{Real}:(v * s=r)
$$

(The expression choose $x \in S: e$ means choose $x:(x \in S) \wedge e$.) If $r$ is a nonzero real number, then there is no real number $v$ such that $v * 0=r$. Therefore, $r / 0$ has an unspecified value. We don't know what a real number times a string equals, so we cannot say whether or not there is a real number $v$ such that $v *$ "abc" equals $r$. Hence, we don't know what the value of $r / " a b c$ " is.

People who do a lot of programming and not much mathematics often think that choose must be a nondeterministic operator. In mathematics, there is no such thing as a nondeterministic operator or a nondeterministic function. If some expression equals 42 today, then

[^12]it will equal 42 tomorrow, and it will still equal 42 a million years from tomorrow. The specification
$$
(x=\text { Choose } n: n \in \text { Nat }) \wedge \square\left[x^{\prime}=\text { Choose } n: n \in N a t\right]_{x}
$$
allows only a single behavior-one in which $x$ always equals the number choose $n: n \in$ Nat, some particular, unspecified natural number. It is very different from the specification
$$
(x \in N a t) \wedge \square\left[x^{\prime} \in N a t\right]_{x}
$$
that allows all behaviors in which $x$ is always a natural number-possibly a different number in each state. This specification is highly nondeterministic, allowing lots of different behaviors.

## 8 Liveness and Fairness

The specifications we have written so far say what a system must not do. The clock must not advance from 11 to 9 ; the receiver must not receive a message if the FIFO is empty. They don't require that the system ever do anything. The clock need never tick; the sender need never send any messages. Our specifications have described what are called safety properties. If a safety property is violated, it is violated at some particular point in the behavior-by a step that advances the clock from 11 to 9 , or that reads the wrong value from memory.

We now learn how to specify that something does happen: the clock keeps ticking; a value is eventually read from memory. We specify liveness properties, which cannot be violated at any particular instant. Only by examining an entire infinite behavior can we tell that the clock has stopped ticking, or that a message is never sent.

### 8.1 Temporal Formulas

To specify liveness properties we must learn to express them as temporal formulas. We now take a more rigorous look at what a temporal formula means.

Recall that a state assigns a value to every variable, and a behavior is an infinite sequence of states. A temporal formula is true or false of a behavior. Let $\sigma \models F$ be the truth value of the formula $F$ for the behavior $\sigma$, so $\sigma$ satisfies $F$ iff $\sigma \models F$ equals true. To define the meaning of a temporal formula $F$, we have to explain how to determine the value of $\sigma \models F$ for any behavior $\sigma$. For now, we consider only temporal formulas that don't contain the temporal existential quantifier $\boldsymbol{\exists}$.

It's easy to define the meaning of a Boolean combination of temporal formulas in terms of the meanings of those formulas. The formula $F \wedge G$ is true of a behavior $\sigma$ iff both $F$ and $G$ are true of $\sigma$, and $\neg F$ is true of $\sigma$ iff $F$ is false for $\sigma$ :

$$
\sigma \models(F \wedge G) \triangleq(\sigma \models F) \wedge(\sigma \models G) \quad \sigma \models \neg F \triangleq \neg(\sigma \models F)
$$

This defines the meanings of $\wedge$ and $\neg$ as operators on temporal formulas. The meanings of the other Boolean operators are similarly defined. In the same way, we can define the ordinary predicate-logic quantifiers $\forall$ and $\exists$ as operators on temporal formulas-for example:

$$
\sigma \models(\exists r: F) \triangleq \exists r:(\sigma \models F)
$$

We will discuss quantifiers in Section 8.6. For now, we ignore quantification in temporal formulas.

All the temporal formulas not containing $\boldsymbol{3}$ that we've seen have been Boolean combinations of the following three simple kinds of formulas. (Recall the definitions of state function and state predicate in Section 4.1.)

- A state predicate. It is interpreted as a temporal formula that is true of a behavior iff it is true in the first state of the behavior.
- A formula $\square P$, where $P$ is a state predicate. It is true of a behavior iff $P$ is true in every state of the behavior.
- A formula $\square[N]_{v}$, where $N$ is an action and $v$ is a state function. It is true of a behavior iff every successive pair of steps in the behavior is a $[N]_{v}$ step.

Since a state predicate is an action that contains no primed variables, we can both combine and generalize these three kinds of temporal formulas into the two kinds of formulas $A$ and $\square A$, where $A$ is an action.

Generalizing from state functions, we interpret an arbitrary action $A$ as a temporal formula by defining $\sigma \models A$ to be true iff the first two states of $\sigma$ are an $A$ step. For any behavior $\sigma$, let $\sigma_{0}, \sigma_{1}, \ldots$ be the sequence of states that make up $\sigma$. Then the meaning of an action $A$ as a temporal formula is defined by letting $\sigma \models A$ be true iff the pair $\left\langle\sigma_{0}, \sigma_{1}\right\rangle$ of states is an $A$ step. We define $\sigma \models \square A$ to be true iff, for all $n \in N a t$, the pair $\left\langle\sigma_{n}, \sigma_{n+1}\right\rangle$ of states is an $A$ step. We now generalize this to the definition of $\sigma \models \square F$ for an arbitrary temporal formula $F$.

For any behavior $\sigma$ and natural number $n$, let $\sigma^{+n}$ be the suffix of $\sigma$ obtained by deleting its first $n$ states:

$$
\sigma^{+n} \triangleq \sigma_{n}, \sigma_{n+1}, \sigma_{n+2}, \ldots
$$

The successive pair of states $\left\langle\sigma_{n}, \sigma_{n+1}\right\rangle$ of $\sigma$ is the first pair of states of $\sigma^{+n}$, and $\left\langle\sigma_{n}, \sigma_{n+1}\right\rangle$ is an $A$ step iff $\sigma^{+n}$ satisfies $A$. In other words:

$$
(\sigma \models \square A) \equiv \forall n \in N a t: \sigma^{+n} \models A
$$

So, we can generalize the definition of $\sigma \models \square A$ to

$$
\sigma \models \square F \triangleq \forall n \in N a t: \sigma^{+n} \models F
$$

for any temporal formula $F$. In other words, $\sigma$ satisfies $\square F$ iff every suffix $\sigma^{+n}$ of $\sigma$ satisfies $F$. This defines the temporal operator $\square$.

We have now defined the meaning of any temporal formula built from actions (including state predicates), Boolean operators, and the $\square$ operator. For example:

$$
\begin{array}{rlr}
\sigma \models \square((x=1) \Rightarrow \square(y>0)) & \\
& \equiv \forall n \in N a t: \sigma^{+n} \models((x=1) \Rightarrow \square(y>0)) & \text { By the meaning of } \square . \\
\equiv \forall n \in N a t:\left(\sigma^{+n} \models(x=1)\right) \Rightarrow\left(\sigma^{+n} \models \square(y>0)\right) & \text { By the meaning of } \Rightarrow . \\
\equiv \forall n \in N a t:\left(\sigma^{+n} \models(x=1)\right) \Rightarrow & \\
& \left(\forall m \in N a t:\left(\sigma^{+n}\right)^{+m} \models(y>0)\right) & \text { By the meaning of } \square .
\end{array}
$$

Thus, $\sigma \models \square((x=1) \Rightarrow \square(y>0))$ is true iff, for all $n \in N a t$, if $x=1$ is true in state $\sigma_{n}$, then $y>0$ is true in all states $\sigma_{n+m}$ with $m \geq 0$.

We saw in Section 3.2 that a specification should allow stuttering steps-ones that leave unchanged all the variables appearing in the formula. A stuttering step represents a change only to some part of the system not described by the formula; adding it to the behavior should not affect the truth of the formula. We say that a formula $F$ is invariant under stuttering ${ }^{18}$ if adding a stuttering step to a behavior $\sigma$ does not affect whether $F$ is true of $\sigma$. (This implies that removing a stuttering step from $\sigma$ also does not affect the truth of $\sigma \models F$.) A sensible formula should be invariant under stuttering. There's no point writing formulas that aren't sensible, so TLA ${ }^{+}$allows you to write only temporal formulas that are invariant under stuttering.

[^13]An arbitrary action, viewed as a temporal formula, is not invariant under stuttering. The action $x^{\prime}=x+1$ is true for a behavior in which $x$ is incremented by 1 in the first step; adding an initial stuttering step makes it false. A state predicate is invariant under stuttering, since its truth depends only on the first state of a behavior, and adding a stuttering step doesn't change the first state. The formula $\square[N]_{v}$ is also invariant under stuttering, for any action $N$ and state function $v$. It's not hard to see that invariance under stuttering is preserved by $\square$ and by the Boolean operators. So, state predicates, formulas of the form $\square[N]_{v}$, and all formulas obtainable from them by applying $\square$ and Boolean operators are invariant under stuttering. For now, let's take these to be our temporal formulas. (Later, we'll add quantification.)

To understand temporal formulas intuitively, think of $\sigma_{n}$ as the state of the universe at time instant $n$ during the behavior $\sigma .{ }^{19}$ For any state predicate $P$, the expression $\sigma^{+n} \models P$ asserts that $P$ is true at time instant $n$. Thus, $\square((x=1) \Rightarrow \square(y>0))$ asserts that any time $x=1$ is true, $y>0$ is true from then on. For an arbitrary temporal formula $F$, we also interpret $\sigma^{+n} \models F$ as the assertion that $F$ is true at time instant $n$. The formula $\square F$ then asserts that $F$ is true at all times. We can therefore read $\square$ as always or henceforth or from then on.

We now examine five especially important classes of formulas that are constructed from arbitrary temporal formulas $F$ and $G$. We introduce new operators for expressing the first three.
$\diamond F$ is defined to equal $\neg \square \neg F$. It asserts that $F$ is not always false, which means that $F$ is true at some time:

$$
\begin{aligned}
\sigma \models \diamond F & \equiv \sigma \models \neg \square \neg F & & \text { By definition of } \diamond . \\
& \equiv \neg(\sigma \models \square \neg F) & & \text { By the meaning of } \neg . \\
& \equiv \neg\left(\forall n \in \text { Nat }: \sigma^{+n} \models \neg F\right) & & \text { By the meaning of } \square . \\
& \equiv \neg\left(\forall n \in \text { Nat }: \neg\left(\sigma^{+n} \models F\right)\right) & & \text { By the meaning of } \neg . \\
& \equiv \exists n \in \text { Nat }: \sigma^{+n} \models F & & \text { Because } \neg \neg \neg \text { is equivalent to } \exists .
\end{aligned}
$$

We usually read $\diamond$ as eventually, taking eventually to include now.
$F \leadsto G$ is defined to equal $\square(F \Rightarrow \diamond G)$. The same kind of calculation we've done above shows

$$
\sigma \models(F \leadsto G) \equiv \forall n \in N a t:\left(\sigma^{+n} \models F\right) \Rightarrow\left(\exists m \in N a t:\left(\sigma^{+(n+m)} \models G\right)\right)
$$

The formula $F \leadsto G$ asserts that whenever $F$ is true, $G$ is eventually true - that is, true then or at some later time. We read $\leadsto$ as leads to.
$\diamond\langle A\rangle_{v}$ is defined to equal $\neg \square[\neg A]_{v}$, where $A$ is an action and $v$ a state function. It asserts that not every step is a $(\neg A) \vee\left(v^{\prime}=v\right)$ step, so some step is a $\neg\left((\neg A) \vee\left(v^{\prime}=v\right)\right)$ step. But $\neg\left((\neg A) \vee\left(v^{\prime}=v\right)\right)$ is equivalent to $A \wedge\left(v^{\prime} \neq v\right)$, so $\diamond\langle A\rangle_{v}$ asserts that some step is an $A \wedge\left(v^{\prime} \neq v\right)$ step. We define $\langle A\rangle_{v}$ to equal $A \wedge\left(v^{\prime} \neq v\right)$, so $\diamond\langle A\rangle_{v}$ asserts that eventually an $\langle A\rangle_{v}$ step occurs. We think of $\diamond\langle A\rangle_{v}$ as the formula obtained by applying the operator $\diamond$ to $\langle A\rangle_{v}$, although technically it's not because $\langle A\rangle_{v}$ isn't a temporal formula.
$\square \diamond F$ asserts that at all times, $F$ is true then or at some later time. For time instant 0 , this implies that $F$ is true at some time instant $n_{0} \geq 0$. For time instant $n_{0}+1$, it implies that $F$ is true at some time instant $n_{1} \geq n_{0}+1$. For time instant $n_{1}+1$, it implies

[^14]that $F$ is true at some time instant $n_{2} \geq n_{1}+1$. Continuing the process, we see that $F$ is true at an infinite sequence of time instants $n_{0}, n_{1}, n_{2}, \ldots . S$ So, $\square \diamond F$ asserts that $F$ is infinitely often true. In particular, $\square \diamond\langle A\rangle_{v}$ asserts that infinitely many $\langle A\rangle_{v}$ steps occur.
$\diamond \square F$ asserts that eventually (at some time), $F$ becomes true and remains true thereafter. In other words, $\diamond \square F$ asserts that $F$ is eventually always true. In particular, $\diamond \square[N]_{v}$ asserts that eventually, every step is a $[N]_{v}$ step.

The operators $\square$ and $\diamond$ have higher precedence (bind more tightly) than the Boolean operators, so $\diamond F \vee \square G$ means $(\diamond F) \vee(\square G)$. The operator $\leadsto$ has the same precedence as $\Rightarrow$.

### 8.2 Weak Fairness

Using the temporal operators $\square$ and $\diamond$, it's easy to specify liveness properties. For example, consider the hour-clock specification of module HourClock in Figure 1. We can require that the clock never stops by asserting that there must be infinitely many HCnxt steps. This is expressed by the formula $\square \diamond\langle H C n x t\rangle_{h r}$. (The $\left\rangle_{h r}\right.$ is needed to satisfy the syntax rules for temporal formulas; it's discussed in the next paragraph.) So, formula $H C \wedge \square \diamond\langle H C n x t\rangle_{h r}$ specifies a clock that never stops.

The syntax rules of TLA require us to write $\square \diamond\langle H C n x t\rangle_{h r}$ instead of the more obvious formula $\square \diamond H C n x t$. These rules are needed to guarantee that all TLA formulas are invariant under stuttering. In a behavior satisfying $H C$, an $H C n x t$ step necessarily changes $h r$, so it is necessarily an $\langle H C n x t\rangle_{h r}$ step. Hence, $H C \wedge \square \diamond\langle H C n x t\rangle_{h r}$ is equivalent to the (illegal) formula $H C \wedge \square \diamond H C n x t$.

In a similar fashion, most of the actions we define do not allow stuttering steps. When we write $\langle A\rangle_{v}$ for some action $A$, the $\left\rangle_{v}\right.$ is usually needed only to satisfy the syntax rules. To avoid having to think about which variables $A$ actually changes, we generally take the subscript $v$ to be the tuple of all variables, which is changed if any variable changes. I will usually ignore the angle brackets and subscripts in informal discussions, and will describe $\square \diamond\langle H C n x t\rangle_{h r}$ as the assertion that there are infinitely many HCnxt steps.

Let's now modify specification Spec of module Channel (Figure 3) to require that every value sent is eventually received. We do this by conjoining a liveness condition to Spec. The analog of the liveness condition for the clock would be $\square \diamond\langle R c v\rangle_{c h a n}$, which asserts that there are infinitely many $R c v$ steps. However, this would also require that infinitely many values are sent, and we don't want to make that requirement. In fact, we want to permit behaviors in which no value is ever sent, so no value is ever received. We just want to require that, if a value is ever sent, then it is eventually received.

It's enough to require only that the next value to be received always is eventually received, since this implies that all values sent are eventually received. More precisely, we require that it's always the case that, if there is a value to be received, then the next value to be received eventually is received. The next value is received by a $R c v$ step, so the requirement is: ${ }^{20}$

$$
\square\left(\text { There is an unreceived value } \Rightarrow \diamond\langle R c v\rangle_{\text {chan }}\right)
$$

There is an unreceived value iff action $R c v$ is enabled. (Recall that $R c v$ is enabled in a state iff it is possible to take a Rcv step from that state.) TLA ${ }^{+}$defines Enabled $A$ to be the predicate that is true iff action $A$ is enabled. The liveness condition can then be written as:

$$
\begin{equation*}
\square\left(\operatorname{EnABLED}\langle R c v\rangle_{c h a n} \Rightarrow \diamond\langle R c v\rangle_{c h a n}\right) \tag{13}
\end{equation*}
$$

[^15]In the Enabled formula, it doesn't matter if we write $R c v$ or $\langle R c v\rangle_{c h a n}$. We add the angle brackets so the two actions appearing in the formula are the same.

Because $\langle H C n x t\rangle_{h r}$ is always enabled during any behavior satisfying $H C$, we can rewrite the liveness condition $\square \diamond\langle H C n x t\rangle_{h r}$ for the hour clock as:

$$
\square\left(\text { ENABLED }\langle H C n x t\rangle_{h r} \Rightarrow \diamond\langle H C n x t\rangle_{h r}\right)
$$

This suggests a general liveness condition on action $A$ :

$$
\square\left(\operatorname{ENABLED}\langle A\rangle_{v} \Rightarrow \diamond\langle A\rangle_{v}\right)
$$

This formula asserts that if $A$ ever becomes enabled, then an $A$ step will eventually occureven if $A$ remains enabled for only a fraction of a nanosecond, and is never again enabled. The obvious difficulty of physically implementing such a requirement suggests that it's too strong. Instead, we define the weaker formula $\mathrm{WF}_{v}(A)$ to equal:

$$
\begin{equation*}
\square\left(\square \mathrm{ENABLED}\langle A\rangle_{v} \Rightarrow \diamond\langle A\rangle_{v}\right) \tag{14}
\end{equation*}
$$

This formula asserts that if $A$ ever becomes forever enabled, then an $A$ step must eventually occur. WF stands for $W$ eak $F$ airness, and the condition $\mathrm{WF}_{v}(A)$ is called weak fairness on $A$. Here are two formulas that are each equivalent to (14):

$$
\begin{align*}
& \square \diamond\left(\neg \operatorname{ENABLED}\langle A\rangle_{v}\right) \vee \square \diamond\langle A\rangle_{v}  \tag{15}\\
& \diamond \square\left(\operatorname{ENABLED}\langle A\rangle_{v}\right) \Rightarrow \square \diamond\langle A\rangle_{v} \tag{16}
\end{align*}
$$

These three formulas can be expressed in English as:
14. It's always the case that, if $A$ is enabled forever, then an $A$ step eventually occurs.
15. $A$ is infinitely often disabled, or infinitely many $A$ steps occur.
16. If $A$ is eventually enabled forever, then infinitely many $A$ steps occur.

The equivalence of these three formulas isn't obvious. Here's a proof that (14) is equivalent to (15), using some simple tautologies. Studying this proof, and these tautologies, will help you understand how to write liveness conditions.

$$
\begin{aligned}
& \square \\
& \left(\square \operatorname{ENABLED}\langle A\rangle_{v} \Rightarrow \diamond\langle A\rangle_{v}\right) \\
& \\
& \quad \equiv \square\left(\neg \square \operatorname{EnABLED}\langle A\rangle_{v} \vee \diamond\langle A\rangle_{v}\right) \quad \text { Because }(F \Rightarrow G) \equiv(\neg F \vee G) . \\
& \quad \equiv \square\left(\diamond \neg \operatorname{EnABLED}\langle A\rangle_{v} \vee \diamond\langle A\rangle_{v}\right) \quad \text { Because } \neg \square F \equiv \diamond \neg F . \\
& \quad \equiv \square \diamond\left(\neg \operatorname{EnABLED}\langle A\rangle_{v} \vee\langle A\rangle_{v}\right) \quad \text { Because } \diamond(F \vee G) \equiv \diamond F \vee \diamond G . \\
& \quad \equiv \square \diamond\left(\neg \operatorname{EnABLED}\langle A\rangle_{v}\right) \vee \square \diamond\langle A\rangle_{v} \quad \text { Because } \square \diamond(F \vee G) \equiv \square \diamond F \vee \square \diamond G .
\end{aligned}
$$

The equivalence of (15) and (16) is proved as follows

$$
\begin{aligned}
& \square \diamond\left(\neg \operatorname{ENABLED}\langle A\rangle_{v}\right) \vee \square \diamond\langle A\rangle_{v} \\
& \quad \equiv \neg \diamond \square \operatorname{EnABLED}\langle A\rangle_{v} \vee \square \diamond\langle A\rangle_{v} \quad \text { Because } \square \diamond \neg F \equiv \square \neg \square F \equiv \neg \diamond \square F . \\
& \quad \equiv \diamond \square \operatorname{EnABLED}\langle A\rangle_{v} \Rightarrow \square \diamond\langle A\rangle_{v} \quad \text { Because }(F \Rightarrow G) \equiv(\neg F \vee G) .
\end{aligned}
$$

We now show that the liveness conditions for the hour clock and the channel can be written as weak fairness conditions.

First, consider the hour clock. In any behavior satisfying its safety specification $H C$, an $\langle H C n x t\rangle_{h r}$ step is always enabled, so $\square \diamond\left(\right.$ EnABLED $\left.\langle H C n x t\rangle_{h r}\right)$ equals TRUE. Hence, $H C$ implies that $\mathrm{WF}_{h r}(H C n x t)$ is equivalent to $\square \diamond\langle H C n x t\rangle_{h r}$, our liveness condition for the hour clock.

Now, consider the liveness condition (13) for the channel. By (15), weak fairness on $R c v$ asserts that either (a) $R c v$ is disabled infinitely often, or (b) infinitely many $R c v$ steps occur (or both). Suppose $R c v$ becomes enabled at some instant. In case (a), Rcv must subsequently be disabled, which can occur only by a $R c v$ step. Case (b) also implies that there is a subsequent $R c v$ step. Weak fairness therefore implies that it's always the case that if $R c v$ is enabled, then a $R c v$ step eventually occurs. A closer look at this reasoning reveals that it is an informal proof of:

$$
\left.S p e c \Rightarrow\left(\mathrm{WF}_{c h a n}(R c v) \Rightarrow \square\left(\operatorname{ENABLED}^{\langle R c v}\right\rangle_{c h a n} \Rightarrow \diamond\langle R c v\rangle_{c h a n}\right)\right)
$$

Because $\square F$ implies $F$, for any formula $F$, it's not hard to check the truth of:

$$
\left.\square\left(\operatorname{ENABLED}\langle R c v\rangle_{c h a n} \Rightarrow \diamond\langle R c v\rangle_{c h a n}\right)\right) \Rightarrow \mathrm{WF}_{c h a n}(R c v)
$$

Therefore, Spec $\wedge \mathrm{WF}_{\text {chan }}(R c v)$ is equivalent to the conjunction of Spec and (13), so weak fairness of $R c v$ specifies the same liveness condition as (13) for the channel.

### 8.3 Liveness for the Memory Specification

Let's now strengthen the memory specification with the liveness requirement that every request must receive a response. (We don't require that a request is ever issued.) The liveness requirement is conjoined to the internal memory specification, formula ISpec of module InternalMemory (Figure 8).

We will express the liveness requirement in terms of weak fairness. This requires understanding when actions are enabled. The action $R s p(p)$ is enabled only if the action

$$
\begin{equation*}
\operatorname{Reply}\left(p, \text { buf }[p], \text { memInt, memInt }{ }^{\prime}\right) \tag{17}
\end{equation*}
$$

is enabled. Recall that the operator Reply is a constant parameter, declared in module MemoryInterface (Figure 7). Without knowing more about this operator, we can't say when action (17) is enabled.

We assume that Reply actions are always enabled. That is, for any processor $p$ and reply $r$, and any old value miOld of memInt, there is a new value miNew of memInt such that $\operatorname{Repl}(p, r, m i O l d, m i N e w)$ is true. For simplicity, we can just assume that this is true for all $p$ and $r$, and add the following assumption to the MemoryInterface module:

$$
\text { ASSUME } \forall p, r, m i O l d: \exists m i N e w: \operatorname{Repl}(p, r, m i O l d, m i N e w)
$$

We should also make a similar assumption for $S e n d$, but we don't need it here.
We will subscript our weak-fairness formulas with the tuple of all variables, so let's add the following definition to the InternalMemory module:

$$
\text { vars } \triangleq\langle m e m I n t, m e m, c t l, b u f\rangle
$$

When processor $p$ issues a request, it enables the $D o(p)$ action, which remains enabled until a $D o(p)$ step occurs. The weak-fairness condition $\mathrm{WF}_{\text {vars }}(D o(p))$ implies that this $D o(p)$ step must eventually occur. A $D o(p)$ step enables the $R s p(p)$ action, which remains enabled until an $R s p(p)$ step occurs. The weak-fairness condition $\mathrm{WF}_{v a r s}(R s p(p))$ implies that this $R s p(p)$ step, which produces the desired response, must eventually occur. Hence, the requirement
(18) $\mathrm{WF}_{\text {vars }}(\operatorname{Do}(p)) \wedge \mathrm{WF}_{\text {vars }}(R s p(p))$
assures that every request issued by processor $p$ must eventually receive a reply.
We can rewrite condition (18) in the slightly simpler form of weak fairness on the action $D o(p) \vee R s p(p)$. The disjunction of two actions is enabled iff one or both of them are enabled. A Req $(p)$ step enables $D o(p)$, thereby enabling $D o(p) \vee R s p(p)$. The only $D o(p) \vee R s p(p)$
step then possible is a $D o(p)$ step, which enables $R s p(p)$ and hence $D o(p) \vee R s p(p)$. At this point, the only $\operatorname{Do}(p) \vee R s p(p)$ step possible is a $\operatorname{Rsp}(p)$ step, which disables $R s p(p)$ and leaves $\operatorname{Do}(p)$ disabled, hence disabling $D o(p) \vee \operatorname{Rsp}(p)$. This all shows that (18) is equivalent to $\mathrm{WF}_{\text {vars }}(\operatorname{Do}(p) \vee R s p(p))$, weak fairness on the single action $\operatorname{Do}(p) \vee R s p(p)$.

Weak fairness of $D o(p) \vee R s p(p)$ guarantees that every request by processor $p$ receives a response. We want every request from every processor to receive a response. So, the liveness condition for the memory specification asserts weak fairness of $\operatorname{Do}(p) \vee \operatorname{Rsp}(p)$ for every processor $p:{ }^{21}$

$$
\text { Liveness } \triangleq \forall p \in \text { Proc }: \mathrm{WF}_{\text {vars }}(D o(p) \vee R s p(p))
$$

The example of actions $\operatorname{Do}(p)$ and $R s p(p)$ raises the general question: when is the conjunction of weak fairness on actions $A_{1}, \ldots, A_{n}$ equivalent to weak fairness of their disjunction $A_{1} \vee \ldots \vee A_{n}$ ? The general answer is complicated, but here's a sufficient condition:

WF Conjunction Rule If $A_{1}, \ldots, A_{n}$ are actions such that, for any distinct $i$ and $j$, whenever action $A_{i}$ is enabled, action $A_{j}$ cannot become enabled until an $A_{i}$ step occurs, then $\mathrm{WF}_{v}\left(A_{1}\right) \wedge \ldots \wedge \mathrm{WF}_{v}\left(A_{n}\right)$ is equivalent to $\mathrm{WF}_{v}\left(A_{1} \vee \ldots \vee A_{n}\right)$.

This rule is stated rather informally. It can be interpreted as an assertion about a particular behavior $\sigma$, in which case its conclusion is

$$
\sigma \models\left(\mathrm{WF}_{v}\left(A_{1}\right) \wedge \ldots \wedge \mathrm{WF}_{v}\left(A_{n}\right)\right) \equiv \mathrm{WF}_{v}\left(A_{1} \vee \ldots \vee A_{n}\right)
$$

Alternatively, it can be formalized as the assertion that if a formula $F$ implies the hypothesis, then $F$ implies the equivalence of $\mathrm{WF}_{v}\left(A_{1}\right) \wedge \ldots \wedge \mathrm{WF}_{v}\left(A_{n}\right)$ and $\mathrm{WF}_{v}\left(A_{1} \vee \ldots \vee A_{n}\right)$.

Conjunction and disjunction are special cases of universal and existential quantification, respectively. For example, $A_{1} \vee \ldots \vee A_{n}$ is equivalent to $\exists i \in 1 \ldots n: A_{i}$. So, we can trivially restate the WF Conjunction Rule as a condition on when formulas $\forall i \in S: \mathrm{WF}_{v}\left(A_{i}\right)$ and $\mathrm{WF}_{v}\left(\exists i \in S: A_{i}\right)$ are equivalent, for a finite set $S$. The resulting rule is actually valid for any set $S$ :

WF Quantifier Rule If the $A_{i}$ are actions, for all $i \in S$, such that, for any distinct $i$ and $j$ in $S$, whenever action $A_{i}$ is enabled, action $A_{j}$ cannot become enabled until an $A_{i}$ step occurs, then $\forall i \in S: \mathrm{WF}_{v}\left(A_{i}\right)$ and $\mathrm{WF}_{v}\left(\exists i \in S: A_{i}\right)$ are equivalent.

### 8.4 Strong Fairness

Formulations (15) and (16) of $\mathrm{WF}_{v}(A)$ contain the operators infinitely often $(\square \diamond)$ and eventually always $(\diamond \square)$. Eventually always is stronger than (implies) infinitely often. We define $\mathrm{SF}_{v}(A)$, strong fairness of action $A$, to be either of the following equivalent formulas:

$$
\begin{align*}
& \diamond \square\left(\neg \operatorname{EnABLED}\langle A\rangle_{v}\right) \vee \square \diamond\langle A\rangle_{v}  \tag{19}\\
& \square \diamond \operatorname{EnABLED}\langle A\rangle_{v} \Rightarrow \square \diamond\langle A\rangle_{v} \tag{20}
\end{align*}
$$

Intuitively, these two formulas assert:
(19) $A$ is eventually disabled forever, or infinitely many $A$ steps occur.
(20) If $A$ is infinitely often enabled, then infinitely many $A$ steps occur.

[^16]The proof that these two formulas are equivalent is similar to the proof of equivalence of (15) and (16).

The analogs of the WF Conjunction and WF Quantifier Rules hold for strong fairness-for example:

SF Conjunction Rule If $A_{1}, \ldots, A_{n}$ are actions such that, for any distinct $i$ and $j$, whenever action $A_{i}$ is enabled, action $A_{j}$ cannot become enabled until an $A_{i}$ step occurs, then $\mathrm{SF}_{v}\left(A_{1}\right) \wedge \ldots \wedge \mathrm{SF}_{v}\left(A_{n}\right)$ is equivalent to $\mathrm{SF}_{v}\left(A_{1} \vee \ldots \vee A_{n}\right)$.

It's not hard to see that strong fairness is stronger than weak fairness-that is, $\mathrm{SF}_{v}(A)$ implies $\mathrm{WF}_{v}(A)$, for any $v$ and $A$. We can express weak and strong fairness as follows.

- Weak fairness of $A$ asserts that an $A$ step must eventually occur if $A$ is continuously enabled.
- Strong fairness of $A$ asserts that an $A$ step must eventually occur if $A$ is continually enabled.

Continuously means without interruption. Continually means repeatedly, possibly with interruptions.

Strong fairness need not be strictly stronger than weak fairness. Weak and strong fairness of an action $A$ are equivalent iff $A$ infinitely often disabled implies that either $A$ is eventually always disabled, or infinitely many $A$ steps occur. This is expressed formally by the tautology:

$$
\left(\mathrm{WF}_{v}(A) \equiv \operatorname{SF}_{v}(A)\right) \equiv\left(\square \diamond\left(\neg \operatorname{EnABLED}\langle A\rangle_{v}\right) \Rightarrow \diamond \square\left(\neg \operatorname{EnABLED}\langle A\rangle_{v}\right) \vee \square \diamond\langle A\rangle_{v}\right)
$$

In the channel example, weak and strong fairness of $R c v$ are equivalent because Spec implies that, once enabled, $R c v$ can be disabled only by a $R c v$ step; so if it is disabled infinitely often, then it either eventually remains disabled forever, or else it is disabled infinitely often by $R c v$ steps.

Strong fairness can be more difficult to implement than weak fairness, and it is a less common requirement. A strong fairness condition should be used in a specification only if it is needed. When strong and weak fairness are equivalent, the fairness property should be written as weak fairness.

Liveness properties can be subtle. Expressing them with ad hoc temporal formulas can lead to errors. We will specify liveness as the conjunction of fairness properties whenever possible - and it almost always is possible. Having a uniform way of expressing liveness makes specifications easier to understand. Section 8.7.2 discusses an even more compelling reason for using fairness to specify liveness.

### 8.5 Liveness for the Write-Through Cache

Let's now add liveness to the write-through cache, specified in Figures 10 and 11. We want our specification to guarantee that every request eventually receives a response, without requiring that any requests are issued. This requires fairness on all the actions that make up the nextstate action Next except the $\operatorname{Req}(p)$ action (which issues a request) and the $\operatorname{Evict}(p, a)$ action (which evicts an address from the cache). If any other action were ever enabled without being executed, then some request might not generate a response - except for one special case. If the $m e m Q$ queue contains only write requests, and $m e m Q$ is not full (has fewer than $Q L e n$ elements), then not executing a $\operatorname{Mem} Q W r$ action would not prevent any responses. (Remember that a response to a write request can be issued before the value is written to memory.) We'll return to this exception later. For simplicity, let's require fairness for the MemQWr action too.

Our liveness condition has to assert fairness of the following actions:

$$
\operatorname{MemQWr} \quad \operatorname{MemQRd} \quad \operatorname{Rsp}(p) \quad \operatorname{RdMiss}(p) \quad \operatorname{DoRd}(p) \quad \operatorname{DoWr}(p)
$$

for all $p$ in Proc. We now must decide whether to assert weak or strong fairness for these actions. Weak and strong fairness are equivalent for an action that, once enabled, remains enabled until it is executed. This is the case for all of these actions except $\operatorname{RdMiss}(p)$ and $\operatorname{Do} W r(p)$. These two actions append a request to the $\operatorname{mem} Q$ queue, and are disabled if that queue is full. A $\operatorname{RdMiss}(p)$ or $\operatorname{DoWr}(p)$ could be enabled, and then become disabled because a RdMiss $(q)$ or $\operatorname{DoWr}(q)$, for a different processor $q$, appends a request to memQ. We therefore need strong fairness for the $\operatorname{RdMiss}(p)$ and $\operatorname{DoWr}(p)$ actions. So, the fairness conditions we need are:

## Weak Fairness $\quad \operatorname{Rsp}(p), \operatorname{DoRd}(p), \operatorname{MemQWr}$, and MemQRd

Strong Fairness RdMiss(p) and $\operatorname{DoWr}(p)$
As before, let's define vars to be the tuple of all variables.

$$
\text { vars } \triangleq\langle\text { memInt, mem, buf, ctl, cache, memQ }\rangle
$$

We could just write the liveness condition as

$$
\begin{align*}
& \wedge \forall p \in \operatorname{Proc}: \wedge \operatorname{WF}_{\text {vars }}(\operatorname{Rsp}(p)) \wedge \mathrm{WF}_{\text {vars }}(\operatorname{DoRd}(p))  \tag{21}\\
& \wedge \operatorname{SF}_{\text {vars }}(R d M i s s(p)) \wedge \mathrm{SF}_{\text {vars }}(\operatorname{DoWr}(p)) \\
& \wedge \mathrm{WF}_{\text {vars }}(\operatorname{MemQWr}) \wedge \mathrm{WF}_{\text {vars }}(\operatorname{MemQRd})
\end{align*}
$$

However, I prefer replacing the conjunction of fairness conditions by a single fairness condition on a disjunction, as we did in Section 8.3 for the memory specification. The WF and SF Conjunction Rules (page 8.3 and 8.4 ) easily imply that the liveness condition (21) can be rewritten as

$$
\begin{align*}
& \wedge \forall p \in \operatorname{Proc}: \wedge \mathrm{WF}_{\text {vars }}(\operatorname{Rsp}(p) \vee \operatorname{DoRd}(p))  \tag{22}\\
& \wedge \mathrm{SF}_{\text {vars }}(\operatorname{RdMiss}(p) \vee \operatorname{DoWr}(p)) \\
& \wedge \mathrm{WF}_{\text {vars }}(\operatorname{MemQWr} \vee \operatorname{MemQRd})
\end{align*}
$$

We can now try to simplify (22) by applying the WF Quantifier Rule (Section 8.3) to replace $\forall p \in \operatorname{Proc}: \mathrm{WF}_{\text {vars }}(\ldots)$ with $\mathrm{WF}_{\text {vars }}(\exists p \in \operatorname{Proc}: \ldots)$. However, that rule doesn't apply; it's possible for $R s p(p) \vee \operatorname{DoRd}(p)$ to be enabled for two different processors $p$ at the same time. In fact, the two formulas are not equivalent. Similarly, the analogous rule for strong fairness doesn't apply. Formula (22) is as simple as we can make it.

Let's return to the observation that we don't have to execute $M e m Q W r$ if the memQ queue is not full and contains only write requests. Let's define $Q$ Cond to be the assertion that $m e m Q$ is not full and contains only write requests:

$$
\begin{aligned}
Q \operatorname{Cond} \triangleq & \wedge \operatorname{Len}(\operatorname{mem} Q)<Q \operatorname{Len} \\
& \wedge \forall i \in 1 . . \operatorname{Len}(\operatorname{mem} Q): \operatorname{mem} Q[i][2] . o p=" \mathrm{~W} \mathrm{r} "
\end{aligned}
$$

We have to eventually execute a MemQWr action only when it's enabled and $Q C o n d$ is true, which is the case iff the action $Q C o n d \wedge M e m Q W r$ is enabled. In this case, a $M e m Q W r$ step is a $Q C o n d \wedge M e m Q W r$ step. Hence, it suffices to require weak fairness of the action $Q C o n d \wedge M e m Q W r$. We can therefore replace the second conjunct of (22) with

$$
\mathrm{WF}_{\text {vars }}((Q \operatorname{Cond} \wedge \operatorname{MemQWr}) \vee M e m Q R d)
$$

We would do this if we wanted the specification to describe the weakest liveness condition that implements the memory specification's liveness condition. However, if the specification were a description of an actual device, then that device would probably implement weak fairness on all $M e m Q W r$ actions, so we would take (22) as the liveness condition.

### 8.6 Quantification

I've already mentioned, in Section 8.1, that the ordinary quantifiers of predicate logic can be applied to temporal formulas. For example, the meaning of the formula $\exists r: F$, for any temporal formula $F$, is given by

$$
\sigma \models(\exists r: F) \triangleq \exists r:(\sigma \models F)
$$

where $\sigma$ is any behavior. The meaning of $\forall r: F$ is similarly defined.
The symbol $r$ in $\exists r: F$ is usually called a bound variable. But we've been using the term variable to mean something else -something that's declared by a variable statement in a module. The bound "variable" $r$ is actually a constant in these formulas-a value that is the same in every state of the behavior. ${ }^{22}$ For example, the formula $\exists r: \square(x=r)$ asserts that $x$ always has the same value.

The bounded quantifiers are defined in a similar way-for example,

$$
\sigma \models(\exists r \in S: F) \triangleq \exists r \in S:(\sigma \models F)
$$

For this formula to make sense, $S$ must be a constant. ${ }^{23}$ The symbol $r$ is declared to be a constant in formula $F$. The expression $S$ lies outside the scope of the declaration of $r$, so the symbol $r$ cannot occur in $S$.

One can, in a similar way, define choose to be a temporal operator. However, it's not needed for writing specifications, so we won't.

We have been using the operator $\boldsymbol{\exists}$ as a hiding operator. Intuitively, $\boldsymbol{\exists} x: F$ means $F$ with variable $x$ hidden. In this formula, $x$ is declared to be a variable in formula $F$. Unlike $\exists r: F$, which asserts the existence of a single value $r$, the formula $\exists x: F$ asserts the existence of a value for $x$ in each state of a behavior.

The precise definition of $\boldsymbol{\exists}$ is a bit tricky because, as discussed in Section 8.1, the formula $\exists x: F$ should be invariant under stuttering. To define it, we first define $\sigma \sim_{x} \tau$ to be true iff $\sigma$ can be obtained from $\tau$ (or vice-versa) by adding and/or removing stuttering steps and changing the values assigned to $x$ by its states. To define $\sim_{x}$ precisely, we define two behaviors $\sigma$ and $\tau$ to be stuttering-equivalent iff removing all stuttering steps from each of them produces the same sequence of states. Next, let $\sigma_{x \leftarrow 0}$ be the behavior that is the same as $\sigma$ except that, in each state, $x$ is assigned the value $0 .{ }^{24}$ We can then define $\sigma \sim_{x} \tau$ to be true iff $\sigma_{x \leftarrow 0}$ and $\tau_{x \leftarrow 0}$ are stuttering equivalent. Finally, the meaning of $\boldsymbol{\exists}$ is defined by letting $\sigma \models(\boldsymbol{\exists} x: F)$ be true iff there exists some behavior $\tau$ such that $\tau \sim_{x} \sigma$ and $\tau \models F$ are true. If you find this too confusing, don't worry about it. For writing specifications, it suffices to just think of $\exists x: F$ as $F$ with $x$ hidden.

TLA also has a temporal universal quantifier $\boldsymbol{\forall}$, defined by:

$$
\boldsymbol{\forall} x: F \triangleq \neg \boldsymbol{3} x: \neg F
$$

This operator is hardly ever used.
TLA ${ }^{+}$does not allow bounded versions of the operators $\boldsymbol{\exists}$ and $\boldsymbol{\forall}$. If, for some reason, you want to write $\boldsymbol{\exists} x \in S: F$, you can simply write $\boldsymbol{\exists} x:(x \in S) \wedge F$ instead. However, I don't know why anyone would want to write such a formula.

[^17]
### 8.7 Temporal Logic Examined

### 8.7.1 A Review

Let's look at the shapes of the specifications that we've written so far. We started with the simple form

## Init $\wedge \square[\text { Next }]_{\text {vars }}$

where Init is the initial predicate, Next the next-state action, and vars the tuple of all variables. This kind of specification is, in principle, quite straightforward.

We then introduced hiding: using $\exists$ to bind variables that should not appear in the specification. Those bound variables, also called hidden or internal variables, serve only to help describe how the values of the free variables (also called visible variables) change.

Hiding variables is easy enough, and it is mathematically elegant and philosophically satisfying. However, in practice, it doesn't make much difference to a specification. A comment can also tell a reader that a variable should be regarded as internal. Explicit hiding allows implementation to mean implication. A lower-level specification that describes an implementation can be expected to imply a specification only if the specification's internal variables, whose values don't really matter, are explicitly hidden. Otherwise, implementation means implementation under a refinement mapping. (See Section 6.8.)

To express liveness, the specification (23) is strengthened to the form

$$
\begin{equation*}
\text { Init } \wedge \square[\text { Next }]_{\text {vars }} \wedge \text { Liveness } \tag{24}
\end{equation*}
$$

where Liveness is the conjunction of formulas of the form $\mathrm{WF}_{\text {vars }}(A)$ and/or $\operatorname{SF}_{\text {vars }}(A)$, for actions $A$. (I'm considering universal quantification to be a form of conjunction.)

### 8.7.2 Machine Closure

In the specifications of the form (24) we've written so far, the actions $A$ whose fairness properties appear in formula Liveness have one thing in common: they are all subactions of the next-state action Next. An action $A$ is a subaction of Next iff every $A$ step is a Next step. Equivalently, $A$ is a subaction of Next iff $A$ implies Next. In almost all specifications of the form (24), formula Liveness should be the conjunction of weak and/or strong fairness formulas for subactions of Next. I'll now briefly explain why.

When we look at the specification (24), we expect Init to constrain the initial state, Next to constrain what steps may occur, and Liveness to describe only what must eventually happen. However, consider the following formula

$$
\begin{equation*}
(x=0) \wedge \square\left[x^{\prime}=x+1\right]_{x} \wedge \mathrm{WF}_{x}\left((x>99) \wedge\left(x^{\prime}=x-1\right)\right) \tag{25}
\end{equation*}
$$

The first two conjuncts of (25) assert that $x$ is initially 0 and that any step either increments $x$ by 1 or leaves it unchanged. Hence, they imply that if $x$ ever exceeds 99 , then it forever remains greater than 99 . The weak fairness property asserts that, if this happens, then $x$ must eventually be decremented by 1 -contradicting the second conjunct. Hence, (25) implies that $x$ can never exceed 99 , so that formula is equivalent to

$$
(x=0) \wedge \square\left[(x<99) \wedge\left(x^{\prime}=x+1\right)\right]_{x}
$$

Conjoining the weak fairness property to the first two conjuncts of (25) forbids an $x^{\prime}=x+1$ step when $x=99$.

A specification of the form (24) is called machine closed iff the conjunct Liveness does not constrain the initial state or what steps may occur. We almost never want to write a specification that isn't machine closed. If we do write one, it's almost always by mistake. Specification (24) is guaranteed to be machine closed if Liveness is the conjunction of weak
and/or strong fairness properties for subactions of Next. ${ }^{25}$ This condition doesn't apply to specification (25), which is not machine closed, because $(x>99) \wedge\left(x^{\prime}=x-1\right)$ is not a subaction of $x^{\prime}=x+1$.

Liveness requirements are philosophically satisfying. A specification of the form (23), which specifies only a safety property, allows behaviors in which the system does nothing. Therefore, the specification is satisfied by a system that does nothing. Expressing liveness requirements with fairness properties is less satisfying. These properties are subtle and it's easy to get them wrong. It requires some thought to determine that the liveness condition for the write-through cache, formula (22), does imply that every request receives a reply.

It's tempting to express liveness properties more directly, without using fairness properties. For example, it's easy to write a temporal formula asserting for the write-through cache that every request receives a response. When processor $p$ issues a request, it sets $c t l[p]$ to "rdy". We just have to assert that a state in which $\operatorname{ctl}[p]=$ "rdy" is true leads to a $\operatorname{Rsp}(p)$ step-for every processor $p$ :

$$
\begin{equation*}
\forall p \in \operatorname{Proc}:(\operatorname{ctl}[p]=\text { "rdy" }) \leadsto\langle R s p(p)\rangle_{\text {vars }} \tag{26}
\end{equation*}
$$

(The operator $\leadsto$ is defined in Section 8.1.) While such formulas are appealing, they are dangerous. It's very easy to make a mistake and write a specification that isn't machine closed.

Except in unusual circumstances, you should express liveness with fairness properties for subactions of the next-state action. These are the most straightforward specifications, and hence the easiest to write and to understand. Most system specifications, even if very detailed and complicated, can be written in this straightforward manner. The exceptions are usually in the realm of subtle, high-level specifications that attempt to be very general. An example is the specification of sequential consistency in [6].

### 8.7.3 The Unimportance of Liveness

While philosophically important, in practice the liveness property of (24) is not as important as the safety part, Init $\wedge \square[N e x t]_{\text {vars }}$. The ultimate purpose of writing a specification is to avoid errors. Experience shows that most of the benefit from writing and using a specification comes from the safety part. On the other hand, the liveness property is usually easy enough to write. It typically constitutes less than five percent of a specification. So, you might as well write the liveness part. However, when verifying the correctness of the specification, most of your effort should be devoted to the safety part.

### 8.7.4 Temporal Logic Considered Confusing

The most general type of specification I've discussed so far has the form

$$
\begin{equation*}
\exists v_{1}, \ldots, v_{n}: \text { Init } \wedge \square[\text { Next }]_{\text {vars }} \wedge \text { Liveness } \tag{27}
\end{equation*}
$$

where Liveness is the conjunction of fairness properties of subactions of Next. This is a very restricted class of temporal-logic formulas. Temporal logic is quite expressive, and one can combine its operators in all sorts of ways to express a wide variety of properties. This suggests the following approach to writing a specification: express each property that the system must satisfy with a temporal formula, and then conjoin all these formulas. For example, formula (26) above expresses the property of the write-through cache that every request eventually receives a response.

This approach is philosophically appealing. It has just one problem: it's practical for only the very simplest of specifications - and even for them, it seldom works well. The unbridled

[^18]use of temporal logic produces formulas that are hard to understand. Conjoining several of these formulas produces a specification that is impossible to understand.

The basic form of a TLA specification is (27). Most specifications should have this form. We can also use such specifications as building blocks, combining them to form larger specifications, as described in [1]. However, such specifications are of limited practical use. Most engineers need know only how to write specifications of the form (27). Indeed, they can get along quite well with specifications of the form (23).

## 9 Writing a Specification-Some Advice

You have now learned all you need to know about TLA ${ }^{+}$to write your own specifications. Here are a few additional hints to help you get started.

## Why to Specify

Specifications are written to help eliminate errors. Writing a specification requires effort; the benefits it provides must be worth that effort. There are several benefits:

- Writing a TLA ${ }^{+}$specification can help the design process. Having to describe a design precisely often reveals problems - subtle interactions and "corner cases" that are easily overlooked. These problems are easier to correct when discovered in the design phase rather than after implementation has begun.
- $\mathrm{A} \mathrm{TLA}^{+}$specification can provide a clear, concise way of communicating a design. It helps ensure that the designers agree on what they have designed, and it provides a valuable guide to the engineers who implement and test the system. It may also help users understand the system.
- A TLA ${ }^{+}$specification is a formal description to which tools can be applied to help find errors in the design and to help in testing the system. Some tools for TLA ${ }^{+}$specifications are being built.

Whether these benefits justify the effort of writing the specification depends on the nature of the project. Specification is not an end in itself; it is just one of many tools that an engineer should be able to use when appropriate.

## What to Specify

Although we talk about specifying a system, that's not what we do. A specification is a mathematical model of a particular view of some part of a system. When writing a specification, the first thing you must choose is exactly what part of the system you want to describe. Sometimes the choice is obvious; often it isn't. The cache-coherence protocol of a real multiprocessor computer may be intimately connected with how the processors execute instructions. Finding an abstraction that describes the coherence protocol while suppressing the details of instruction execution may be difficult. It may require defining an interface between the processor and the memory that doesn't exist in the actual system design.

Remember that the purpose of a specification is to help avoid errors. You should specify those parts of the system for which a specification is most likely to reveal errors. TLA ${ }^{+}$is particularly effective at revealing concurrency errors-ones that arise through the interaction of asynchronous components. So, you should concentrate your efforts on the parts of the system that are most likely to have such errors.

## The Grain of Atomicity

After choosing what part of the system to specify, you must choose the specification's level of abstraction. The most important aspect of the level of abstraction is the grain of atomicity, the choice of what system changes are represented as a single step of a behavior. Sending a message in an actual system involves multiple suboperations, but we usually represent it as a single step. On the other hand, the sending of a message and its receipt are usually represented as separate steps when specifying a distributed system.

The same sequence of system operations is represented by a shorter sequence of steps in a coarser-grained representation than in a finer-grained one. This almost always makes the coarser-grained specification simpler than the finer-grained one. However, the finer-grained specification more accurately describes the behavior of the actual system. A coarser-grained specification may fail to reveal important details of the system.

There is no simple rule for deciding on the grain of atomicity. However, there is one way of thinking about the granularity that can help. To describe it, we need the TLA ${ }^{+}$actioncomposition operator ".". If $A$ and $B$ are actions, then the action $A \cdot B$ is executed by first executing $A$ then $B$ in a single step. More precisely, $A \cdot B$ is the action defined by letting $s \rightarrow t$ be an $A \cdot B$ step iff there exists a state $u$ such that $s \rightarrow u$ is an $A$ step and $u \rightarrow t$ is a $B$ step.

When determining the grain of atomicity, we must decide whether to represent the execution of an operation as a single step or as a sequence of steps, each corresponding to the execution of a suboperation. Let's consider the simple case of an operation consisting of two suboperations that are executed sequentially, where those suboperations are described by the two actions $R$ and $L$. (Executing $R$ enables $L$ and disables $R$.) When the operation's execution is represented by two steps, each of those steps is an $R$ step or an $L$ step. The operation is then described with the action $R \vee L$. When its execution is represented by a single step, the operation is described with the action $R \cdot L \cdot{ }^{26}$ Let $S 2$ be the finer-grained specification in which the operation is executed in two steps, and let $S 1$ be the coarser-grained specification in which it is executed as a single $R \cdot L$ step. To choose the grain of atomicity, we must choose whether to take $S 1$ or $S 2$ as the specification. Let's examine the relation between the two specifications.

We can transform any behavior $\sigma$ satisfying $S 1$ into a behavior $\widehat{\sigma}$ satisfying $S 2$ by replacing each step $s \xrightarrow{R \cdot L} t$ with the pair of steps $s \xrightarrow{R} u \xrightarrow{L} t$, for some state $u$. If we regard $\sigma$ as being equivalent to $\widehat{\sigma}$, then we can regard $S 1$ as being a strengthened version of $S 2$ - one that allows fewer behaviors. Specification $S 1$ requires that each $R$ step be followed immediately by an $L$ step, while $S 2$ allows behaviors in which other steps come between the $R$ and $L$ steps. To choose the appropriate grain of atomicity, we must decide whether those additional behaviors allowed by $S 2$ are important.

The additional behaviors allowed by $S 2$ are not important if the actual system executions they describe are also described by behaviors allowed by $S 1$. So, we can ask whether each behavior $\tau$ satisfying $S 2$ has a corresponding behavior $\widetilde{\tau}$ satisfying $S 1$ that is, in some sense, equivalent to $\tau$. One way to construct $\widetilde{\tau}$ from $\tau$ is to transform a sequence of steps

$$
\begin{equation*}
s \xrightarrow{R} u_{1} \xrightarrow{A_{1}} u_{2} \xrightarrow{A_{2}} u_{3} \ldots u_{n} \xrightarrow{A_{n}} u_{n+1} \xrightarrow{L} t \tag{28}
\end{equation*}
$$

into the sequence

$$
\begin{equation*}
s \xrightarrow{A_{1}} v_{1} \ldots v_{k-2} \xrightarrow{A_{k}} v_{k-1} \xrightarrow{R} v_{k} \xrightarrow{L} v_{k+1} \xrightarrow{A_{k+1}} v_{k+2} \ldots v_{n+1} \xrightarrow{A_{n}} t \tag{29}
\end{equation*}
$$

where the $A_{i}$ are other system actions that can be executed between the $R$ and $L$ steps. Both sequences start in state $s$ and end in state $t$, but the intermediate states may be different.

[^19]When is such a transformation possible? An answer can be given in terms of commutativity relations. We say that actions $A$ and $B$ commute if performing them in either order produces the same result. Formally, $A$ and $B$ commute iff $A \cdot B$ is equivalent to $B \cdot A$. A simple sufficient condition for commutativity is that two actions commute if they change the values of different variables and neither enables or disables the other. It's not hard to see that we can transform (28) to (29) in the following two cases:

- $R$ commutes with each $A_{i}$. (In this case, $k=n$.)
- $L$ commutes with each $A_{i}$. (In this case, $k=0$.)

In general, if an operation consists of a sequence of $m$ subactions, we must decide whether to choose the finer-grained representation $O_{1} \vee O_{2} \vee \ldots \vee O_{m}$ or the coarser-grained one $O_{1}$. $O_{2} \cdots O_{m}$. The generalization of the transformation from (28) to (29) is one that transforms an arbitrary behavior satisfying the finer-grained specification into one in which the sequence of $O_{1}, O_{2}, \ldots, O_{m}$ steps come one right after the other. Such a transformation is possible if all but one of the actions $O_{i}$ commute with every other system action. Commutativity can be replaced by weaker conditions $[2,3,11]$. However, it is the most common case.

By commuting actions and replacing a sequence $s \xrightarrow{O_{1}} \cdots \xrightarrow{O_{m}} t$ of steps by a single $O_{1} \cdots O_{m}$ step, you may be able to transform any behavior of a finer-grained specification into a corresponding behavior of a coarser-grained one. But that doesn't mean that the coarser-grained specification is just as good as the finer-grained one. The sequences (28) and (29) are not the same, and a sequence of $O_{i}$ steps is not the same as a single $O_{1} \cdots O_{m}$ step. Whether you can consider the transformed behavior to be equivalent to the original one, and use the coarser-grained specification, depends on the particular system you are specifying and on the purpose of the specification. Understanding the relation between finer- and coarser-grained specifications can help you choose between them; it won't make the choice for you.

## The Data Structures

Another aspect of a specification's level of abstraction is the accuracy with which it describes the system's data structures. For example, should the specification of a programming interface describe the actual layout of a procedure's arguments in memory, or should the arguments be represented more abstractly?

To answer such a question, you must remember that the purpose of the specification is to help catch errors. A precise description of the layout of procedure arguments will help prevent errors caused by misunderstandings about that layout, but at the cost of complicating the programming interface's specification. The cost is justified only if such errors are likely to be a real problem and the TLA ${ }^{+}$specification provides the best way to avoid them.

If the purpose of the specification is to catch errors caused by the asynchronous interaction of concurrently executing components, then detailed descriptions of data structures will be a needless complication. So, you will probably want to use high-level, abstract descriptions of the system's data structures in the specification. For example, to specify a program interface, you might introduce constant parameters to represent the actions of calling and returning from a procedure - parameters analogous to Send and Reply of the memory interface described in Section 6.1.

## Writing the Specification

Once you've chosen the part of the system to specify and the level of abstraction, you're ready to start writing the TLA ${ }^{+}$specification. We've already seen how this is done; let's review the steps.

First, pick the variables and define the type invariant and initial predicate. In the course of doing this, you will determine the constant parameters and assumptions about them that you need. You may also have to define some additional constants.

Next, write the next-state action, which forms the bulk of the specification. Sketching a few sample behaviors may help you get started. You must first decide how to decompose the nextstate action as the disjunction of actions describing the different kinds of system operations. You then define those actions. The goal is to make the action definitions as compact and easy to read as possible. This requires carefully structuring them. One way to reduce the size of a specification is to define state predicates and state functions that are used in several different action definitions. When writing the action definitions, you will determine which of the standard modules you will need to use and add the appropriate EXTENDS statement. You may also have to define some constant operators for the data structures that you are using.

You must now write the temporal part of the specification. If you want to specify liveness properties, you have to choose the fairness conditions. You then combine the initial predicate, next-state action, and any fairness conditions you've chosen into the definition of a single temporal formula that is the specification.

Finally, you can assert theorems about the specification. If nothing else, you may want to add a type-correctness theorem.

## Some Further Hints

Here are a few miscellaneous suggestions that may help you write better specifications.
Don't be too clever. Cleverness can make a specification hard to read-and even wrong. The formula $q=\left\langle h^{\prime}\right\rangle \circ q^{\prime}$ may look like a nice, short way of writing:

$$
\begin{equation*}
\left(h^{\prime}=\operatorname{Head}(q)\right) \wedge\left(q^{\prime}=\operatorname{Tail}(q)\right) \tag{30}
\end{equation*}
$$

But not only is $q=\left\langle h^{\prime}\right\rangle \circ q^{\prime}$ harder to understand than (30), it's also wrong. We don't know what $a \circ b$ equals if $a$ and $b$ are not both sequences, so we don't know whether $h^{\prime}=\operatorname{Head}(q)$ and $q^{\prime}=\operatorname{Tail}(q)$ are the only values of $h^{\prime}$ and $q^{\prime}$ that satisfy $q=\left\langle h^{\prime}\right\rangle \circ q^{\prime}$. There could be other values of $h^{\prime}$ and $q^{\prime}$, which are not sequences, that satisfy the formula $q=\left\langle h^{\prime}\right\rangle \circ q^{\prime}$.

Don't assume that values are unequal just because they look different. The rules of TLA+ do not imply that $1 \neq$ "a". If the system can send a message that is either a string or a number, represent the message as a record with a type and value field-for example,

$$
[\text { type } \mapsto \text { "String", value } \mapsto " \mathrm{a} "] \quad \text { or } \quad[\text { type } \mapsto " N a t ", \text { value } \mapsto 1]
$$

Write comments as comments; don't put them into the specification itself. I have seen people write things like the following action definition:

$$
\begin{aligned}
A \triangleq \vee & \wedge x \geq 0 \\
& \wedge \cdots \\
\vee & \wedge x<0 \\
& \wedge \text { FALSE }
\end{aligned}
$$

The second disjunct is meant to indicate that the writer intended $A$ not to be enabled when $x<0$. But that disjunct is completely redundant and serves only as a form of comment. It's better to write:

$$
\begin{aligned}
A \triangleq & \wedge x \geq 0 \quad A \text { is not enabled if } x<0 \\
& \wedge \ldots
\end{aligned}
$$

## When and How to Specify

Specifications are often written later than they should be. Engineers are usually under severe time constraints, and they may feel that writing a specification will slow them down. Only after a design has become so complex that they need help understanding it do engineers think about writing a precise specification.

Writing a specification helps you think clearly. Thinking clearly is hard; we can use all the help we can get. Making specification part of the design process can improve the design.

I have described how to write a specification assuming that the system design already exists. But it's better to write the specification as the system is being designed. The specification will start out being incomplete and probably incorrect. For example, an initial specification of the write-through cache of Section 6.6 might include the definition:

```
RdMiss(p) \triangleq Enqueue a request to write value from memory to p's cache.
    Some enabling condition must be conjoined here.
    ^memQ' = Append (memQ,buf[p]) Append request to memQ.
    \wedgecl' = [ctl EXCEPT ! [p] = "?"] Set ctl[p] to value to be determined later.
    ^ UNCHANGED <memInt, mem, buf, cache\rangle
```

Some system functionality will at first be omitted; it can be included later by adding new disjuncts to the next-state action. Tools can be applied to these preliminary specifications to help find design errors.

## References

[1] Martín Abadi and Leslie Lamport. Conjoining specifications. ACM Transactions on Programming Languages and Systems, 17(3):507-534, May 1995.
[2] Ernie Cohen and Leslie Lamport. Reduction in tla. In David Sangiorgi and Robert de Simone, editors, CONCUR'98 Concurrency Theory, volume 1466 of Lecture Notes in Computer Science, pages 317-331. Springer-Verlag, 1998.
[3] Thomas W. Doeppner, Jr. Parallel program correctness through refinement. In Fourth Annual ACM Symposium on Principles of Programming Languages, pages 155-169. ACM, January 1977.
[4] M.P. Herlihy and J.M. Wing. Axioms for concurrent objects. In Proceedings of the Fourteenth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 13-26, Munich, January 1987. ACM.
[5] Donald E. Knuth. The $T_{E}$ Xbook. Addison-Wesley, Reading, Massachusetts, 1994.
[6] Peter Ladkin, Leslie Lamport, Bryan Olivier, and Denis Roegel. Lazy caching in TLA. To appear in Distributed Computing.
[7] Leslie Lamport. TLA-temporal logic of actions. At URL http://www.research. digital.com/SRC/tla/ on the World Wide Web. It can also be found by searching the Web for the 21-letter string formed by concatenating uid and lamporttlahomepage.
[8] Leslie Lamport. The temporal logic of actions. ACM Transactions on Programming Languages and Systems, 16(3):872-923, May 1994.
[9] Leslie Lamport. How to make a correct multiprocess program execute correctly on a multiprocessor. IEEE Transactions on Computers, 46(7):779-782, July 1997.
[10] A. C. Leisenring. Mathematical Logic and Hilbert's $\varepsilon$-Symbol. Gordon and Breach, New York, 1969.
[11] Richard J. Lipton. Reduction: A method of proving properties of parallel programs. Communications of the ACM, 18(12):717-721, December 1975.
[12] Amir Pnueli. The temporal logic of programs. In Proceedings of the 18th Annual Symposium on the Foundations of Computer Science, pages 46-57. IEEE, November 1977.

## Logic

$$
\begin{aligned}
& \wedge \vee \neg \Rightarrow \\
& \text { TRUE } \quad \text { FALSE } \quad \text { BOOLEAN [the set }\{\text { TRUE, FALSE }\}] \\
& \forall x: p \quad \exists x: p \quad \forall x \in S: p \quad{ }^{(1)} \quad \exists x \in S: p \quad{ }^{(1)} \\
& \text { CHOOSE } x: p \quad[\text { An } x \text { satisfying } p] \quad \text { CHOOSE } x \in S: p \quad[\text { An } x \in S \text { satisfying } p]
\end{aligned}
$$

## Sets

| $\neq \in \notin$ | $\cap \subseteq \ \text { [set difference }]$ |
| :---: | :---: |
| cis S:p\} | [Set of elements $x$ in $S$ satisfying $p$ ] |
| $\{e: x \in S\}^{(1)}$ | [Set of elements $e$ such that $x$ in $S$ ] |
| SUBSET $S$ | [Set of subsets of $S$ ] |
| Union $S$ | [Union of all elements of $S$ ] |

## Functions ${ }^{(3)}$

| $f[e]$ | [Function application] |
| :--- | :--- |
| DOMAIN $f$ | [Domain of function $f]$ |
| $[x \in S \mapsto e]{ }^{(1)}$ | [Function $f$ such that $f[x]=e$ for $x \in S]$ |
| $[S \rightarrow T]$ | [Set of functions $f$ with $f[x] \in T$ for $x \in S]$ |
| $\left[f \text { EXCEPT }!\left[e_{1}\right]=e_{2}\right]^{(4)}$ | $\left[\right.$ [Function $\hat{f}$ equal to $f$ except $\left.\widehat{f}\left[e_{1}\right]=e_{2}\right]$ |

## Records

e.h [The $h$-component of record $e$ ]
[ $h_{1} \mapsto e_{1}, \ldots, h_{n} \mapsto e_{n}$ ] [The record whose $h_{i}$ component is $e_{i}$ ]
[ $h_{1}: S_{1}, \ldots, h_{n}: S_{n}$ ] [Set of all records with $h_{i}$ component in $S_{i}$ ]
$[r \text { EXCEPT ! } . h=e]^{(4)} \quad[$ Record $\widehat{r}$ equal to $r$ except $\widehat{r} . h=e]$

## Tuples

| $e[i]$ | [The $i^{\text {th }}$ component of tuple $e$ ] |
| :--- | :--- |
| $\left\langle e_{1}, \ldots, e_{n}\right\rangle$ | [The $n$-tuple whose $i^{\text {th }}$ component is $e_{i}$ ] |
| $S_{1} \times \ldots \times S_{n}$ | [The set of all $n$-tuples with $i^{\text {th }}$ component in $S_{i}$ ] |

## Strings and Numbers

| "c $c_{1} \ldots \mathrm{c}_{n} "$ | [A literal string of $n$ characters] |
| :--- | :--- |
| STRING | [The set of all strings] |
| $d_{1} \ldots d_{n}$ | $d_{1} \ldots d_{n} \cdot d_{n+1} \ldots d_{m}$ | [Numbers]

## Miscellaneous Constructs

IF $p$ THEN $e_{1}$ ELSE $e_{2} \quad$ [ $e_{1}$ if $p$ true, else $e_{2}$ ]
CASE $p_{1} \rightarrow e_{1} \square \ldots \square p_{n} \rightarrow e_{n} \quad$ [Some $e_{i}$ such that $p_{i}$ true]
CASE $p_{1} \rightarrow e_{1} \square \ldots \square p_{n} \rightarrow e_{n} \square$ OTHER $\rightarrow e \quad\left[\right.$ Some $e_{i}$ such that $p_{i}$ true, or $e$ if all $p_{i}$ are false]
LET $d_{1} \triangleq e_{1} \ldots d_{n} \triangleq e_{n}$ IN $e \quad$ [ $e$ in the context of the definitions]

(1) $x \in S$ may be replaced by a comma-separated list of items $v \in S$, where $v$ is either a commaseparated list or tuple of identifiers.
(2) $x$ may be an identifier or tuple of identifiers.
(3) We describe only functions of a single argument; $\mathrm{TLA}^{+}$also allows functions with multiple arguments.
(4)! $\left[e_{1}\right]$ or !. $h$ may be replaced by a comma separated list of items ! $a_{1} \cdots a_{n}$, where each $a_{i}$ is $\left[e_{i}\right]$ or . $h_{i}$.

Figure 12: The constant operators.

Predicate and Action Operators
$e^{\prime}$
$[A]_{e}$
$\langle A\rangle_{e}$
Enabled $A$
UNCHANGED $e$
$A \cdot B$
[The value of $e$ in the final state of a step]
$\left[A \vee\left(e^{\prime}=e\right)\right]$
$\left[A \wedge\left(e^{\prime} \neq e\right)\right]$
[An $A$ step is possible]
[ $e^{\prime}=e$ ]
[Composition of actions]

## Temporal Operators

| $\square F$ | $[F$ is always true $]$ |
| :--- | :--- |
| $\diamond F$ | $[F$ is eventually true $]$ |
| $\mathrm{WF}_{e}(A)$ | [Weak fairness for action $A]$ |
| $\mathrm{SF}_{e}(A)$ | $[$ Strong fairness for action $A]$ |
| $F \leadsto G$ | $[F$ leads to $G]$ |
| $F \pm G$ | $[F$ guarantees $G$ (see [1])] |
| $\exists \rightarrow F$ | [Temporal existential quantification (hiding).] |
| $\forall x: F$ | [Temporal universal quantification.] |

Figure 13: The nonconstant operators of TLA ${ }^{+}$.

## Infix Operators

| + ${ }^{(1)}$ | - ${ }^{(1)}$ | * (1) | / ${ }^{(2)}$ | - ${ }^{(3)}$ | ++ |
| :---: | :---: | :---: | :---: | :---: | :---: |
|  | \% ${ }^{(1)}$ | - $(1,4)$ | .. ${ }^{(1)}$ | ... | - |
| $\oplus{ }^{(5)}$ | $\theta{ }^{(5)}$ | * | $\bigcirc$ | $\odot$ | ** |
| $<{ }^{(1)}$ | $>{ }^{(1)}$ | $\leq{ }^{(1)}$ | $\geq{ }^{(1)}$ | $\square$ | // |
| $\prec$ | $\succ$ | $\preceq$ | $\succeq$ | $\sqcup$ |  |
| $\ll$ | $\gg$ | $<$ : | : > | \& |  |
| $\sqsubset$ | $\sqsupset$ | $\sqsubseteq{ }^{(5)}$ | $\sqsupseteq$ | \| | \|| |
| $\subset$ | $\bigcirc$ |  | $\bigcirc$ | * | \%\% |
| $\vdash$ | $\dashv$ | $1=$ | = | $\bullet$ | \#\# |
| $\sim$ | $\simeq$ | $\approx$ | $\cong$ | \$ | \$ |
| : | ::= | $\asymp$ | $\pm$ | . | ?? |
| $\propto$ | 2 | $\uplus$ | $\bigcirc$ | ! | @@ |

## Postfix Operators ${ }^{(6)}$ <br> - + * $\#$

## Prefix Operator

(1) Defined by the Naturals, Integers, and Reals modules.
(2) Defined by the Reals module.
(3) Defined by the Sequences module.
(4) $x^{\wedge} y$ is printed as $x^{y}$.
(5) Defined by the Bags module.
(6) $e^{\wedge}+$ is printed as $e^{+}$, and similarly for ${ }^{\wedge} *$ and ${ }^{\wedge} \#$.

Figure 14: User-definable operator symbols.

| $\wedge$ | 八 or \land | V | \／or \lor | $\Rightarrow$ | ＝＞ |
| :---: | :---: | :---: | :---: | :---: | :---: |
| $\neg$ | $\sim$ or \lnot or \neg | 三 | $\Leftrightarrow$ or \equiv | $\triangleq$ | ＝＝ |
| $\epsilon$ | \in | $\neq$ | \＃or／＝ | ， | ， |
| ＜ | ＜＜ | ＞ | ＞＞ | $\square$ | ［］ |
| $<$ | ＜ | ＞ | ＞ | $\diamond$ | ＜＞ |
| $\leq$ | \leq or $=<$ | $\geq$ | \geq or＞＝ | $\sim$ | ～＞ |
| $\ll$ | \ll | $>$ | $\backslash \mathrm{gg}$ | $\stackrel{ \pm}{\square}$ | －＋－＞ |
| $\prec$ | \prec | $\succ$ | \succ | $\mapsto$ | ｜－＞ |
| $\preceq$ | $\backslash$ preceq | $\succeq$ | $\backslash$ succeq | $\div$ | $\backslash$ div |
| $\subseteq$ | \subseteq | $\supseteq$ | \supseteq | 2 | \wr |
| $\subset$ | $\backslash$ \ubset | $\bigcirc$ | \supset | $\bullet$ | \bullet |
| $\sqsubset$ | \sqsubset | $\sqsupset$ | \sqsupset | $\star$ | \star |
| $\square$ | $\backslash$ \qsubseteq | $\sqsupseteq$ | $\backslash$ \qsupseteq | $\sim$ | $\backslash$ sim |
| $\vdash$ | 1－ | $\dashv$ | －1 | $\simeq$ | $\backslash$ simeq |
| $\stackrel{ }{ }$ | I $=$ | $=$ | ＝ 1 | $\asymp$ | \asymp |
| $\rightarrow$ | －＞ | $\leftarrow$ | ＜－ | $\approx$ | \approx |
| $\cap$ | \cap or \intersect | $\cup$ | \cup or \union | $\cong$ | \cong |
| $\sqcap$ | \sqcap | $\sqcup$ | \sqcup | $\stackrel{ }{=}$ | $\backslash$ doteq |
| $\oplus$ | （＋）or \oplus | $\uplus$ | \uplus | $\propto$ | $\backslash$ propto |
| $\ominus$ | （－）or \ominus | $\times$ | $\backslash \mathrm{X}$ or \times | $x^{y}$ | $\mathrm{x}^{\sim} \mathrm{y}^{(2)}$ |
| $\odot$ | （．）or \odot | ． | \cdot | $x^{+}$ | $\mathrm{x}^{\wedge}+{ }^{(2)}$ |
| $\otimes$ | \otimes | $\bigcirc$ | \o or \circ | $x^{*}$ | $\mathrm{x}^{\wedge}{ }^{(2)}$ |
| $\oslash$ | （／）or \oslash | ＂s＂ | ＂s＂（1） |  | $\mathrm{x}^{\wedge} \#^{(2)}$ |
| 「 | －－－－－－ | 3） |  |  | －－－－${ }^{(3)}$ |
|  | －－－－－－－－ |  | $\square$ | ＝$=$ | $===={ }^{(3)}$ |

（1）$s$ is any sequence of characters，not including＂．
（2）$x$ and $y$ are any expressions．
（3）a sequence of four or more - or $=$ characters．
Figure 15：The ASCII representations of typeset symbols．


[^0]:    ${ }^{2}$ Some readers may need reminding that numbers are not strings of bits, and $2^{33} * 2^{33}$ equals $2^{66}$, not overflow error.

[^1]:    ${ }^{3}$ iff stands for if and only if.
    ${ }^{4}$ Like most mathematicians, I use or to mean and/or.
    ${ }^{5}$ Section 7.2 touches on a more subtle reason to write $\equiv$ instead of $=$.

[^2]:    ${ }^{6}$ Strictly speaking, $\in$ isn't an operator of predicate logic, so this isn't really a predicate logic tautology.

[^3]:    ${ }^{7}$ I like to use a singular noun like Message rather than a plural like Messages for the name of a set. That way, the $\in$ in the expression $m \in$ Message can be read is $a$. This is the same convention that most programmers use for naming types.

[^4]:    ${ }^{8} \mathrm{TLA}^{+}$is based on the mathematical formalism known as Zermelo-Fränkel set theory, also called ZF.

[^5]:    ${ }^{9}$ Even if TLA ${ }^{+}$allowed us to declare an action parameter, we would have no way to specify that a Send $(p, d)$ action constrains only memInt and not other variables.
    ${ }^{10}$ We expect $\operatorname{Send}(p, d$, miOld, miNew $)$ to have this meaning only when $p$ is a processor and $d$ a value that $p$ is allowed to send, but we simplify the specification a bit by requiring it to be a Boolean for all values of $p$ and $d$.

[^6]:    ${ }^{11}$ Computer scientists write $\lambda x$ : S.e to mean something very much like $[x \in S \mapsto e]$. Such $\lambda$ expressions aren't quite the same as the functions of ordinary mathematics, so TLA ${ }^{+}$doesn't use that notation for writing functions.

[^7]:    ${ }^{12}$ This main memory does not directly correspond to the memory represented by variable mem in module InternalMemory.

[^8]:    ${ }^{13}$ An invariant of a specification $S$ that is also an invariant of its next-state action is sometimes called an inductive invariant of $S$.

[^9]:    ${ }^{14}$ Mathematicians usually write $\bigcup S$ instead of UnION $S$. They often call SUBSET $S$ the powerset of $S$ and write it $\mathcal{P}(S)$ or $2^{S}$.

[^10]:    ${ }^{15}$ Equality $(=)$ is not an operator of propositional or predicate logic; this tautology need not be valid for nonBoolean values if $\equiv$ is replaced by $=$.

[^11]:    ${ }^{16}$ If you don't know what an irreflexive partial order is, read the following definition of IsPartialOrder to find out.

[^12]:    ${ }^{17}$ The Choose operator is known to logicians as Hilbert's $\varepsilon[10]$.

[^13]:    ${ }^{18}$ This is a completely new sense of the word invariant; it has nothing to do with the concept of invariance discussed already.

[^14]:    ${ }^{19}$ It is because we think of $\sigma_{n}$ as the state at time $n$, and because we usually measure time starting from 0 , that I start numbering the states of a behavior with 0 rather than 1 .

[^15]:    ${ }^{20} \square(F \Rightarrow \diamond G)$ equals $F \leadsto G$, so we could write this formula more compactly with $\sim$. However, it is more convenient to keep it in the form $\square(F \Rightarrow \diamond G)$

[^16]:    ${ }^{21}$ Although we haven't yet discussed quantification in temporal formulas, the meaning of the formula $\forall p \in$ Proc : ... should be clear.

[^17]:    ${ }^{22}$ Logicians use the term flexible variable for a TLA variable, and the term rigid variable for a symbol like $r$ that represents a constant.
    ${ }^{23}$ We can let $\exists r \in S: F$ equal $\exists r:(r \in S) \wedge F$, which makes sense if $S$ is a state function, not just a constant. However, TLA ${ }^{+}$requires $S$ to be a constant in $\exists r \in S: F$. If you want it to be a state function, you have to write $\exists r:(r \in S) \wedge F$.
    ${ }^{24}$ The use of 0 is arbitrary; any fixed value would do.

[^18]:    ${ }^{25}$ More precisely, this is the case for a finite or countably infinite conjunction of properties.

[^19]:    ${ }^{26}$ We actually describe the operation with an ordinary action, like the ones we've been writing, that is equivalent to $R \cdot L$. The operator "." rarely appears in an actual specification. If you're ever tempted to use it, look for a better way to write the specification; you can probably find one.

