Intermezzo 2 Records

Leslie Lamport

Last modified on 28 November 2021

It's time for another pause from the study of algorithms to learn another important TLA+ data type: the record. When mathematicians want to bundle several values together as a single value, they use a tuple. For example, suppose they want to talk about the kind of graph used in Session 10, consisting of a set of nodes, a root node, and a set of edges. They would define a graph G to be a triple. For example, the graph G with set {r,n1,n2} of nodes, root r, and set {{r,n1},{n1,n2}} of edges might be written:

   <<{r,n1,n2}, r, {{r,n1},{n1,n2}}>>

Programmers realized that this leads to errors, because it's easy to forget if the root is G[1] or G[2]. Programming languages introduced records, which are like tuples except the components are named rather than numbered. The components of the graph G could be named G.nodes, G.root, and G.edges. These components were called fields, and the graph G had the three fields named nodes, root, and edges.

Programming languages later added a lot of additional stuff to records and called them objects. TLA+ and PlusCal keep the original concept of a record as simply a value, just like a tuple or a number. It introduces a simple way to write records, where the graph G can be written:

   [nodes |-> {r,n1,n2}, root |-> r, edges |-> {{r,n1},{n1,n2}}]

TLA+ and PlusCal also define this record to be a function whose domain is the set consisting of the three strings {"nodes", "root", "edges"} and defines G.nodes to be an abbreviation of G["nodes"]. This means that the assignment statements G.nodes := ... and G["nodes"] := ... are equivalent. It also means that there is no ordering of the fields, so

[a |->
1, b |-> 2]

is the same record as [b |-> 2, a |-> 1].

Field names cannot be arbitrary strings. They should be legal identifiers. A declared or defined identifier can also be used as a field name.

There is also a way to write the set of all records whose fields are elements of certain sets. For example, the set of all records with integer-valued x and y fields and a zz field that is an element of a set S is written:

   [x : Int, y : Int, zz : S]

Exercise 1 Write this set of records using the notation for describing individual records and the set constructs you learned in Session 4. Show the answer.

Answer {[x |-> i, y |-> j, zz |-> s] : i, j \in Int, s \in S}

Such a set of records is often used in writing a type-correctness invariant for a variable whose value is a record or contains records.

In the algorithm of Session 10, we could have let the message queue from node n to node m be the record [from |-> n, to |-> m] instead of the pair <<n, m>>. Writing msgs[q].from might have made the algorithm a little easier to understand, but msgs[q][1] is shorter. Which to use is a matter of taste.

A common use of records is to bundle several pieces of information into a message. If different messages can contain different kinds of information, the messages can all contain a field, often called type, whose value indicates what kind of information the message contains. The action that processes the message tests that field to determine what to do.

An algorithm can also determine what kind of information is in a record by checking what fields the record contains. For any function f, the expression DOMAIN f equals the domain of f. Thus, DOMAIN [a |-> 1, b |-> 2] equals {"a","b"}.

Exercise 2 Define an operator IncrCnt such that if r is any record containing an integer-valued count field, then IncrCnt(r) is the record obtained from r by incrementing its count field by 1. (Thus the statement x.count := x.count + 1 is equivalent to x := IncrCnt(x) if x is a record with a count field.) Show the answer.

Answer

   IncrCnt(r) == [d \in DOMAIN r |-> 
                    IF d = "count" THEN r.count + 1 ELSE r[d]]