For basic TLA+ notation, refer to the beginning of the Layer 1 slides (slides_layer1_en.html).
New notation introduced in these slides:
| Notation | Meaning |
|---|---|
MakePacket(node, payload, sub, miss) | Record construction function. Returns [payload |-> ..., sub |-> ..., missing |-> ..., node |-> ...]. |
PriorityWrapper(packet, ser) | PacketWrapper record. hasPriority=TRUE. |
BundledRefWrapper(parentNode, ser) | Wrapper in bundled state. hasPriority=FALSE, bundledBy=parentNode. |
[n \in AllNodes |-> IF ... THEN ... ELSE ...] | Conditional function definition. Sets different initial values per node. |
r.field | Record field access. Examples: w.hasPriority, w.packet.sub[c] |
% MaxSerial | Modular arithmetic. The serial wraps around at MaxSerial, making the state space structurally finite. This model does not use StateConstraint. |
ModGT(a, b) | Modular comparison. LET diff == (a - b + MaxSerial) % MaxSerial IN diff > 0 /\ diff < MaxSerial \div 2. Corresponds to signed difference comparison in C++. |
Characteristics of this model: The variable local has a nested record structure (local[t].wrapper.packet.sub[c], etc.).
This corresponds to nested structs in C++. In TLA+, nested fields are updated with [local EXCEPT ![t].wrapper = w].
The most complex model. Integrated verification of 4-phase bundle, multi-level unbundle,
and optimistic commit on 2-level and 3-level trees. The single-node CAS commit from the old "Layer 1
(stm_commit)" is now subsumed into this layer's CommitTryCAS direct path
(2-layer architecture).
Model supports 3 fidelity tiers (coarse / fine / superfine — see next slide).
(* Packet: payload + child packets + missing flag *)
MakePacket(node, payload, sub, miss) ==
[payload |-> payload, sub |-> sub, missing |-> miss, node |-> node]
(* PacketWrapper: priority (independent) or bundled (absorbed by parent) *)
PriorityWrapper(packet, ser) ==
[packet |-> packet, hasPriority |-> TRUE, bundledBy |-> Null, serial |-> ser]
BundledRefWrapper(parentNode, ser) ==
[packet |-> Null, hasPriority |-> FALSE, bundledBy |-> parentNode, serial |-> ser]
VARIABLES
serial, \* per-thread serial counter
globalSerial, \* latest global serial
linkage, \* linkage[n]: PacketWrapper for each node
pc, op, target, local \* per-thread state
This layer's TLA+ spec can verify the same invariants at different granularities.
Granularity is selected via constant flags in the .cfg file. Finer modes increase state
count but also raise fidelity to the C++ implementation.
| Mode | Characteristics | Use case |
|---|---|---|
| coarse | Phases collapsed into single atomic steps (fastest). May miss some interleavings, but high-level invariants are still verifiable. | Smoke tests on large models (3-level, 3-thread, etc.) |
| fine | Per-child CAS with separate failure paths. Phase3 rollback is omitted (matching C++). | Standard exhaustive verification |
| superfine | Five C++-faithful optimization paths enabled: #1 pre-bundle serial CAS, #2 inner-bundle phase split, #4 Phase3 DISTURBED check, #5 root-first unbundle ordering, #6 same-child retry | Strict correspondence with C++ |
An example of a bug introduced during TLA+ modeling: BundlePhase3's
allDone check was only testing ~hasPriority, so it incorrectly
counted children already bundled by another thread as "done." Fixed by matching against
BundledRefWrapper(node, ser) for the current thread. The C++ implementation
runs CAS on every child without a "skip if already bundled" optimization, so it never
exhibited this bug — it was a TLA+ model-only issue.
BundlePhase1(t) ==
/\ LET childPkts == [c \in children |->
IF childWs[c].hasPriority /\ ~childWs[c].packet.missing
THEN childWs[c].packet \* child is complete → acquire
ELSE IF childWs[c].bundledBy = node
THEN parentW.packet.sub[c] \* already bundled → use existing
ELSE Null] \* cannot acquire → restart
IN
IF allCollected
THEN pc' → "bundle_phase2"
ELSE pc' → "snap_check" \* restart
BundlePhase2(t) ==
/\ LET newPkt == MakePacket(node, payload, subs, TRUE)
newW == PriorityWrapper(newPkt, ser)
IN
IF linkage[node] = oldW \* CAS: if unchanged
THEN linkage' = [linkage EXCEPT ![node] = newW]
BundlePhase3(t) ==
/\ IF allMatch \* all children match expected values
THEN linkage' = [n \in AllNodes |->
IF n \in children
THEN BundledRefWrapper(node, ser) \* child → back-reference to parent
ELSE linkage[n]]
BundlePhase4(t) ==
/\ LET finalPkt == MakePacket(node, payload, sub, FALSE) \* ← FALSE!
IN
IF linkage[node] = oldW
THEN linkage' = [linkage EXCEPT ![node] = finalW]
\* → consistent snapshot is now possible
To commit to a bundled child node, the ancestor bundle chain must first be disassembled.
Starting from the target child, we follow bundledBy upward to identify the
ancestors, set each ancestor's missing to TRUE, and then restore the child
to a priority wrapper.
UnbundleWalk(t) ==
\* Starting from the target child, walk the bundledBy chain upward
\* and build the list casTargets of ancestor nodes to CAS
\* (child → parent → (at 3-level) grand)
/\ local' = [local EXCEPT ![t].casTargets = chain]
\* superfine #5: at 3-level, build the list root-first
\* (matching the C++ traversal order: CAS from root → leaf)
UnbundleCASAncestors(t) \* 2-level (BundleUnbundle_2level.tla)
UnbundleCASLoop(t) \* 3-level (BundleUnbundle.tla)
\* For each ancestor in casTargets, CAS its wrapper to a new wrapper
\* with missing=TRUE. The child sub-slots sub[c] are KEPT (not nulled),
\* so Phase3 rollback / recovery paths still have the references available.
/\ LET newPkt == MakePacket(anc, anc.payload, anc.sub, TRUE)
newW == PriorityWrapper(newPkt, ser)
IN linkage' = [linkage EXCEPT ![anc] = newW]
\* coarse: all ancestors in one step; fine: one ancestor per step
UnbundleCASChild(t) ==
\* Restore the target child from BundledRefWrapper to PriorityWrapper,
\* carrying the new (committed) payload.
/\ LET newChildW == PriorityWrapper(newpacket, ser)
IN linkage' = [linkage EXCEPT ![node] = newChildW]
Unbundle flow (shared across depths):
(1) UnbundleWalk builds the ancestor list casTargets → (2) CAS each ancestor to missing=TRUE (child slots preserved) → (3) CAS the target child back to Priority with the new payload.
Important: ancestor sub[c] slots are not nulled out. The missing=TRUE flag alone signals "this subtree is temporarily not snapshottable," and keeping sub intact allows Phase3 rollback or later recovery to find the references it needs.
superfine #5 (3-level only): casTargets is built in root-first order
(grand → parent → child direction) and ancestor CAS also runs root-first, matching the C++
traversal. This faithfully reproduces the race cases against concurrent bundle/unbundle.
The model has two commit families:
CommitRead → CommitTryCAS → CommitDone.
Performs the ordinary +1 to a child payload, inserting an unbundle when the target is bundled.CommitParent (2-level) / CommitGrand (3-level).
Takes a snapshot at parent / grand scope and applies +1 to every child payload inside that scope.
This stresses whether payload integrity is preserved through bundle/unbundle chains.CommitParent(t) \* 2-level: parent-scope +1 propagated to every Child payload
CommitGrand(t) \* 3-level: grand-scope +1 propagated to every Grandchild payload
\* Single CAS against the parent / grand wrapper. Updates every payload inside
\* sub[c] to (p + 1) % MaxPayload and commits the whole scope in one action.
\* Even bundled subtrees must see their internal payloads advance by +1.
CommitRead(t) ==
\* Reload linkage[node] immediately before attempting the CAS
/\ local' = [local EXCEPT ![t].wrapper = linkage[target[t]]]
CommitTryCAS(t) ==
IF linkage[node] = oldW
THEN linkage' = [linkage EXCEPT ![node] = newW] \* CAS success → commit_done
ELSE IF linkage[node].hasPriority
THEN
IF linkage[node].packet.payload = oldW.packet.payload
THEN \* ★ Single-node optimization: same payload → re-adopt new child and retry ★
local' = [local EXCEPT ![t].wrapper = linkage[node], ...]
ELSE \* True conflict → fail
ELSE
\* Bundled → transition to unbundle_walk
pc' → "unbundle_walk"
CommitDone(t) ==
\* Commit finalization: decrement iterBudget, pop next child from childQueue
\* If the queue is empty, return to idle; if iterBudget=0 as well, move to terminal
/\ iterBudget' = [iterBudget EXCEPT ![t] = @ - 1]
iterBudget / childQueue / MaxCommits:
each thread runs MaxCommits iterations. Each iteration consists of
"one parent / grand-scope commit (CommitParent / CommitGrand) plus
one direct commit per child (CommitRead / TryCAS / Done)."
childQueue[t] holds the children still pending in the current iteration and
iterBudget[t] the remaining iterations.
TerminalPayloadCheck: in the terminal state where all threads have
iterBudget=0 and are idle, each direct child's payload must equal
(2 × MaxCommits × |Threads|) % MaxPayload.
This guarantees that both the deep (+1) and direct (+1) commit streams were applied correctly.
Single-node optimization: On CAS failure, if linkage[node].hasPriority is still
true and the payload matches our expected value, treat it as "only the child slot was swapped by
another thread," adopt the new child packet, and retry the CAS. This reduces wasted aborts.
\* 1. If missing=FALSE then all child packets exist
SnapshotConsistency ==
\A n \in InnerNodes :
(w.hasPriority /\ ~w.packet.missing) =>
(\A c \in ChildrenOf(n) : w.packet.sub[c] /= Null)
\* 2. Non-root nodes have priority OR bundledBy≠Null
NoPriorityLoss ==
\A n \in AllNodes \ {Grand} :
w.hasPriority \/ w.bundledBy /= Null
\* 3. bundledBy chains are valid (reach priority within 2 hops)
BundleChainValid ==
\A n \in AllNodes \ {Grand} :
(~w.hasPriority /\ w.bundledBy /= Null) =>
pw.hasPriority \/ pw.bundledBy /= Null
\* 4. bundledBy always points to the structural parent
BundledByCorrect ==
\A n \in AllNodes \ {Grand} :
~w.hasPriority => w.bundledBy = ParentOf(n)
\* 5. Root always has priority
GrandAlwaysPriority ==
linkage[Grand].hasPriority
| Model | Fidelity | Threads | MaxCommits | States | Time | Result |
|---|---|---|---|---|---|---|
| 2-level | coarse | 2 | 1 | 906K | 12s | ✅ Pass (complete) |
| 2-level | coarse | 2 | 5 | ~3.2M | — | ✅ Pass (complete) |
| 3-level | coarse | 2 | 1 | 110K | 3s | ✅ Pass (complete) |
| 3-level | coarse | 2 | 5 | ~8.5M | — | ✅ Pass (complete) |
| 3-level | all-fine | 2 | 1 | 1.2M | 22s | ✅ Pass (complete) |
| 2/3-level | fine/superfine (deep) | 2–3 | 5 | — | — | pending ohtaka |
| Serial finitized via modular arithmetic; no StateConstraint used. Production configs: MaxSerial=512 (2-level) / 1024 (3-level). | ||||||
What is covered: Across all 2-thread interleavings of snapshot/commit/bundle/unbundle, all 5 invariants are preserved. Coarse collapses phases into single steps (fastest); fine separates per-child CAS; superfine models the 5 C++ optimization paths faithfully.
Bug found in TLA+: In the fine granularity, BundlePhase3's
allDone test was only checking ~hasPriority, so it incorrectly
counted children already bundled by another thread as "done." Fixed by matching against
BundledRefWrapper(node, ser) for the current thread. (The C++ implementation
runs CAS on every child without a "skip if already bundled" optimization, so it never
exhibited the bug.)
| Layer | What It Abstracts | What It Verifies | States |
|---|---|---|---|
| Layer 1 (atomic_shared_ptr) | — (hardware directly) | Reference counting, ABA, memory safety | 8.7M (2thr complete) |
| Layer 2 (Bundle/Unbundle + Commit) | atomic_shared_ptr as correct CAS register | Bundle/Unbundle + optimistic commit, subtree consistency | 906K–8.5M (coarse/fine complete; superfine pending) |
Why only 2 layers: The former "Layer 1 (stm_commit)" single-node CAS commit
integrates naturally into Layer 2's CommitTryCAS direct path (single-node CAS commit),
so a separate layer was redundant. Verifying bundle/unbundle together with optimistic commit
in the same spec improves coverage of interference patterns.
Layer 1 is fully explored; Layer 2 coarse/all-fine are complete; superfine and deep fine
runs are pending on ohtaka.
| TLA+ Variable | C++ (transaction_impl.h) | Generated C11 (test_bundle_3level.c) |
|---|---|---|
linkage[n] | atomic_shared_ptr<PacketWrapper> n->m_link | _Atomic(uint64_t) linkage[4]64-bit packed (Wrapper struct) |
linkage[n].hasPriority | PacketWrapper::hasPriority() | Wrapper.has_priority (bit field) |
linkage[n].bundledBy | PacketWrapper::bundledBy() | Wrapper.bundled_by |
linkage[n].packet.payload | Packet::payload() | Wrapper.payload |
linkage[n].packet.sub[c] | Packet::subpackets()[i] | Wrapper.sub_child1/2 |
serial[t] | SerialGenerator (Lamport clock) | uint8_t thread_ser (local) |
ModGT(a,b) | Unsigned subtraction → signed reinterpretation | mod_gt(a,b): (a-b+MAX)%MAX < MAX/2 |
| TLA+ Action | C++ Code | Generated C11 Function |
|---|---|---|
SnapRead + SnapCheck(3-level uses only SnapCheck) | snapshot()transaction_impl.h:842-870 | snapshot_grand()load_wrapper() → priority/missing decision |
BundlePhase1–4 | bundle()transaction_impl.h:1077-1171 | Within snapshot_grand()collect → CAS parent → CAS children → finalize |
InnerPhase2/3/4(3-level only) | Inner-bundle portion of bundle()transaction_impl.h:1100-1171 | Within snapshot_grand()Inner bundle on Parent when Grand bundles (explicit in superfine; collapsed into BundlePhase1 in fine) |
CommitParent (2-level)CommitGrand (3-level) | Parent / grand-scope commit()transaction_impl.h:1245-1270 | commit_parent() / commit_grand()Deep commit: +1 to every child payload in the scope |
CommitRead | Reload just before CAS in commit()transaction_impl.h:1245-1260 | Within commit_child()load_wrapper immediately before the CAS attempt |
CommitTryCAS | commit()transaction_impl.h:1245-1270 | commit_child()Direct CAS + single-node optimization |
CommitDone | Finalization tail of commit()transaction_impl.h:1265-1275 | Tail of commit_child()Decrement iterBudget, advance to next child or idle |
UnbundleWalk | snapshotSupernode()transaction_impl.h:696-755 | Within commit_child()Walk bundledBy chain to build casTargets (superfine #5 is root-first) |
UnbundleCASAncestors (2-level)UnbundleCASLoop (3-level) | transaction_impl.h:1367-1379 | Within commit_child()cas_wrapper(anc, ... missing=TRUE) per ancestor. Child sub-slots kept (not nulled) |
UnbundleCASChild | transaction_impl.h:1383-1389 | Within commit_child()child → priority restoration with new payload |
Layer 2 characteristics: In the generated C11, all node linkage is packed into 64 bits,
and all fields are atomically updated with a single atomic_compare_exchange.
This is faithful to the "1 action = 1 step" principle of TLA+.