This deck explains the Gen 1/2 (modular-serial) base specification.
The current reference is Gen 3 LLfree — the variant that structurally proves
livelock-freedom via a privileged-TID negotiate mechanism. The bundle/unbundle/commit
protocol itself is identical in both generations, so this deck is still useful as a base
explanation, but for current verification results and the canonical model see
slides_layer2_LLfree.html.
Related decks: LLfree (Gen 3) / 3-level LLfree / dynamic LLfree / hard-link topologies (slides_hardlink_en.html — separate).
The most complex model. Integrated verification of 4-phase bundle, multi-level unbundle,
and optimistic commit on 2-level and 3-level trees. The single-node CAS commit from the old "Layer 1
(stm_commit)" is now subsumed into this layer's CommitTryCAS direct path
(2-layer architecture).
Model supports 3 fidelity tiers (coarse / fine / superfine — see next slide).
(* Packet: payload + child packets + missing flag *)
MakePacket(node, payload, sub, miss) ==
[payload |-> payload, sub |-> sub, missing |-> miss, node |-> node]
(* PacketWrapper: priority (independent) or bundled (absorbed by parent) *)
PriorityWrapper(packet, ser) ==
[packet |-> packet, hasPriority |-> TRUE, bundledBy |-> Null, serial |-> ser]
BundledRefWrapper(parentNode, ser) ==
[packet |-> Null, hasPriority |-> FALSE, bundledBy |-> parentNode, serial |-> ser]
VARIABLES
serial, \* per-thread serial counter
globalSerial, \* latest global serial
linkage, \* linkage[n]: PacketWrapper for each node
pc, op, target, local \* per-thread state
This layer's TLA+ spec can verify the same invariants at different granularities.
Granularity is selected via constant flags in the .cfg file. Finer modes increase state
count but also raise fidelity to the C++ implementation.
| Mode | Characteristics | Use case |
|---|---|---|
| coarse | Phases collapsed into single atomic steps (fastest). May miss some interleavings, but high-level invariants are still verifiable. | Smoke tests on large models (3-level, 3-thread, etc.) |
| fine | Per-child CAS with separate failure paths. Phase3 rollback is omitted (matching C++). | Standard exhaustive verification |
| superfine | Five C++-faithful optimization paths enabled: #1 pre-bundle serial CAS, #2 inner-bundle phase split, #4 Phase3 DISTURBED check, #5 root-first unbundle ordering, #6 same-child retry | Strict correspondence with C++ |
An example of a bug introduced during TLA+ modeling: BundlePhase3's
allDone check was only testing ~hasPriority, so it incorrectly
counted children already bundled by another thread as "done." Fixed by matching against
BundledRefWrapper(node, ser) for the current thread. The C++ implementation
runs CAS on every child without a "skip if already bundled" optimization, so it never
exhibited this bug — it was a TLA+ model-only issue.
For basic TLA+ notation, refer to the beginning of the Layer 1 slides (slides_layer1_en.html).
New notation introduced in these slides:
| Notation | Meaning |
|---|---|
MakePacket(node, payload, sub, miss) | Record construction function. Returns [payload |-> ..., sub |-> ..., missing |-> ..., node |-> ...]. |
PriorityWrapper(packet, ser) | PacketWrapper record. hasPriority=TRUE. |
BundledRefWrapper(parentNode, ser) | Wrapper in bundled state. hasPriority=FALSE, bundledBy=parentNode. |
[n \in AllNodes |-> IF ... THEN ... ELSE ...] | Conditional function definition. Sets different initial values per node. |
r.field | Record field access. Examples: w.hasPriority, w.packet.sub[c] |
% MaxSerial | Modular arithmetic. The serial wraps around at MaxSerial, making the state space structurally finite. This model does not use StateConstraint. |
ModGT(a, b) | Modular comparison. LET diff == (a - b + MaxSerial) % MaxSerial IN diff > 0 /\ diff < MaxSerial \div 2. Corresponds to signed difference comparison in C++. |
Characteristics of this model: The variable local has a nested record structure (local[t].wrapper.packet.sub[c], etc.).
This corresponds to nested structs in C++. In TLA+, nested fields are updated with [local EXCEPT ![t].wrapper = w].
BundlePhase1(t) ==
/\ LET childPkts == [c \in children |->
IF childWs[c].hasPriority /\ ~childWs[c].packet.missing
THEN childWs[c].packet \* child is complete → acquire
ELSE IF childWs[c].bundledBy = node
THEN parentW.packet.sub[c] \* already bundled → use existing
ELSE Null] \* cannot acquire → restart
IN
IF allCollected
THEN pc' → "bundle_phase2"
ELSE pc' → "snap_check" \* restart
BundlePhase2(t) ==
/\ LET newPkt == MakePacket(node, payload, subs, TRUE)
newW == PriorityWrapper(newPkt, ser)
IN
IF linkage[node] = oldW \* CAS: if unchanged
THEN linkage' = [linkage EXCEPT ![node] = newW]
BundlePhase3(t) ==
/\ IF allMatch \* all children match expected values
THEN linkage' = [n \in AllNodes |->
IF n \in children
THEN BundledRefWrapper(node, ser) \* child → back-reference to parent
ELSE linkage[n]]
BundlePhase4(t) ==
/\ LET finalPkt == MakePacket(node, payload, sub, FALSE) \* ← FALSE!
IN
IF linkage[node] = oldW
THEN linkage' = [linkage EXCEPT ![node] = finalW]
\* → consistent snapshot is now possible
To commit to a bundled child node, the ancestor bundle chain must first be disassembled.
Starting from the target child, we follow bundledBy upward to identify the
ancestors, set each ancestor's missing to TRUE, and then restore the child
to a priority wrapper.
UnbundleWalk(t) ==
\* Starting from the target child, walk the bundledBy chain upward
\* and build the list casTargets of ancestor nodes to CAS
\* (child → parent → (at 3-level) grand)
/\ local' = [local EXCEPT ![t].casTargets = chain]
\* superfine #5: at 3-level, build the list root-first
\* (matching the C++ traversal order: CAS from root → leaf)
UnbundleCASAncestors(t) \* 2-level (BundleUnbundle_2level.tla)
UnbundleCASLoop(t) \* 3-level (BundleUnbundle.tla)
\* For each ancestor in casTargets, CAS its wrapper to a new wrapper
\* with missing=TRUE. The child sub-slots sub[c] are KEPT (not nulled),
\* so Phase3 rollback / recovery paths still have the references available.
/\ LET newPkt == MakePacket(anc, anc.payload, anc.sub, TRUE)
newW == PriorityWrapper(newPkt, ser)
IN linkage' = [linkage EXCEPT ![anc] = newW]
\* coarse: all ancestors in one step; fine: one ancestor per step
UnbundleCASChild(t) ==
\* Restore the target child from BundledRefWrapper to PriorityWrapper,
\* carrying the new (committed) payload.
/\ LET newChildW == PriorityWrapper(newpacket, ser)
IN linkage' = [linkage EXCEPT ![node] = newChildW]
Unbundle flow (shared across depths):
(1) UnbundleWalk builds the ancestor list casTargets → (2) CAS each ancestor to missing=TRUE (child slots preserved) → (3) CAS the target child back to Priority with the new payload.
Important: ancestor sub[c] slots are not nulled out. The missing=TRUE flag alone signals "this subtree is temporarily not snapshottable," and keeping sub intact allows Phase3 rollback or later recovery to find the references it needs.
superfine #5 (3-level only): casTargets is built in root-first order
(grand → parent → child direction) and ancestor CAS also runs root-first, matching the C++
traversal. This faithfully reproduces the race cases against concurrent bundle/unbundle.
The model has two commit families:
CommitRead → CommitTryCAS → CommitDone.
Performs the ordinary +1 to a child payload, inserting an unbundle when the target is bundled.CommitParent (2-level) / CommitGrand (3-level).
Takes a snapshot at parent / grand scope and applies +1 to every child payload inside that scope.
This stresses whether payload integrity is preserved through bundle/unbundle chains.CommitParent(t) \* 2-level: parent-scope +1 propagated to every Child payload
CommitGrand(t) \* 3-level: grand-scope +1 propagated to every Grandchild payload
\* Single CAS against the parent / grand wrapper. Updates every payload inside
\* sub[c] to (p + 1) % MaxPayload and commits the whole scope in one action.
\* Even bundled subtrees must see their internal payloads advance by +1.
CommitRead(t) ==
\* Reload linkage[node] immediately before attempting the CAS
/\ local' = [local EXCEPT ![t].wrapper = linkage[target[t]]]
CommitTryCAS(t) ==
IF linkage[node] = oldW
THEN linkage' = [linkage EXCEPT ![node] = newW] \* CAS success → commit_done
ELSE IF linkage[node].hasPriority
THEN
IF linkage[node].packet.payload = oldW.packet.payload
THEN \* ★ Single-node optimization: same payload → re-adopt new child and retry ★
local' = [local EXCEPT ![t].wrapper = linkage[node], ...]
ELSE \* True conflict → fail
ELSE
\* Bundled → transition to unbundle_walk
pc' → "unbundle_walk"
CommitDone(t) ==
\* Commit finalization: decrement iterBudget, pop next child from childQueue
\* If the queue is empty, return to idle; if iterBudget=0 as well, move to terminal
/\ iterBudget' = [iterBudget EXCEPT ![t] = @ - 1]
iterBudget / childQueue / MaxCommits:
each thread runs MaxCommits iterations. Each iteration consists of
"one parent / grand-scope commit (CommitParent / CommitGrand) plus
one direct commit per child (CommitRead / TryCAS / Done)."
childQueue[t] holds the children still pending in the current iteration and
iterBudget[t] the remaining iterations.
TerminalPayloadCheck: in the terminal state where all threads have
iterBudget=0 and are idle, each direct child's payload must equal
(2 × MaxCommits × |Threads|) % MaxPayload.
This guarantees that both the deep (+1) and direct (+1) commit streams were applied correctly.
Single-node optimization: On CAS failure, if linkage[node].hasPriority is still
true and the payload matches our expected value, treat it as "only the child slot was swapped by
another thread," adopt the new child packet, and retry the CAS. This reduces wasted aborts.
\* 1. If missing=FALSE then all child packets exist
SnapshotConsistency ==
\A n \in InnerNodes :
(w.hasPriority /\ ~w.packet.missing) =>
(\A c \in ChildrenOf(n) : w.packet.sub[c] /= Null)
\* 2. Non-root nodes have priority OR bundledBy≠Null
NoPriorityLoss ==
\A n \in AllNodes \ {Grand} :
w.hasPriority \/ w.bundledBy /= Null
\* 3. bundledBy chains are valid (reach priority within 2 hops)
BundleChainValid ==
\A n \in AllNodes \ {Grand} :
(~w.hasPriority /\ w.bundledBy /= Null) =>
pw.hasPriority \/ pw.bundledBy /= Null
\* 4. bundledBy always points to the structural parent
BundledByCorrect ==
\A n \in AllNodes \ {Grand} :
~w.hasPriority => w.bundledBy = ParentOf(n)
\* 5. Root always has priority
GrandAlwaysPriority ==
linkage[Grand].hasPriority
| Model | Fidelity | Threads | States | Depth | Time | Result |
|---|---|---|---|---|---|---|
| 2-level LLfree | coarse | 2 | 665,218 | 89 | 28s | ✅ Pass + liveness |
| 2-level LLfree | superfine | 2 | 2,676,196 | 129 | 3:12 | ✅ Pass + liveness |
| 2-level LLfree | superfine confC (all-root) | 3 | 137,333,348 | 96 | 6:35 | ✅ Pass (ohtaka) |
| 2-level LLfree | MaxCommits=2 superfine | 2 | 127,586,599 | 311 | 4:40 | ✅ Pass (ohtaka) |
| 2-level dynamic LLfree | release superfine live | 2 | 413,884,516 | 320 | 7:13 | ✅ Pass + liveness (ohtaka) |
| 3-level LLfree | coarse | 2 | 1,497,098 | 98 | 1:35 | ✅ Pass + liveness |
| 3-level LLfree | superfine | 2 | 14,109,731 | 148 | 19:13 | ✅ Pass + liveness |
| 3-level LLfree | superfine confC (all-root) | 3 | 640,894,951 | 88 | 15:25 | ✅ Pass + liveness (ohtaka) |
Old Gen 1/2 (modular serial) numbers (906K, 3.2M, ...) have been removed from this table. The LLfree spec replaces modular wrapping with a priority-TID negotiate mechanism that makes the state space structurally finite and verifies liveness at the same time. Details: VERIFICATION.md §3, §4. | ||||||
What is covered: Across all 2–3 thread interleavings of
snapshot/commit/bundle/unbundle, all invariants are preserved — proven by TLC exhaustion.
Coarse collapses phases into single steps (fastest); fine separates per-child CAS;
superfine faithfully models the C++ optimization paths.
Gen 3 (LLfree) replaces modular wrapping with the priority-TID mechanism, finitizing
the state space structurally so that safety + liveness
(EventuallyAllDone) are verified at the same time.
Bug found in TLA+: In the fine granularity, BundlePhase3's
allDone test was only checking ~hasPriority, so it incorrectly
counted children already bundled by another thread as "done." Fixed by matching against
BundledRefWrapper(node, ser) for the current thread. (The C++ implementation
runs CAS on every child without a "skip if already bundled" optimization, so it never
exhibited the bug.)
Commits 87892b35 / 92b15f62 / 404fa137 / b12e1895
added a reachability gate (Packet::allSubReachable) to bundle Phase 4:
under is_bundle_root=true, before m_missing is cleared, every Null sub-slot
must be reverseLookup-able from the global root, otherwise Phase 4 returns
BundledStatus::DISTURBED and the outer snapshot() retries.
This 2-level / 3-level spec needs no update.
In the standard non-hard-link topology, every child has exactly one parent, so any Null sub-slot
encountered at Phase 4 is always reverseLookup-able. The existing
SnapshotConsistency invariant — one of this spec's five — already captures the gate's
precondition. The gate is only effectful in hard-link topologies, covered separately by the
BundleUnbundle_hardlink_* family (see slides_hardlink_en.html).
| Layer | Spec | What It Verifies | States (largest confirmed) |
|---|---|---|---|
| Layer 1 | atomic_shared_ptr.tla+ commit primitives | Tagged-pointer reference counting, drain release, scoped_atomic_view; safety only | 66.5M (2-thread MaxOps=3 full) |
| Layer 2 (LLfree) | BundleUnbundle_2level_LLfreeBundleUnbundle_3level_LLfree+ _dynamic variants | 4-phase bundle / unbundle / commit + LL-free priority gating; safety + liveness | 2L: 413M (dynamic release superfine) 3L: 640M (3-thread superfine confC) |
| Layer 2 (hard-link) | BundleUnbundle_hardlink_*(5 topologies) | Multi-parent / 1-child; Phase 4 reachability gate; bug repro + fix; per-action fairness | ~1k–10k (small-scale, complete) |
Why only 2 layers: The former "Layer 1 (stm_commit)" single-node CAS commit
integrates naturally into Layer 2's CommitTryCAS direct path, so a separate layer
was redundant. Verifying bundle/unbundle together with optimistic commit in the same spec
improves coverage of interference patterns.
Why Gen 3 (LLfree): The old Gen 1 (modular serial) had no structural
guarantee of livelock-freedom; Gen 2 (Nat + CONSTRAINT) relied on an artificial search cutoff.
Gen 3 LLfree introduces priorityTag + PreemptTag + ClearMyTags
to structure older-wins arbitration, letting TLC exhaust without CONSTRAINT and structurally
prove liveness (details: doc/proof_semantics.md).
slides_layer2_LLfree.html (Gen 3 LLfree core)slides_hardlink_en.html (5 topologies + Phase 4 reachability gate)slides_layer2_LLfree_3level_en.html / slides_layer2_LLfree_dynamic_en.html| TLA+ Variable | C++ (transaction_impl.h) | Generated C11 (test_bundle_3level.c) |
|---|---|---|
linkage[n] | atomic_shared_ptr<PacketWrapper> n->m_link | _Atomic(uint64_t) linkage[4]64-bit packed (Wrapper struct) |
linkage[n].hasPriority | PacketWrapper::hasPriority() | Wrapper.has_priority (bit field) |
linkage[n].bundledBy | PacketWrapper::bundledBy() | Wrapper.bundled_by |
linkage[n].packet.payload | Packet::payload() | Wrapper.payload |
linkage[n].packet.sub[c] | Packet::subpackets()[i] | Wrapper.sub_child1/2 |
serial[t] | SerialGenerator (Lamport clock) | uint8_t thread_ser (local) |
ModGT(a,b) | Unsigned subtraction → signed reinterpretation | mod_gt(a,b): (a-b+MAX)%MAX < MAX/2 |
| TLA+ Action | C++ Code | Generated C11 Function |
|---|---|---|
SnapRead + SnapCheck(3-level uses only SnapCheck) | snapshot()transaction_impl.h:842-870 | snapshot_grand()load_wrapper() → priority/missing decision |
BundlePhase1–4 | bundle()transaction_impl.h:1077-1171 | Within snapshot_grand()collect → CAS parent → CAS children → finalize |
InnerPhase2/3/4(3-level only) | Inner-bundle portion of bundle()transaction_impl.h:1100-1171 | Within snapshot_grand()Inner bundle on Parent when Grand bundles (explicit in superfine; collapsed into BundlePhase1 in fine) |
CommitParent (2-level)CommitGrand (3-level) | Parent / grand-scope commit()transaction_impl.h:1245-1270 | commit_parent() / commit_grand()Deep commit: +1 to every child payload in the scope |
CommitRead | Reload just before CAS in commit()transaction_impl.h:1245-1260 | Within commit_child()load_wrapper immediately before the CAS attempt |
CommitTryCAS | commit()transaction_impl.h:1245-1270 | commit_child()Direct CAS + single-node optimization |
CommitDone | Finalization tail of commit()transaction_impl.h:1265-1275 | Tail of commit_child()Decrement iterBudget, advance to next child or idle |
UnbundleWalk | snapshotSupernode()transaction_impl.h:696-755 | Within commit_child()Walk bundledBy chain to build casTargets (superfine #5 is root-first) |
UnbundleCASAncestors (2-level)UnbundleCASLoop (3-level) | transaction_impl.h:1367-1379 | Within commit_child()cas_wrapper(anc, ... missing=TRUE) per ancestor. Child sub-slots kept (not nulled) |
UnbundleCASChild | transaction_impl.h:1383-1389 | Within commit_child()child → priority restoration with new payload |
Layer 2 characteristics: In the generated C11, all node linkage is packed into 64 bits,
and all fields are atomically updated with a single atomic_compare_exchange.
This is faithful to the "1 action = 1 step" principle of TLA+.