▲ Overview

Layer 2: Bundle/Unbundle + Commit (Base Spec)

TLA+ Model — BundleUnbundle_2level.tla / BundleUnbundle.tla (3-level)

This deck explains the Gen 1/2 (modular-serial) base specification. The current reference is Gen 3 LLfree — the variant that structurally proves livelock-freedom via a privileged-TID negotiate mechanism. The bundle/unbundle/commit protocol itself is identical in both generations, so this deck is still useful as a base explanation, but for current verification results and the canonical model see slides_layer2_LLfree.html.

Related decks: LLfree (Gen 3) / 3-level LLfree / dynamic LLfree / hard-link topologies (slides_hardlink_en.html — separate).

The most complex model. Integrated verification of 4-phase bundle, multi-level unbundle, and optimistic commit on 2-level and 3-level trees. The single-node CAS commit from the old "Layer 1 (stm_commit)" is now subsumed into this layer's CommitTryCAS direct path (2-layer architecture).

Model supports 3 fidelity tiers (coarse / fine / superfine — see next slide).

Tree Structure (Fixed)

Grand ── Parent ──┬── Child1
                         └── Child2

Data Structures

(* Packet: payload + child packets + missing flag *)
MakePacket(node, payload, sub, miss) ==
    [payload |-> payload, sub |-> sub, missing |-> miss, node |-> node]

(* PacketWrapper: priority (independent) or bundled (absorbed by parent) *)
PriorityWrapper(packet, ser) ==
    [packet |-> packet, hasPriority |-> TRUE, bundledBy |-> Null, serial |-> ser]

BundledRefWrapper(parentNode, ser) ==
    [packet |-> Null, hasPriority |-> FALSE, bundledBy |-> parentNode, serial |-> ser]

Variables (7)

VARIABLES
    serial,        \* per-thread serial counter
    globalSerial,  \* latest global serial
    linkage,       \* linkage[n]: PacketWrapper for each node
    pc, op, target, local  \* per-thread state

3-Tier Fidelity Model (coarse / fine / superfine)

This layer's TLA+ spec can verify the same invariants at different granularities. Granularity is selected via constant flags in the .cfg file. Finer modes increase state count but also raise fidelity to the C++ implementation.

ModeCharacteristicsUse case
coarsePhases collapsed into single atomic steps (fastest).
May miss some interleavings, but high-level invariants are still verifiable.
Smoke tests on large models (3-level, 3-thread, etc.)
finePer-child CAS with separate failure paths.
Phase3 rollback is omitted (matching C++).
Standard exhaustive verification
superfineFive C++-faithful optimization paths enabled:
#1 pre-bundle serial CAS, #2 inner-bundle phase split,
#4 Phase3 DISTURBED check, #5 root-first unbundle ordering,
#6 same-child retry
Strict correspondence with C++

An example of a bug introduced during TLA+ modeling: BundlePhase3's allDone check was only testing ~hasPriority, so it incorrectly counted children already bundled by another thread as "done." Fixed by matching against BundledRefWrapper(node, ser) for the current thread. The C++ implementation runs CAS on every child without a "skip if already bundled" optimization, so it never exhibited this bug — it was a TLA+ model-only issue.

TLA+ Notation (See Layer 1)

For basic TLA+ notation, refer to the beginning of the Layer 1 slides (slides_layer1_en.html).
New notation introduced in these slides:

NotationMeaning
MakePacket(node, payload, sub, miss)Record construction function. Returns [payload |-> ..., sub |-> ..., missing |-> ..., node |-> ...].
PriorityWrapper(packet, ser)PacketWrapper record. hasPriority=TRUE.
BundledRefWrapper(parentNode, ser)Wrapper in bundled state. hasPriority=FALSE, bundledBy=parentNode.
[n \in AllNodes |-> IF ... THEN ... ELSE ...]Conditional function definition. Sets different initial values per node.
r.fieldRecord field access. Examples: w.hasPriority, w.packet.sub[c]
% MaxSerialModular arithmetic. The serial wraps around at MaxSerial, making the state space structurally finite. This model does not use StateConstraint.
ModGT(a, b)Modular comparison. LET diff == (a - b + MaxSerial) % MaxSerial IN diff > 0 /\ diff < MaxSerial \div 2. Corresponds to signed difference comparison in C++.

Characteristics of this model: The variable local has a nested record structure (local[t].wrapper.packet.sub[c], etc.). This corresponds to nested structs in C++. In TLA+, nested fields are updated with [local EXCEPT ![t].wrapper = w].

Bundle: 4-Phase CAS Protocol

Phase 1: Collect — Gather child packets

BundlePhase1(t) ==
    /\ LET childPkts == [c \in children |->
           IF childWs[c].hasPriority /\ ~childWs[c].packet.missing
           THEN childWs[c].packet     \* child is complete → acquire
           ELSE IF childWs[c].bundledBy = node
           THEN parentW.packet.sub[c]  \* already bundled → use existing
           ELSE Null]                  \* cannot acquire → restart
       IN
       IF allCollected
       THEN pc' → "bundle_phase2"
       ELSE pc' → "snap_check"  \* restart

Phase 2: CAS Parent — New PacketWrapper (missing=TRUE)

BundlePhase2(t) ==
    /\ LET newPkt == MakePacket(node, payload, subs, TRUE)
           newW   == PriorityWrapper(newPkt, ser)
       IN
       IF linkage[node] = oldW    \* CAS: if unchanged
       THEN linkage' = [linkage EXCEPT ![node] = newW]

Phase 3: CAS Children — Replace with BundledRefWrapper

BundlePhase3(t) ==
    /\ IF allMatch   \* all children match expected values
       THEN linkage' = [n \in AllNodes |->
           IF n \in children
           THEN BundledRefWrapper(node, ser)  \* child → back-reference to parent
           ELSE linkage[n]]

Phase 4: Finalize — missing=FALSE

BundlePhase4(t) ==
    /\ LET finalPkt == MakePacket(node, payload, sub, FALSE)  \* ← FALSE!
       IN
       IF linkage[node] = oldW
       THEN linkage' = [linkage EXCEPT ![node] = finalW]
            \* → consistent snapshot is now possible

Unbundle: Disassembly at Commit Time

Why Is It Needed?

To commit to a bundled child node, the ancestor bundle chain must first be disassembled. Starting from the target child, we follow bundledBy upward to identify the ancestors, set each ancestor's missing to TRUE, and then restore the child to a priority wrapper.

Step 1: UnbundleWalk — Build ancestor list

UnbundleWalk(t) ==
    \* Starting from the target child, walk the bundledBy chain upward
    \* and build the list casTargets of ancestor nodes to CAS
    \* (child → parent → (at 3-level) grand)
    /\ local' = [local EXCEPT ![t].casTargets = chain]
    \* superfine #5: at 3-level, build the list root-first
    \* (matching the C++ traversal order: CAS from root → leaf)

Step 2: UnbundleCASAncestors / UnbundleCASLoop — Set missing=TRUE on ancestors

UnbundleCASAncestors(t)   \* 2-level (BundleUnbundle_2level.tla)
UnbundleCASLoop(t)         \* 3-level (BundleUnbundle.tla)
    \* For each ancestor in casTargets, CAS its wrapper to a new wrapper
    \* with missing=TRUE. The child sub-slots sub[c] are KEPT (not nulled),
    \* so Phase3 rollback / recovery paths still have the references available.
    /\ LET newPkt == MakePacket(anc, anc.payload, anc.sub, TRUE)
           newW   == PriorityWrapper(newPkt, ser)
       IN linkage' = [linkage EXCEPT ![anc] = newW]
    \* coarse: all ancestors in one step; fine: one ancestor per step

Step 3: UnbundleCASChild — Restore target child to priority

UnbundleCASChild(t) ==
    \* Restore the target child from BundledRefWrapper to PriorityWrapper,
    \* carrying the new (committed) payload.
    /\ LET newChildW == PriorityWrapper(newpacket, ser)
       IN linkage' = [linkage EXCEPT ![node] = newChildW]

Unbundle flow (shared across depths):
(1) UnbundleWalk builds the ancestor list casTargets → (2) CAS each ancestor to missing=TRUE (child slots preserved) → (3) CAS the target child back to Priority with the new payload.
Important: ancestor sub[c] slots are not nulled out. The missing=TRUE flag alone signals "this subtree is temporarily not snapshottable," and keeping sub intact allows Phase3 rollback or later recovery to find the references it needs.

superfine #5 (3-level only): casTargets is built in root-first order (grand → parent → child direction) and ancestor CAS also runs root-first, matching the C++ traversal. This faithfully reproduces the race cases against concurrent bundle/unbundle.

Commit: Direct Path and Parent/Grand-Scope Commit

Two Kinds of Commit

The model has two commit families:

CommitParent (2-level) / CommitGrand (3-level)

CommitParent(t)   \* 2-level: parent-scope +1 propagated to every Child payload
CommitGrand(t)    \* 3-level: grand-scope +1 propagated to every Grandchild payload
    \* Single CAS against the parent / grand wrapper. Updates every payload inside
    \* sub[c] to (p + 1) % MaxPayload and commits the whole scope in one action.
    \* Even bundled subtrees must see their internal payloads advance by +1.

Direct-commit pipeline (CommitRead → CommitTryCAS → CommitDone)

CommitRead(t) ==
    \* Reload linkage[node] immediately before attempting the CAS
    /\ local' = [local EXCEPT ![t].wrapper = linkage[target[t]]]

CommitTryCAS(t) ==
    IF linkage[node] = oldW
    THEN linkage' = [linkage EXCEPT ![node] = newW]   \* CAS success → commit_done
    ELSE IF linkage[node].hasPriority
    THEN
        IF linkage[node].packet.payload = oldW.packet.payload
        THEN \* ★ Single-node optimization: same payload → re-adopt new child and retry ★
             local' = [local EXCEPT ![t].wrapper = linkage[node], ...]
        ELSE \* True conflict → fail
    ELSE
        \* Bundled → transition to unbundle_walk
        pc' → "unbundle_walk"

CommitDone(t) ==
    \* Commit finalization: decrement iterBudget, pop next child from childQueue
    \* If the queue is empty, return to idle; if iterBudget=0 as well, move to terminal
    /\ iterBudget' = [iterBudget EXCEPT ![t] = @ - 1]

Lifecycle variables and terminal condition

iterBudget / childQueue / MaxCommits: each thread runs MaxCommits iterations. Each iteration consists of "one parent / grand-scope commit (CommitParent / CommitGrand) plus one direct commit per child (CommitRead / TryCAS / Done)." childQueue[t] holds the children still pending in the current iteration and iterBudget[t] the remaining iterations.

TerminalPayloadCheck: in the terminal state where all threads have iterBudget=0 and are idle, each direct child's payload must equal (2 × MaxCommits × |Threads|) % MaxPayload. This guarantees that both the deep (+1) and direct (+1) commit streams were applied correctly.

Single-node optimization: On CAS failure, if linkage[node].hasPriority is still true and the payload matches our expected value, treat it as "only the child slot was swapped by another thread," adopt the new child packet, and retry the CAS. This reduces wasted aborts.

Verified Invariants (5 items)

\* 1. If missing=FALSE then all child packets exist
SnapshotConsistency ==
    \A n \in InnerNodes :
        (w.hasPriority /\ ~w.packet.missing) =>
            (\A c \in ChildrenOf(n) : w.packet.sub[c] /= Null)

\* 2. Non-root nodes have priority OR bundledBy≠Null
NoPriorityLoss ==
    \A n \in AllNodes \ {Grand} :
        w.hasPriority \/ w.bundledBy /= Null

\* 3. bundledBy chains are valid (reach priority within 2 hops)
BundleChainValid ==
    \A n \in AllNodes \ {Grand} :
        (~w.hasPriority /\ w.bundledBy /= Null) =>
            pw.hasPriority \/ pw.bundledBy /= Null

\* 4. bundledBy always points to the structural parent
BundledByCorrect ==
    \A n \in AllNodes \ {Grand} :
        ~w.hasPriority => w.bundledBy = ParentOf(n)

\* 5. Root always has priority
GrandAlwaysPriority ==
    linkage[Grand].hasPriority

Verification Results (Gen 3 LLfree, CONSTRAINT-free exhaustion)

ModelFidelityThreadsStatesDepthTimeResult
2-level LLfreecoarse2665,2188928s✅ Pass + liveness
2-level LLfreesuperfine22,676,1961293:12✅ Pass + liveness
2-level LLfreesuperfine confC (all-root)3137,333,348966:35✅ Pass (ohtaka)
2-level LLfreeMaxCommits=2 superfine2127,586,5993114:40✅ Pass (ohtaka)
2-level dynamic LLfreerelease superfine live2413,884,5163207:13✅ Pass + liveness (ohtaka)
3-level LLfreecoarse21,497,098981:35✅ Pass + liveness
3-level LLfreesuperfine214,109,73114819:13✅ Pass + liveness
3-level LLfreesuperfine confC (all-root)3640,894,9518815:25✅ Pass + liveness (ohtaka)
Old Gen 1/2 (modular serial) numbers (906K, 3.2M, ...) have been removed from this table. The LLfree spec replaces modular wrapping with a priority-TID negotiate mechanism that makes the state space structurally finite and verifies liveness at the same time. Details: VERIFICATION.md §3, §4.

What is covered: Across all 2–3 thread interleavings of snapshot/commit/bundle/unbundle, all invariants are preserved — proven by TLC exhaustion. Coarse collapses phases into single steps (fastest); fine separates per-child CAS; superfine faithfully models the C++ optimization paths. Gen 3 (LLfree) replaces modular wrapping with the priority-TID mechanism, finitizing the state space structurally so that safety + liveness (EventuallyAllDone) are verified at the same time.

Bug found in TLA+: In the fine granularity, BundlePhase3's allDone test was only checking ~hasPriority, so it incorrectly counted children already bundled by another thread as "done." Fixed by matching against BundledRefWrapper(node, ser) for the current thread. (The C++ implementation runs CAS on every child without a "skip if already bundled" optimization, so it never exhibited the bug.)

Phase 4 reachability gate (hard-link safeguard; no impact on this spec)

Commits 87892b35 / 92b15f62 / 404fa137 / b12e1895 added a reachability gate (Packet::allSubReachable) to bundle Phase 4: under is_bundle_root=true, before m_missing is cleared, every Null sub-slot must be reverseLookup-able from the global root, otherwise Phase 4 returns BundledStatus::DISTURBED and the outer snapshot() retries.

This 2-level / 3-level spec needs no update. In the standard non-hard-link topology, every child has exactly one parent, so any Null sub-slot encountered at Phase 4 is always reverseLookup-able. The existing SnapshotConsistency invariant — one of this spec's five — already captures the gate's precondition. The gate is only effectful in hard-link topologies, covered separately by the BundleUnbundle_hardlink_* family (see slides_hardlink_en.html).

Two-Layer Overview (Current Status)

LayerSpecWhat It VerifiesStates (largest confirmed)
Layer 1atomic_shared_ptr.tla
+ commit primitives
Tagged-pointer reference counting, drain release, scoped_atomic_view; safety only66.5M (2-thread MaxOps=3 full)
Layer 2 (LLfree)BundleUnbundle_2level_LLfree
BundleUnbundle_3level_LLfree
+ _dynamic variants
4-phase bundle / unbundle / commit + LL-free priority gating; safety + liveness2L: 413M (dynamic release superfine)
3L: 640M (3-thread superfine confC)
Layer 2 (hard-link)BundleUnbundle_hardlink_*
(5 topologies)
Multi-parent / 1-child; Phase 4 reachability gate; bug repro + fix; per-action fairness~1k–10k (small-scale, complete)

Why only 2 layers: The former "Layer 1 (stm_commit)" single-node CAS commit integrates naturally into Layer 2's CommitTryCAS direct path, so a separate layer was redundant. Verifying bundle/unbundle together with optimistic commit in the same spec improves coverage of interference patterns.

Why Gen 3 (LLfree): The old Gen 1 (modular serial) had no structural guarantee of livelock-freedom; Gen 2 (Nat + CONSTRAINT) relied on an artificial search cutoff. Gen 3 LLfree introduces priorityTag + PreemptTag + ClearMyTags to structure older-wins arbitration, letting TLC exhaust without CONSTRAINT and structurally prove liveness (details: doc/proof_semantics.md).

Related Decks

C++ ↔ TLA+ ↔ Generated C11 Correspondence Table

Variable Mapping

TLA+ VariableC++ (transaction_impl.h)Generated C11 (test_bundle_3level.c)
linkage[n]atomic_shared_ptr<PacketWrapper> n->m_link_Atomic(uint64_t) linkage[4]
64-bit packed (Wrapper struct)
linkage[n].hasPriorityPacketWrapper::hasPriority()Wrapper.has_priority (bit field)
linkage[n].bundledByPacketWrapper::bundledBy()Wrapper.bundled_by
linkage[n].packet.payloadPacket::payload()Wrapper.payload
linkage[n].packet.sub[c]Packet::subpackets()[i]Wrapper.sub_child1/2
serial[t]SerialGenerator (Lamport clock)uint8_t thread_ser (local)
ModGT(a,b)Unsigned subtraction → signed reinterpretationmod_gt(a,b): (a-b+MAX)%MAX < MAX/2

Action Mapping

TLA+ ActionC++ CodeGenerated C11 Function
SnapRead + SnapCheck
(3-level uses only SnapCheck)
snapshot()
transaction_impl.h:842-870
snapshot_grand()
load_wrapper() → priority/missing decision
BundlePhase14bundle()
transaction_impl.h:1077-1171
Within snapshot_grand()
collect → CAS parent → CAS children → finalize
InnerPhase2/3/4
(3-level only)
Inner-bundle portion of bundle()
transaction_impl.h:1100-1171
Within snapshot_grand()
Inner bundle on Parent when Grand bundles (explicit in superfine; collapsed into BundlePhase1 in fine)
CommitParent (2-level)
CommitGrand (3-level)
Parent / grand-scope commit()
transaction_impl.h:1245-1270
commit_parent() / commit_grand()
Deep commit: +1 to every child payload in the scope
CommitReadReload just before CAS in commit()
transaction_impl.h:1245-1260
Within commit_child()
load_wrapper immediately before the CAS attempt
CommitTryCAScommit()
transaction_impl.h:1245-1270
commit_child()
Direct CAS + single-node optimization
CommitDoneFinalization tail of commit()
transaction_impl.h:1265-1275
Tail of commit_child()
Decrement iterBudget, advance to next child or idle
UnbundleWalksnapshotSupernode()
transaction_impl.h:696-755
Within commit_child()
Walk bundledBy chain to build casTargets (superfine #5 is root-first)
UnbundleCASAncestors (2-level)
UnbundleCASLoop (3-level)
transaction_impl.h:1367-1379Within commit_child()
cas_wrapper(anc, ... missing=TRUE) per ancestor. Child sub-slots kept (not nulled)
UnbundleCASChildtransaction_impl.h:1383-1389Within commit_child()
child → priority restoration with new payload

Layer 2 characteristics: In the generated C11, all node linkage is packed into 64 bits, and all fields are atomically updated with a single atomic_compare_exchange. This is faithful to the "1 action = 1 step" principle of TLA+.