Provenance
A compliance primitive: an append-only chain recording the complete custody history of a single artifact: where it originated, who has held it, what transformations have been applied, and to whom it has been disclosed. Each chain has an opaque (system-generated, with no meaningful content), immutable (unchangeable once written) chain_id; the artifact_ref is an opaque, immutable property set at genesis. Each entry in the chain has an opaque, immutable entry_id and a strictly increasing sequence_number that is the order source (clock-independent). The load-bearing guarantee is custody continuity: from genesis to terminal disposition, the chain has exactly one current custodian at all times, and the record of each transfer is hand-to-hand, with no gap.
Intent
Regulated industries demand an unbroken account of every artifact’s journey. A pharmaceutical sample must travel from manufacturer to pharmacist without a custody gap; a piece of physical evidence must traverse from crime scene to courtroom without any unattributed hand-off; a financial instrument must carry a ledger of every holder and transformation it has undergone. The shape is constant across domains: something exists, someone holds it, it may change hands, it may be altered, and at every point in its history the answer to “who is holding this right now, and what has been done to it?” must come from the records alone — not from anyone’s recollection, and not from the system operator’s assertion.
The Provenance atom addresses this requirement. It records the origin, custody history, and transformation history of one artifact as an append-only chain of entries, with a current custodian maintained continuously from genesis to terminal disposition. The fundamental guarantee the atom enforces is custody continuity: there is no point in the chain’s life at which custody is held by nobody or by two parties simultaneously. Every transfer is hand-to-hand — the outgoing custodian’s identity is read from the chain’s own state, never supplied by the caller — so no transfer can manufacture a false predecessor. Every transformation and disclosure is guarded: only the current custodian can record them.
This is a freestanding (can be specified without naming any other pattern) atom in the EOS (Essence of Software — Daniel Jackson’s framework for specifying software concepts as freestanding, composable units) sense. It has its own state (the chain and its entries), its own actions (originate, transfer, transform, disclose, archive, read), and its own operational principles (entries are immutable once recorded; the chain is append-only; custody is continuous and unambiguous at all times).
The EOS boundary against Event Log. Provenance is not an Event Log of custody events, and this distinction is load-bearing. Event Log (see atoms/event-log.md) is a content-agnostic stream of system events with no subject, no custodian, no continuity guarantee, and no prohibition on sequence gaps. Event Log explicitly permits sequence gaps — a storage-failure consumes a sequence number and the next successful append receives a strictly higher one; consumers must not assume a dense sequence. Provenance, by contrast, is anchored to one artifact, maintains exactly one current custodian continuously, and its load-bearing Invariant 4 (custody continuity) is a property Event Log cannot express: custody continuity is not a property of a content-agnostic event stream but of a subject-scoped chain where every entry is attributed to the then-current holder. Likely objection: “Why not just build Provenance as a specialized Event Log with a custodian field?” The mechanism that resolves it: Provenance’s transfer action reads the outgoing custodian from the chain’s own state (not from a caller-supplied field), making it structurally impossible to record a transfer from someone who was not the current custodian; Event Log has no subject state, no current-custodian projection, and no such guard. Result: custody continuity is an invariant of the Provenance chain that no amount of Event Log wrapping can provide, because the chain’s own state is what enforces it — and that state is Provenance’s own, not borrowed from another concept.
The atom does not implement non-repudiable custodian identity, cryptographic chain integrity, retention and defensible disposal, or the full chain-of-custody attribution+tamper+retention surface. Each is a separate composable concept; see Composition notes and Edge cases.
The linear single-artifact constraint. This atom is deliberately a linear single-artifact custody chain — one chain tracks one artifact from origin to terminal disposition, with a single current custodian at every point. It does not model artifact splitting (one pharmaceutical sample aliquoted into several), DAG-style derivation (one artifact derived into many related artifacts), W3C PROV (Provenance Data Model — a W3C (World Wide Web Consortium) standard for representing provenance as a directed acyclic graph of entities, activities, and agents) wasDerivedFrom relationships, or multi-party simultaneous custody. These are explicit non-goals; see Edge cases. The linearity constraint is what makes custody continuity a tractable invariant — in a DAG model, “current custodian” is not well-defined — and the single-artifact constraint is what makes the chain’s subject unambiguous. The design mirrors atoms/clinical-observation.md’s explicit scoping of the linear single-chain model and its equally explicit non-goal of branching.
Summary
Provenance answers the chain-of-custody question: where did this artifact come from, who has held it, what has been done to it, and who holds it right now? It works by keeping an append-only record, a chain of entries, for a single artifact (a pharmaceutical sample, a digital file, a piece of physical evidence, a legal document) from the moment the artifact enters the system until it reaches a terminal state called Archived. Each entry records one custody event: the artifact was originated (created under custody) or received, transferred from one custodian to another, transformed in some way, disclosed to a recipient, or finally archived. Entries can only be added, never removed or changed. The chain always has exactly one current holder (called the custodian), and the only way to change that holder is a hand-to-hand transfer: the system reads the current holder from its own records and records the transfer as coming from that party, so a transfer cannot introduce a custody gap or a false predecessor. A transformation, a disclosure, or an archive can only be recorded by the party the chain currently shows as holding the artifact. Every entry carries a sequence number that goes up by one for each new entry; this sequence number is the authoritative order source, kept separate from the human-readable timestamp so that the chain can be replayed correctly even when clocks drift. This is the mechanism behind pharmaceutical chain of custody, physical-evidence chains for courts, financial instrument custody records, and controlled-substance tracking.
Structure
Identity model
Each Provenance chain has a chain_id, an opaque, immutable, system-generated identifier produced by originate. The chain_id is the chain’s identity. The artifact_ref is an opaque, immutable property of the chain set at genesis; it is never the chain’s identity. Two chains for different artifacts have different chain_ids; within a store instance, one handling episode of an artifact has exactly one chain.
The opaque-id model matters here for the same reason it matters throughout the compliance atoms: identifying a chain by artifact_ref would make it impossible to distinguish chains that track different handling episodes of artifacts with the same external identifier (returned-and-reprocessed pharmaceutical batches, reintroduced evidence, reissued instruments), and it would make the chain itself dependent on the semantics of the artifact reference, which belongs to an external layer. chain_id is the atom’s own stable identity; artifact_ref is opaque.
Each custody entry has an entry_id — opaque, immutable, system-generated, unique within the chain and never reused. Each entry also has a sequence_number — a strictly increasing integer assigned at append, the order source within the chain. The sequence_number is clock-independent: even if the wall-time clock skips or moves backward, the sequence is authoritative. This mirrors the discipline of Event Log’s sequence_number exactly — recorded_at is a best-effort wall-time annotation, not the source of ordering.
The chain itself maintains a per-instance next_sequence_number counter, beginning at 1 for a fresh chain instance and incrementing by 1 on each successful entry write. This counter is part of the chain’s persistent state and must be preserved across restarts by durable implementations. Volatile implementations that reset to 1 on restart violate the strictly-increasing invariant across the lifetime of the chain.
Store instances. Each Provenance chain lives in a named store instance; multiple chains coexist in one store (one per artifact), and multiple store instances coexist in real deployments (one per facility, jurisdiction, custody domain, or business unit). chain_id values are unique within a store instance; uniqueness across instances is a composing concept. artifact_ref is scoped to the host system — the same artifact_ref may appear in different store instances for genuinely different artifacts, or for the same artifact tracked in two custody domains. A call implicitly targets a single routed store instance; the mechanism by which a call reaches a specific instance (service binding, namespace prefix, endpoint) is resolved at the deployment-routing layer, not defined by this atom.
Inputs
- An artifact reference identifying what is being tracked. The atom treats this as opaque — the host system defines what an artifact is and how to reference it.
- Custodian references identifying who is holding the artifact. Also opaque: the identity registry belongs to a separate layer. Non-repudiable attestation of custodian identity composes with Actor Identity; the atom records custodian_ref values and enforces structural continuity without itself verifying them.
- Actions:
  - originate(artifact_ref, custodian_ref, genesis_type, [metadata]) → chain_id | rejected(invalid-ref | invalid-genesis-type | storage-failure)
  - transfer(chain_id, to_custodian_ref) → entry_id | rejected(not-known | archived | invalid-ref | storage-failure)
  - transform(chain_id, custodian_ref, transformation_descriptor) → entry_id | rejected(not-known | archived | not-current-custodian | invalid-ref | invalid-descriptor | storage-failure)
  - disclose(chain_id, custodian_ref, recipient_ref) → entry_id | rejected(not-known | archived | not-current-custodian | invalid-ref | storage-failure)
  - archive(chain_id, custodian_ref) → entry_id | rejected(not-known | already-archived | not-current-custodian | storage-failure)
  - read(chain_id, [query]) → ordered_sequence_of_entries | rejected(not-known | invalid-query)
- An implicit clock providing wall-time timestamps (best-effort; not the order source).
Outputs
- For originate: a fresh chain_id, or a rejection naming the failed precondition.
- For transfer, transform, disclose, archive: a fresh entry_id, or a rejection naming the failed precondition.
- For read: a (possibly empty) ordered sequence of entries for the named chain, in sequence_number ascending order. Each entry carries its entry_id, sequence_number, event_type, custodian_ref (or both from_custodian_ref and to_custodian_ref for transfers), recorded_at, and event-type-specific fields. The current-custodian projection is derivable from the entries: it is the to_custodian_ref of the latest transferred entry, or the custodian_ref of the genesis entry if no transfer has occurred.
- Rejected actions produce an observable refusal naming the failed precondition.
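The current-custodian projection described above can be sketched as a replay over the entries. This is a minimal illustration, assuming entries are plain dictionaries keyed by the field names used in this document and that the caller passes the full, unfiltered chain; the function name current_custodian is chosen here for illustration and is not defined by the atom.

```python
def current_custodian(entries):
    """Replay a full chain in sequence_number order and return the holder."""
    ordered = sorted(entries, key=lambda e: e["sequence_number"])
    if not ordered or ordered[0]["event_type"] not in ("originated", "received"):
        raise ValueError("expected a full chain starting with a genesis entry")
    # Start from the genesis custodian; only transferred entries change the holder.
    holder = ordered[0]["custodian_ref"]
    for entry in ordered[1:]:
        if entry["event_type"] == "transferred":
            holder = entry["to_custodian_ref"]
    return holder


# Illustrative entries using the custodian names from the Examples section.
chain = [
    {"sequence_number": 1, "event_type": "originated", "custodian_ref": "manuf-lab-7"},
    {"sequence_number": 2, "event_type": "transferred",
     "from_custodian_ref": "manuf-lab-7", "to_custodian_ref": "dist-region-3"},
    {"sequence_number": 3, "event_type": "transferred",
     "from_custodian_ref": "dist-region-3", "to_custodian_ref": "pharm-hosp-9"},
]
print(current_custodian(chain))  # pharm-hosp-9
```

A filtered read (for example, one restricted to transformed entries) would not be a valid input to this replay, since the projection depends on the genesis entry and every transferred entry.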
State
Chain state. A Provenance chain occupies exactly one of two states:
- Open — the chain is active; entries may be appended; the chain has exactly one current custodian.
- Archived — the chain is at terminal disposition; no further entries are accepted. Archived is absorbing.
Chain-level fields:
- chain_id: opaque, immutable, system-generated. Set on originate. Never changes.
- artifact_ref: opaque, immutable. Set on originate. Never changes.
- chain_state: either Open or Archived. Begins at Open; transitions to Archived on archive. No further transitions.
- current_custodian: the opaque reference of the current custodian. Set on originate to the genesis custodian_ref; updated on each successful transfer to to_custodian_ref. Never null while the chain is Open or Archived. current_custodian is a derived projection of the entry chain (equal to the to_custodian_ref of the latest transferred entry, or the genesis custodian_ref if no transfer has occurred), maintained as cached chain state so the action guards can be evaluated without replaying the whole chain. The entry chain is the authoritative source: if the cached current_custodian ever disagrees with the value obtained by replaying entries in sequence_number order, the replayed value governs and the discrepancy is itself a conformance failure (see Generation acceptance, check 3). No invariant depends on the cache being correct; every invariant is stated over the entry chain.
- next_sequence_number: a strictly increasing integer. Begins at 1; increments by 1 on each successfully written entry. Part of the chain’s persistent state; must survive restarts.
Entry-level fields (all entries):
- entry_id: opaque, immutable, system-generated. Set on append. Never reused within the chain.
- sequence_number: strictly increasing integer assigned from next_sequence_number at append. The order source.
- event_type: one of {originated, received, transferred, transformed, disclosed, archived}. Set on append. Never changes.
- custodian_ref: the custodian who performed or is affected by this event (see note on transfers below). Non-empty. Set on append. Never changes.
- recorded_at: wall-time when the entry was appended. Best-effort annotation; not the order source.
Additional entry fields by event_type:
- originated or received (genesis entries): optional metadata (opaque). The genesis entry’s event_type is itself originated or received; the genesis_type argument to originate simply selects which. There is no separate stored genesis_type field (it would duplicate event_type), so genesis_type is an input name only, not an entry field.
- transferred: two custodian fields, from_custodian_ref (the outgoing custodian, read from current_custodian at transition time, never supplied by the caller) and to_custodian_ref (the incoming custodian, supplied by the caller). After this entry, current_custodian becomes to_custodian_ref.
- transformed: transformation_descriptor (an opaque non-empty description of what was done).
- disclosed: recipient_ref (opaque reference to the party to whom a view or copy was disclosed; custody is NOT transferred).
- archived: no additional fields beyond the common set.
Transitions:
- originate(artifact_ref, custodian_ref, genesis_type, [metadata]) → a new chain enters Open with a genesis entry (sequence_number = 1). current_custodian = custodian_ref. next_sequence_number becomes 2. Returns chain_id.
- transfer(chain_id, to_custodian_ref) → an Open chain receives a transferred entry. from_custodian_ref = current_custodian (from chain state). current_custodian becomes to_custodian_ref. sequence_number increments. Returns entry_id.
- transform(chain_id, custodian_ref, transformation_descriptor) → an Open chain (where custodian_ref == current_custodian) receives a transformed entry. current_custodian does not change. Returns entry_id.
- disclose(chain_id, custodian_ref, recipient_ref) → an Open chain (where custodian_ref == current_custodian) receives a disclosed entry. current_custodian does not change. Returns entry_id.
- archive(chain_id, custodian_ref) → an Open chain (where custodian_ref == current_custodian) receives an archived entry and transitions to Archived. current_custodian does not change (the archiving custodian is the chain’s last-recorded holder). Returns entry_id.
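To make the transitions concrete, the following is a small in-memory sketch, not a conforming implementation. It assumes several things the atom does not specify: rejections are modeled as a Rejected exception, identifiers are formatted as "chain-0001" and "e2" purely for readability, and recorded_at, metadata, disclose, durability, and storage-failure handling are omitted. The class and method names are illustrative.

```python
import itertools
from dataclasses import dataclass, field


class Rejected(Exception):
    """Stands in for the spec's rejected(reason) result (an assumption)."""
    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


@dataclass
class Chain:
    chain_id: str
    artifact_ref: str
    current_custodian: str
    chain_state: str = "Open"
    next_sequence_number: int = 1
    entries: list = field(default_factory=list)


class ProvenanceStore:
    def __init__(self):
        self.chains = {}
        self._chain_numbers = itertools.count(1)

    def _append(self, chain, event_type, **fields):
        # Assign the next sequence number, then advance the counter.
        seq = chain.next_sequence_number
        chain.entries.append({"entry_id": f"e{seq}", "sequence_number": seq,
                              "event_type": event_type, **fields})
        chain.next_sequence_number = seq + 1
        return f"e{seq}"

    def _open_chain(self, chain_id, closed_reason="archived"):
        chain = self.chains.get(chain_id)
        if chain is None:
            raise Rejected("not-known")
        if chain.chain_state != "Open":
            raise Rejected(closed_reason)
        return chain

    def originate(self, artifact_ref, custodian_ref, genesis_type):
        if not artifact_ref.strip() or not custodian_ref.strip():
            raise Rejected("invalid-ref")
        if genesis_type not in ("originated", "received"):
            raise Rejected("invalid-genesis-type")
        chain = Chain(f"chain-{next(self._chain_numbers):04d}", artifact_ref, custodian_ref)
        self._append(chain, genesis_type, custodian_ref=custodian_ref)
        self.chains[chain.chain_id] = chain
        return chain.chain_id

    def transfer(self, chain_id, to_custodian_ref):
        chain = self._open_chain(chain_id)
        if not to_custodian_ref.strip():
            raise Rejected("invalid-ref")
        # The outgoing custodian is read from chain state, never from the caller.
        entry_id = self._append(chain, "transferred",
                                from_custodian_ref=chain.current_custodian,
                                to_custodian_ref=to_custodian_ref)
        chain.current_custodian = to_custodian_ref
        return entry_id

    def transform(self, chain_id, custodian_ref, transformation_descriptor):
        chain = self._open_chain(chain_id)
        if custodian_ref != chain.current_custodian:
            raise Rejected("not-current-custodian")
        if not transformation_descriptor.strip():
            raise Rejected("invalid-descriptor")
        return self._append(chain, "transformed", custodian_ref=custodian_ref,
                            transformation_descriptor=transformation_descriptor)

    def archive(self, chain_id, custodian_ref):
        chain = self._open_chain(chain_id, closed_reason="already-archived")
        if custodian_ref != chain.current_custodian:
            raise Rejected("not-current-custodian")
        entry_id = self._append(chain, "archived", custodian_ref=custodian_ref)
        chain.chain_state = "Archived"
        return entry_id
```

Note that transfer has no parameter for the outgoing custodian at all, which is how the sketch mirrors the hand-to-hand rule: there is nothing for a caller to falsify.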
Flow
- Artifact enters the system. The host system calls originate(artifact_ref, custodian_ref, genesis_type). The atom opens a chain with a genesis entry (event_type = originated or received), sets current_custodian = custodian_ref, assigns chain_id. Returns chain_id.
- Artifact is transformed. The current custodian records a transformation: transform(chain_id, custodian_ref, transformation_descriptor). The atom appends a transformed entry. current_custodian is unchanged.
- Artifact is disclosed to an external party. The current custodian records a disclosure: disclose(chain_id, custodian_ref, recipient_ref). The atom appends a disclosed entry. current_custodian is unchanged; the recipient is not a new custodian.
- Artifact changes hands. The host system calls transfer(chain_id, to_custodian_ref). The atom reads from_custodian_ref = current_custodian from chain state, appends a transferred entry, and updates current_custodian = to_custodian_ref. The caller supplies only the incoming custodian; the outgoing custodian is read from the chain’s own state. This is the hand-to-hand discipline that makes custody continuity structurally enforced rather than procedurally hoped.
- Artifact reaches terminal disposition. The current custodian calls archive(chain_id, custodian_ref). The atom appends an archived entry and transitions the chain to Archived. No further entries are accepted.
- Query. Any party calls read(chain_id). The atom returns all entries in sequence_number ascending order. The current-custodian projection is the to_custodian_ref of the latest transferred entry, or the genesis custodian_ref if no transfer has yet occurred.
A simpler lifecycle — originate, disclose, archive, no transfer or transformation — is also valid. The chain must have a genesis entry and may have zero or more intermediate entries before archive.
Decision points
- At originate(artifact_ref, custodian_ref, genesis_type, [metadata]): artifact_ref and custodian_ref must each contain at least one non-whitespace character; otherwise invalid-ref. genesis_type must be exactly one of {originated, received}; otherwise invalid-genesis-type. If the chain creation or genesis entry write fails after all preconditions are satisfied, the atom returns rejected(storage-failure): no chain is created, no chain_id is returned, and the caller must treat the rejection as definitive. The atom does not validate artifact_ref against an external artifact registry; the host system is responsible for ensuring the reference is meaningful.
- At transfer(chain_id, to_custodian_ref): chain_id must reference a known chain; otherwise not-known. The chain must be in Open state; otherwise archived. to_custodian_ref must contain at least one non-whitespace character; otherwise invalid-ref. The from_custodian_ref is read from current_custodian in chain state; the caller does not supply it, and no check against a caller-supplied outgoing custodian is needed or performed. If the entry write and current_custodian update fail after all preconditions are satisfied, the atom returns rejected(storage-failure). The entry write and the current_custodian update are jointly atomic: either both land or neither is visible. A storage-failure response guarantees the chain state is unchanged: current_custodian retains its prior value.
- At transform(chain_id, custodian_ref, transformation_descriptor): chain_id must reference a known chain; otherwise not-known. The chain must be in Open state; otherwise archived. custodian_ref must equal current_custodian in chain state; otherwise not-current-custodian. custodian_ref must contain at least one non-whitespace character; otherwise invalid-ref. transformation_descriptor must contain at least one non-whitespace character; otherwise invalid-descriptor (a descriptor is not a reference, so it carries its own rejection reason). If the entry write fails, the atom returns rejected(storage-failure) and the chain state is unchanged.
- At disclose(chain_id, custodian_ref, recipient_ref): chain_id must reference a known chain; otherwise not-known. The chain must be in Open state; otherwise archived. custodian_ref must equal current_custodian; otherwise not-current-custodian. custodian_ref and recipient_ref must each contain at least one non-whitespace character; otherwise invalid-ref. If the entry write fails, the atom returns rejected(storage-failure) and the chain state is unchanged.
- At archive(chain_id, custodian_ref): chain_id must reference a known chain; otherwise not-known. The chain must be in Open state; otherwise already-archived. custodian_ref must equal current_custodian; otherwise not-current-custodian. If the entry write and chain state update to Archived fail, the atom returns rejected(storage-failure) and the chain remains in Open state. The archive entry write and the chain-state transition are jointly atomic.
- At read(chain_id, [query]): chain_id must reference a known chain; otherwise not-known. Optional query parameters (sequence-number range, time range, event_type filter) must be well-formed: ranges must have start ≤ end; event_type values must be from {originated, received, transferred, transformed, disclosed, archived}. An ill-formed parameter returns rejected(invalid-query). A well-formed query matching no entries returns an empty sequence, not a rejection.
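The read decision point separates an ill-formed query (rejected) from a well-formed query that matches nothing (an empty sequence). A rough sketch of that distinction follows. The parameter shapes are assumptions, since the atom does not define how a query is encoded: seq_range is taken to be an inclusive (start, end) pair and event_types an iterable of strings. The time-range filter and the not-known check are left out, and InvalidQuery is a stand-in for rejected(invalid-query).

```python
EVENT_TYPES = frozenset({
    "originated", "received", "transferred", "transformed", "disclosed", "archived",
})


class InvalidQuery(ValueError):
    """Stands in for rejected(invalid-query) (an assumption of this sketch)."""


def read_entries(entries, seq_range=None, event_types=None):
    """Validate the query first, then filter entries in sequence_number order."""
    if seq_range is not None:
        start, end = seq_range
        if start > end:
            raise InvalidQuery("sequence range must have start <= end")
    if event_types is not None:
        unknown = set(event_types) - EVENT_TYPES
        if unknown:
            raise InvalidQuery(f"unknown event_type values: {sorted(unknown)}")

    selected = []
    for entry in sorted(entries, key=lambda e: e["sequence_number"]):
        if seq_range is not None and not (seq_range[0] <= entry["sequence_number"] <= seq_range[1]):
            continue
        if event_types is not None and entry["event_type"] not in event_types:
            continue
        selected.append(entry)
    # An empty list is a valid result, not a rejection.
    return selected
```

Validation runs before any filtering, so a malformed range is reported even when the chain would have produced no matches anyway.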
Rejection priority. When multiple precondition violations exist on the same call, the rejection returned follows a defined priority — cheapest and most-structural checks first, persistence last. For originate: invalid-ref / invalid-genesis-type → storage-failure. For transfer: not-known → archived → invalid-ref → storage-failure. For transform: not-known → archived → not-current-custodian → invalid-ref → invalid-descriptor → storage-failure. For disclose: not-known → archived → not-current-custodian → invalid-ref → storage-failure. For archive: not-known → already-archived → not-current-custodian → storage-failure. For read: not-known → invalid-query. A caller fixing one rejection class may receive a different rejection on retry; this is expected and not a regression.
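As an illustration of the priority order, the transform checks can be written as a sequence of early returns. This is a sketch under assumptions: the chain is a plain dictionary (or None when the chain_id is unknown), the function name transform_rejection is hypothetical, and storage-failure is not modeled because it can only arise after every precondition has passed.

```python
def transform_rejection(chain, custodian_ref, transformation_descriptor):
    """Return the highest-priority rejection reason for transform, or None."""
    if chain is None:
        return "not-known"
    if chain["chain_state"] == "Archived":
        return "archived"
    if custodian_ref != chain["current_custodian"]:
        return "not-current-custodian"
    if not custodian_ref.strip():
        return "invalid-ref"
    if not transformation_descriptor.strip():
        return "invalid-descriptor"
    return None


# A call that violates three preconditions reports only the first in priority order.
archived_chain = {"chain_state": "Archived", "current_custodian": "pharm-hosp-9"}
print(transform_rejection(archived_chain, "manuf-lab-7", "   "))  # archived
```

One consequence worth noting: because current_custodian is never empty, a blank custodian_ref fails the equality check first, so under this ordering it appears to surface as not-current-custodian rather than invalid-ref.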
Behavior
- Entries are durable on success. Once the caller receives an entry_id (or a chain_id from originate), the entry is in the chain and will appear in subsequent reads.
- The chain is append-only. Entries accumulate; no entry is removed or altered after it is written. Chain state (Open/Archived) changes; entry fields never do.
- Custody is continuous. At every point from genesis to archive, current_custodian is set, non-null, and names exactly one custodian. It is impossible for the chain to be in a state where custody is held by nobody or by two parties simultaneously. There is no action that sets current_custodian to null, no action that accepts custody transfers without immediately designating a new holder, and no action other than transfer that changes current_custodian.
- Transfers are hand-to-hand. The transfer action reads from_custodian_ref from chain state; the caller supplies only to_custodian_ref. This structural choice is load-bearing: it makes it impossible for a transfer entry to record a false from_custodian_ref, because the chain’s own state is the authoritative source of who the outgoing custodian is. A caller who is not the current custodian cannot manufacture a transfer from a prior holder; they can only call transfer, which will correctly attribute the outgoing side to whoever current_custodian names at that moment.
- Only the current custodian can transform, disclose, or archive. The not-current-custodian guard on transform, disclose, and archive enforces that only the named holder of record can record actions that implicate the artifact while under their custody. A prior custodian who transferred the artifact away has no further write authority over this chain.
- Archived is terminal and absorbing. Once the chain enters Archived, no action will append to it. The chain remains readable indefinitely; it simply accepts no new entries.
- Wall-time is best-effort. recorded_at is the wall-time clock at the receiving system when the entry is written. Under an unreliable or adversarial clock, recorded_at may not be monotonic; sequence_number is the authoritative order source. Callers and auditors must use sequence_number for ordering, not recorded_at.
- Medium-agnostic. artifact_ref is opaque. The chain is structurally identical whether it tracks a physical pharmaceutical vial, a digital file, a legal document, or a forensic specimen. The host system decides what the reference means; the atom enforces custody continuity regardless.
- Reads do not modify state. read is a pure query. It returns entries in sequence_number ascending order and leaves the chain unchanged.
Feedback
- After originate: a new chain is Open in the chain store, with a genesis entry at sequence_number = 1. current_custodian is set to the genesis custodian. chain_id is returned. Chain count and total entry count each increase.
- After transfer: a new transferred entry exists in the chain. current_custodian is updated to to_custodian_ref. sequence_number of the new entry is strictly greater than the previous entry’s. entry_id is returned.
- After transform: a new transformed entry exists in the chain. current_custodian is unchanged. entry_id is returned.
- After disclose: a new disclosed entry exists in the chain. current_custodian is unchanged. entry_id is returned.
- After archive: a new archived entry exists in the chain. The chain transitions to Archived. current_custodian holds the archiving custodian and does not change further. entry_id is returned. No further entries are accepted.
- After a rejected action: an observable refusal with a named reason. No chain state or entry state changes.
Each observable action produces a countable effect: entry count increases; current_custodian changes only on transfer; chain state changes only on archive.
Invariants
The following invariants (conditions that must always hold, regardless of what sequence of actions has occurred) constitute the verification surface of the pattern:
Invariant 1 — Entry immutability. Once an entry is recorded in the chain, its entry_id, sequence_number, event_type, custodian_ref (or from_custodian_ref/to_custodian_ref for transfers), recorded_at, and all event-type-specific fields never change.
Invariant 2 — Append-only chain. No entry is ever removed from the chain and no entry is reordered. The entry set for any chain is monotonically non-decreasing for the lifetime of the chain. The atom provides no deletion or reorder surface.
Invariant 3 — Single origin. Every chain has exactly one genesis entry (event_type originated or received), and it occupies sequence_number = 1 — the minimum in the chain. No re-origination is possible; originate is only available for a new chain. Every non-genesis entry has a predecessor (a prior entry with a strictly smaller sequence_number).
Invariant 4 — Custody continuity (no gap). This is the load-bearing invariant. At every point from genesis until archive:
- The chain has exactly one current custodian (current_custodian is non-null and non-empty).
- Every transform, disclose, and archive entry is attributed to the then-current custodian; that is, the custodian_ref on such an entry equals the current_custodian that was in effect when the entry was written.
- Every transferred entry records from_custodian_ref equal to the current_custodian immediately prior to that entry, and to_custodian_ref becomes the new current_custodian immediately after that entry.
- There is no state reachable from any sequence of valid actions in which custody is held by nobody or by two parties simultaneously.
Event Log (see atoms/event-log.md) cannot express this invariant. Event Log is a content-agnostic stream with no subject and no custodian; it permits sequence gaps by design. Invariant 4 is what makes Provenance a distinct freestanding concept rather than a configured Event Log.
Invariant 5 — Total order within chain. For any two distinct entries e1 and e2 in the same chain, exactly one of e1.sequence_number < e2.sequence_number or e1.sequence_number > e2.sequence_number holds. sequence_number is the authoritative order source; recorded_at is best-effort only and may not be monotonic under unreliable clocks.
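A short sketch of why ordering must come from sequence_number rather than recorded_at. The entries and timestamps below are hypothetical, constructed so that the wall clock steps backward before the second entry is written; has_total_order is an illustrative helper name.

```python
from datetime import datetime, timezone

# Hypothetical entries: the wall clock stepped backward before e2 was written.
entries = [
    {"entry_id": "e1", "sequence_number": 1,
     "recorded_at": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"entry_id": "e2", "sequence_number": 2,
     "recorded_at": datetime(2024, 1, 1, 11, 59, tzinfo=timezone.utc)},
    {"entry_id": "e3", "sequence_number": 3,
     "recorded_at": datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc)},
]


def has_total_order(chain_entries):
    """Invariant 5 check: distinct entries never share a sequence_number."""
    numbers = [e["sequence_number"] for e in chain_entries]
    return len(numbers) == len(set(numbers))


by_sequence = [e["entry_id"] for e in sorted(entries, key=lambda e: e["sequence_number"])]
by_wall_time = [e["entry_id"] for e in sorted(entries, key=lambda e: e["recorded_at"])]

print(by_sequence)   # ['e1', 'e2', 'e3']
print(by_wall_time)  # ['e2', 'e1', 'e3']
```

Sorting by recorded_at places e2 ahead of the genesis entry, which would make a replay of custody meaningless; sorting by sequence_number is unaffected by the clock.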
Invariant 6 — Archived is terminal and absorbing. Once a chain enters Archived, no entry is accepted. The archive, transfer, transform, and disclose actions all reject against an Archived chain. The chain is readable; it simply admits no new entries. The Archived state is permanent: no action transitions the chain out of Archived.
Invariant 7 — Custodian presence. Every entry carries at least one non-empty, non-whitespace custodian_ref. For transferred entries, both from_custodian_ref and to_custodian_ref are non-empty. For all other entry types, the single custodian_ref is non-empty. There is no entry with an anonymous or unattributed custodian.
Invariant 8 — Event type validity. Every entry’s event_type is exactly one of {originated, received, transferred, transformed, disclosed, archived}. Every genesis entry’s event_type is originated or received, and no non-genesis entry carries event_type originated or received. The final entry of an Archived chain has event_type = archived. The genesis_type argument to originate is exactly one of {originated, received} and determines the genesis entry’s event_type; it is not stored as a field distinct from event_type.
Invariant 9 — No id reuse. No two entries in the same chain share an entry_id. No two chains in the same store share a chain_id. Once assigned, these identifiers are permanent and stable.
Invariant 10 — Chain and store durability. Chain records and entry records are never deleted from the store. The total chain count is monotonically non-decreasing; the total entry count is monotonically non-decreasing. The next_sequence_number counter is part of the chain’s persistent state and must survive restarts; a volatile implementation that resets the counter on restart violates Invariant 5 across the lifetime of the chain. A storage-failure rejection guarantees no partial record is observable: the action either makes all its required writes durable (including current_custodian updates for transfer and chain-state updates for archive) or has no observable effect on the chain.
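One way to picture the all-or-nothing guarantee for transfer is to stage the change on a copy and commit only if the write succeeds. This is a sketch of one possible approach, not the atom’s prescribed mechanism. The persist callback and the StorageFailure exception are hypothetical stand-ins for whatever durable write a real store performs, the chain is a plain dictionary, and precondition checks are omitted.

```python
import copy


class StorageFailure(Exception):
    """Raised by the hypothetical persist callback when a write cannot be made durable."""


def atomic_transfer(chain, to_custodian_ref, persist):
    """Return (chain_after, result); on failure chain_after is the untouched original."""
    staged = copy.deepcopy(chain)
    seq = staged["next_sequence_number"]
    staged["entries"].append({
        "entry_id": f"e{seq}",
        "sequence_number": seq,
        "event_type": "transferred",
        # Outgoing custodian comes from chain state, not from the caller.
        "from_custodian_ref": staged["current_custodian"],
        "to_custodian_ref": to_custodian_ref,
    })
    staged["current_custodian"] = to_custodian_ref
    staged["next_sequence_number"] = seq + 1
    try:
        persist(staged)  # entry write and custodian update are committed together
    except StorageFailure:
        return chain, "storage-failure"
    return staged, staged["entries"][-1]["entry_id"]
```

Because the original chain object is never mutated, a failed write leaves current_custodian, the entry list, and next_sequence_number exactly as they were, so a failed transfer does not consume a sequence number.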
Examples
Pharmaceutical — drug sample chain of custody
A pharmaceutical manufacturer originates a batch: originate(artifact_ref: "batch-x91", custodian_ref: "manuf-lab-7", genesis_type: originated) → chain_id: "chain-0041". The manufacturer ships to a regional distributor: transfer(chain_id: "chain-0041", to_custodian_ref: "dist-region-3") → entry_id: "e2". The entry records from_custodian_ref: "manuf-lab-7" (read from chain state) and to_custodian_ref: "dist-region-3". The distributor stores, then ships to a hospital pharmacy: transfer("chain-0041", "pharm-hosp-9") → entry_id: "e3". The pharmacist dispenses to the dispensing cart: transform("chain-0041", "pharm-hosp-9", "dispensed 10mg dose into dispensing unit D44") → entry_id: "e4". The pharmacist archives the chain after the dose is administered: archive("chain-0041", "pharm-hosp-9") → entry_id: "e5". Chain is now Archived.
A regulator asks: “Prove unbroken custody of batch-x91 from manufacture to dispensing.” read("chain-0041") → five entries in sequence_number order. The from_custodian_ref of every transferred entry equals the to_custodian_ref of the most recent preceding transferred entry (or the genesis custodian_ref for the first transfer). Invariant 4 provides the structural answer.
Legal evidence — physical exhibit
A detective collects a physical exhibit at the crime scene: originate(artifact_ref: "exhibit-A", custodian_ref: "det-r.james", genesis_type: originated) → chain_id: "chain-0107". The detective delivers the exhibit to the evidence room: transfer("chain-0107", "evid-room-pd") → entry_id: "e2". The evidence room sends to the forensic lab: transfer("chain-0107", "forensic-lab-12") → entry_id: "e3". The lab documents the analysis: transform("chain-0107", "forensic-lab-12", "fingerprint-lifted; DNA-sample-taken; original-exhibit-intact") → entry_id: "e4". The exhibit is transferred back to the evidence room: transfer("chain-0107", "evid-room-pd") → entry_id: "e5". The chain remains Open pending trial.
At trial, defense counsel challenges: “Can you prove the exhibit was not tampered with between the detective and the lab?” read("chain-0107") → five entries. Every transferred entry’s from_custodian_ref matches the prior holder; the transformed entry is attributed to forensic-lab-12, who held it at that time (Invariant 4). Defense counsel’s claim — that someone outside the chain handled the exhibit — has no structural basis in the records.
Rejection paths
Attempt to transform when not current custodian. After the pharmaceutical transfer above, the original manufacturer attempts to record a transformation: transform(chain_id: "chain-0041", custodian_ref: "manuf-lab-7", transformation_descriptor: "added label update") → rejected(not-current-custodian). current_custodian is pharm-hosp-9; manuf-lab-7 no longer holds the artifact. No entry is written.
Attempt to append to an Archived chain. After the pharmaceutical chain is archived, a downstream system attempts another transfer: transfer(chain_id: "chain-0041", to_custodian_ref: "disposal-unit-1") → rejected(archived). No entry is written; the chain remains in its terminal state.
Attempt to originate with an empty custodian_ref. A host system calls originate(artifact_ref: "sample-99", custodian_ref: "", genesis_type: originated) → rejected(invalid-ref). No chain is created; no chain_id is returned.
Attempt to read a non-existent chain. A query arrives for a chain_id that was never issued: read("chain-9999") → rejected(not-known). The atom has no chain under that id.
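The four rejection paths above can be sketched as precondition checks that run before any write. This is a hedged illustration: the ChainState record, the dictionary-backed store, the function names, and in particular the order in which reasons are evaluated are assumptions for the example, not the atom's specified rejection priority.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ChainState:
    current_custodian: str
    archived: bool = False


def is_blank(value: Optional[str]) -> bool:
    return value is None or not value.strip()


def check_originate(artifact_ref: str, custodian_ref: str) -> Optional[str]:
    """Return a rejection reason, or None when originate may proceed."""
    if is_blank(artifact_ref) or is_blank(custodian_ref):
        return "invalid-ref"
    return None


def check_read(store: Dict[str, ChainState], chain_id: str) -> Optional[str]:
    return None if chain_id in store else "not-known"


def check_transfer(store: Dict[str, ChainState], chain_id: str,
                   to_custodian_ref: str) -> Optional[str]:
    chain = store.get(chain_id)
    if chain is None:
        return "not-known"
    if chain.archived:
        return "archived"
    if is_blank(to_custodian_ref):
        return "invalid-ref"
    return None


def check_transform(store: Dict[str, ChainState], chain_id: str,
                    custodian_ref: str, descriptor: str) -> Optional[str]:
    chain = store.get(chain_id)
    if chain is None:
        return "not-known"
    if chain.archived:
        return "archived"
    if custodian_ref != chain.current_custodian:
        return "not-current-custodian"
    if is_blank(descriptor):
        return "invalid-descriptor"
    return None


# Hypothetical store state: chain-0041 is held by the hospital pharmacy.
store = {"chain-0041": ChainState(current_custodian="pharm-hosp-9")}

print(check_transform(store, "chain-0041", "manuf-lab-7", "added label update"))
store["chain-0041"].archived = True
print(check_transfer(store, "chain-0041", "disposal-unit-1"))
print(check_originate("sample-99", ""))
print(check_read(store, "chain-9999"))
```

Each check returns a reason without touching the store, which matches the "no entry is written" outcome stated for every rejection.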
Regulated adversarial scenarios
Three scenarios the atom must survive in regulated contexts:
Regulator audit — prove unbroken custody of pharmaceutical sample
A pharmaceutical regulator (FDA — US Food and Drug Administration — the federal agency regulating drugs and medical devices) audits an inspected facility and asks: “Produce the complete chain of custody for sample batch-x91, and prove that custody was unbroken from manufacture to dispensing under 21 CFR (Code of Federal Regulations — the codification of US federal agency rules) Part 211.” The auditor calls read("chain-0041") and receives the complete ordered entry sequence. The auditor verifies: (a) exactly one genesis entry at sequence_number = 1; (b) for every transferred entry, from_custodian_ref equals the to_custodian_ref of the most recent preceding transferred entry (or the genesis custodian_ref for the first transfer); (c) every transform and disclose entry’s custodian_ref equals the current_custodian in effect at that point in the sequence; (d) the final entry is archived. Condition (a) holds by Invariant 3 (single origin); (b) and (c) hold by Invariant 4 (custody continuity); (d) is backed by Invariant 6 (archived is terminal), which guarantees that nothing follows the archived entry. The auditor’s answer comes from the records alone, not from the facility’s assertion that custody was maintained.
Disputed transaction — defense claims the artifact passed through an unrecorded handler
Defense counsel in a criminal trial claims that a piece of physical evidence was handled by an undocumented party between the forensic lab and the evidence room, and that the chain was therefore broken. The investigator provides the complete entry sequence from read("chain-0107"). For every transferred entry, from_custodian_ref equals the prior holder; no entry’s from_custodian_ref names a party who was not the immediately prior to_custodian_ref. This structural rebuttal rests on two invariants working together: Invariant 4 (custody continuity — every transfer’s from_custodian_ref is read from chain state and cannot be forged) and Invariant 7 (custodian presence — every entry names a non-empty custodian). Defense counsel cannot point to a gap in the sequence, because the chain’s structure makes a gap impossible: if no entry records a transfer to a hypothetical intermediary, then no intermediary was ever the current custodian, and no intermediary could have recorded a transformation or generated a subsequent transfer.
Breach or incident investigation — reconstruct who held the artifact during an anomaly window
An internal investigation suspects that a controlled substance was improperly handled between two dates. The investigator calls read("chain-0041", {recorded_at_range: ["2026-03-01", "2026-03-15"]}) to retrieve entries within the anomaly window. The ordered sequence of entries, with sequence_number as the authoritative order source, reveals: who held the artifact (current_custodian derivable from each point), what transformations were recorded, and to whom disclosures were made. The investigator can determine whether any entry during the window carries an unexpected custodian, an unexplained transformation, or a disclosure to an unauthorized recipient. Invariant 5 (total order) guarantees the sequence can be replayed unambiguously; Invariant 4 (custody continuity) means the reconstructed custodian state at any point in the window is exactly determined by reading the preceding entries — there is no ambiguity about who held the artifact at any moment.
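The window reconstruction described above can be sketched as a replay in sequence_number order that tracks the holder and reports the entries whose recorded_at falls inside the window. The dictionary entry shape, the function names, and the recorded_at dates below are hypothetical, chosen only to illustrate the technique. Note that the holder is always derived from the replay, while recorded_at is used only to select which entries to report.

```python
from datetime import date
from typing import List, Optional, Tuple


def holder_after(entry: dict, holder: Optional[str]) -> Optional[str]:
    """Custodian in effect once this entry has been applied."""
    if entry["event_type"] in ("originated", "received"):
        return entry["custodian_ref"]
    if entry["event_type"] == "transferred":
        return entry["to_custodian_ref"]
    return holder  # transform / disclose / archive do not move custody


def custody_in_window(entries: List[dict], start: date, end: date
                      ) -> Tuple[Optional[str], List[Tuple[int, str, Optional[str]]]]:
    holder = None
    holder_at_start = None
    in_window = []
    # sequence_number, not recorded_at, is the order source.
    for entry in sorted(entries, key=lambda e: e["sequence_number"]):
        holder = holder_after(entry, holder)
        recorded = date.fromisoformat(entry["recorded_at"])
        if recorded < start:
            holder_at_start = holder
        elif recorded <= end:
            in_window.append((entry["sequence_number"], entry["event_type"], holder))
    return holder_at_start, in_window


# Hypothetical timestamps for chain-0041; the source gives no dates for it.
entries = [
    {"sequence_number": 1, "event_type": "originated", "recorded_at": "2026-02-20",
     "custodian_ref": "manuf-lab-7"},
    {"sequence_number": 2, "event_type": "transferred", "recorded_at": "2026-03-02",
     "from_custodian_ref": "manuf-lab-7", "to_custodian_ref": "dist-region-3"},
    {"sequence_number": 3, "event_type": "transferred", "recorded_at": "2026-03-10",
     "from_custodian_ref": "dist-region-3", "to_custodian_ref": "pharm-hosp-9"},
    {"sequence_number": 4, "event_type": "transformed", "recorded_at": "2026-03-12",
     "custodian_ref": "pharm-hosp-9"},
    {"sequence_number": 5, "event_type": "archived", "recorded_at": "2026-03-20",
     "custodian_ref": "pharm-hosp-9"},
]

at_start, timeline = custody_in_window(entries, date(2026, 3, 1), date(2026, 3, 15))
print("holder entering window:", at_start)
for seq, kind, holder in timeline:
    print(seq, kind, "holder:", holder)
```

With these sample dates the artifact enters the window in the manufacturer's custody, and the window contains two transfers and one transformation.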
Generation acceptance
A derived implementation of Provenance is acceptable — in the regulator-acceptance sense — when an external auditor, given the chain store and its entries, can do all of the following without recourse to source code, runbooks, or developer narration:
- Verify every entry is custodian-attributed. For every entry in every chain, confirm that at least one non-empty custodian_ref is present (both from_custodian_ref and to_custodian_ref for transferred entries). An entry with an empty or absent custodian is a conformance failure (Invariant 7).
- Verify single-origin and genesis placement. For every chain, confirm that exactly one entry has sequence_number = 1 and event_type ∈ {originated, received}. No chain has two genesis entries; no genesis entry has a sequence_number greater than 1. A chain with no genesis entry or with more than one genesis entry is a conformance failure (Invariant 3).
- Verify custody continuity. Replay the entries in sequence_number ascending order and maintain a running current_custodian cursor. For every transferred entry, confirm that the recorded from_custodian_ref equals the cursor’s current value before the entry; update the cursor to to_custodian_ref. For every transform, disclose, and archive entry, confirm that the entry’s custodian_ref equals the cursor’s current value at that point. A discrepancy at any entry is a conformance failure (Invariant 4).
- Verify chain order is reconstructable from sequence_number alone. Sort the entries for a chain by sequence_number and confirm the result is a strictly increasing sequence with no gaps and no duplicates. The auditor does not rely on recorded_at for order. A non-strictly-increasing sequence_number sequence is a conformance failure (Invariant 5).
- Verify archived chains accept no later entries. For every chain in Archived state, confirm that no entry has a sequence_number or recorded_at later than the archived entry. An entry in an Archived chain after its archived marker is a conformance failure (Invariant 6).
- Verify no entry mutated. For a set of known entry_ids (or chain_ids) previously retrieved, re-query and confirm that all fields match the prior values exactly. A field that changes between reads is a conformance failure (Invariant 1).
This is the generator’s contract: any code generated from this atom must produce chains and a read surface that pass all six checks. The bar is the regulator’s question — “can you prove unbroken custody of this artifact from origin to disposition, from the records alone?” — answered structurally, not procedurally.
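The first five checks lend themselves to a single replay pass over one chain's entries. The following is a minimal auditor sketch under stated assumptions: entries are plain dictionaries, event_type values follow the lifecycle names used in this document (originated, received, transferred, transformed, disclosed, archived), and audit_chain and its failure strings are hypothetical names. The mutation check (Invariant 1) needs two reads over time and is not covered here.

```python
from typing import List

GENESIS = {"originated", "received"}


def _blank(value) -> bool:
    return value is None or not str(value).strip()


def audit_chain(entries: List[dict]) -> List[str]:
    """Return conformance failures for one chain; an empty list means it passes."""
    failures = []
    ordered = sorted(entries, key=lambda e: e["sequence_number"])

    # Invariant 5: strictly increasing, no gaps, no duplicates.
    if [e["sequence_number"] for e in ordered] != list(range(1, len(ordered) + 1)):
        failures.append("Invariant 5: sequence_number is not 1..n without gaps")

    # Invariant 3: exactly one genesis entry, placed at sequence_number 1.
    genesis = [e for e in ordered if e["event_type"] in GENESIS]
    if len(genesis) != 1 or genesis[0]["sequence_number"] != 1:
        failures.append("Invariant 3: expected exactly one genesis entry at 1")

    cursor = None          # running current_custodian
    archived_seen = False
    for e in ordered:
        seq, kind = e["sequence_number"], e["event_type"]
        if archived_seen:
            failures.append(f"Invariant 6: entry {seq} follows the archived marker")
        if kind == "transferred":
            if _blank(e.get("from_custodian_ref")) or _blank(e.get("to_custodian_ref")):
                failures.append(f"Invariant 7: entry {seq} lacks a custodian")
            if e.get("from_custodian_ref") != cursor:
                failures.append(f"Invariant 4: entry {seq} has a false predecessor")
            cursor = e.get("to_custodian_ref")
        else:
            if _blank(e.get("custodian_ref")):
                failures.append(f"Invariant 7: entry {seq} lacks a custodian")
            if kind in GENESIS:
                cursor = e.get("custodian_ref")
            elif e.get("custodian_ref") != cursor:
                failures.append(f"Invariant 4: entry {seq} not by current custodian")
            if kind == "archived":
                archived_seen = True
    return failures


exhibit = [
    {"sequence_number": 1, "event_type": "originated", "custodian_ref": "det-r.james"},
    {"sequence_number": 2, "event_type": "transferred",
     "from_custodian_ref": "det-r.james", "to_custodian_ref": "evid-room-pd"},
    {"sequence_number": 3, "event_type": "transferred",
     "from_custodian_ref": "evid-room-pd", "to_custodian_ref": "forensic-lab-12"},
    {"sequence_number": 4, "event_type": "transformed", "custodian_ref": "forensic-lab-12"},
    {"sequence_number": 5, "event_type": "transferred",
     "from_custodian_ref": "forensic-lab-12", "to_custodian_ref": "evid-room-pd"},
]
print(audit_chain(exhibit))
```

The sketch reads only stored entry fields, in keeping with the requirement that the auditor needs no source code, runbooks, or developer narration.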
Edge cases and explicit non-goals
- Non-repudiable custodian identity. The atom records custodian_ref values as opaque references and enforces structural continuity — but it does not verify that the supplied custodian_ref is a real, credentialed party or that the caller is who they claim to be. Non-repudiable identity attestation — a verifiable binding of a custodian_ref to a real actor’s credential — composes with Actor Identity. Without that composition, the chain records structural continuity; with it, the chain is also attributable in the non-repudiation sense.
- Disclosure scope and authority. Provenance’s disclose records only the custody-timeline fact that a disclosure occurred — which custodian disclosed, to which recipient, at which point in the chain. It does not record what subset of the artifact’s data was disclosed or under what authority. That scope-and-authority record belongs to Selective Disclosure — a durable, append-only record of what scope of data was shared, to whom, under what authority, and when. The two compose without overlap: Provenance places the disclosure on the artifact’s custody timeline; Selective Disclosure records the disclosure’s content and legal basis. Provenance deliberately does not duplicate the scope/authority surface, so the disclose event carries only recipient_ref and the custody-position fields — a deployment needing disclosure-scope accounting composes the two atoms.
- Pre-genesis external custody (received). When a chain opens with received genesis type, the artifact had a custody history outside this system before intake. The chain documents custody only from the genesis (intake) entry forward; it makes no claim about — and provides no record of — custody before genesis. Custody continuity (Invariant 4) is a guarantee from genesis onward, not an assertion that nothing happened to the artifact beforehand. Pre-intake provenance, where required, is a separate chain or an external record the host links via the genesis metadata.
- Cryptographic tamper-evidence on the chain. The atom guarantees immutability and continuity by specification; it does not prevent an adversary with write access to the underlying store from rewriting entries. Cryptographic hash chaining, Merkle tree commitment, and external timestamping belong to Tamper Evidence. SEC Rule 17a-4 (requiring records to be preserved in a non-rewriteable, non-erasable format) is satisfied at the chain-store layer; Tamper Evidence provides the audit proof.
- Retention and defensible disposal of the chain. How long the chain must be kept and how it may be destroyed are governed by Retention Window and Defensible Retention. The atom retains all chains and entries indefinitely from its own perspective; retention policy is the composing concept.
- The full chain-of-custody surface. The complete chain-of-custody guarantee — attribution (non-repudiable custodians via Actor Identity), structural continuity (this atom), tamper-evidence (Tamper Evidence), and retention (Retention Window / Defensible Retention) — is the Chain of Custody composition (C12). Provenance is the core primitive C12 composes.
- DAG-style derivation and artifact splitting. W3C PROV’s (Provenance Data Model — W3C’s directed-acyclic-graph representation of provenance with wasDerivedFrom, wasGeneratedBy, and used relationships) wasDerivedFrom relationship — where one artifact is produced by transforming or combining others — is an explicit non-goal. This atom is a linear single-artifact chain; branching and convergence are out of scope. A pharmaceutical sample aliquoted into five sub-samples would require five new chains, each originating with received genesis type, each referencing its own artifact_ref; the relationship between the parent chain and the child chains is a composing concept outside this atom. The linearity constraint is what makes custody continuity well-defined; a DAG model does not have a single current_custodian.
- Multi-party simultaneous custody. Dual-custody (where two parties must jointly hold the artifact), escrow, and similar shared-custody arrangements are out of scope. The atom’s state machine has exactly one current_custodian at all times. Composing patterns that require multi-party custody gates (e.g., dual-control access to a safe-deposit box) must model joint custody as a composition above this atom.
- Physical vs. digital medium. The atom is medium-agnostic; artifact_ref is opaque. Whether the tracked entity is a physical vial, a digital file, a signed document, or a forensic sample belongs to the host system.
- Concurrent transfer attempts. Two callers simultaneously attempting to transfer the same chain must be serialized by the underlying implementation. The first transfer wins; the second will observe a changed current_custodian and should either succeed (if the new custodian is the intended recipient) or fail (if the race was unintentional). The atom does not provide a compare-and-swap surface for conditional transfer; host systems that need optimistic-concurrency transfer semantics must compose with a Transaction or Idempotent Reservation pattern.
- Transfer from field supplied by caller. The transfer action deliberately does not accept a from_custodian_ref parameter. The outgoing custodian is always read from current_custodian in chain state. A caller who wants to confirm the outgoing custodian before executing a transfer, for example to pre-validate it, should read the chain’s current state first. The structural constraint is load-bearing: allowing a caller-supplied from_custodian_ref would open the door to a false predecessor attack (recording a transfer as coming from a party who was not the current custodian). The hand-to-hand guarantee closes this attack surface structurally.
- Atomic writes for transfer and archive. The transfer action requires two durable writes: the new entry and the current_custodian update. The archive action requires two durable writes: the archived entry and the chain-state update to Archived. A crash between writes leaves the store in an inconsistent state — an entry without a corresponding state update, or a state update without a corresponding entry. Resolving mid-transition crashes is out of scope for this atom; implementations must provide atomic transaction support across both writes, or a crash-recovery scan that detects and repairs dangling transitions on restart. Per the Decision points, a storage-failure response from transfer or archive guarantees no partial write is observable; the chain remains in its prior state.
- Clock semantics. recorded_at is captured from the implicit clock. Clock skew, timezone handling, daylight-saving transitions, and monotonicity are handled at the deployment layer. For use cases where custodial timestamps have legal force (chain-of-custody timestamps in court proceedings, pharmaceutical distribution records), the implementation must source time from a trustworthy clock; a Trusted Timestamping composition (per RFC 3161 — the Internet standard defining a trusted time-stamping protocol) provides the verifiable time anchor. The atom’s ordering guarantees rest on sequence_number, not on recorded_at; no invariant is at risk from a bad clock.
- artifact_ref validation. The atom does not validate artifact_ref against an external artifact registry; it only requires that the reference be non-empty. Whether a given artifact_ref names a real, active artifact in the host system is the host system’s responsibility.
- Empty transformation_descriptor. A transformation_descriptor that consists solely of whitespace is treated as empty and returns rejected(invalid-descriptor) — a dedicated reason distinct from invalid-ref, because a descriptor is content, not a reference. An opaque transformation that cannot be described at all should be treated as a gap in the chain’s description — not a valid transformation entry. Implementations must check for visible content before accepting.
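The atomic-write and concurrent-transfer edge cases can be illustrated with a transactional store. The sketch below uses SQLite from the Python standard library; the table layout, column names, open_store and transfer functions, and the crash_between_writes test hook are all hypothetical and not part of the atom. BEGIN IMMEDIATE takes the write lock before the state is read, so competing transfers are serialized, and the entry insert and the current_custodian update either commit together or not at all.

```python
import sqlite3


class Rejected(Exception):
    """Carries the rejection reason."""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None: transactions are controlled explicitly below.
    db = sqlite3.connect(path, isolation_level=None)
    db.executescript("""
        CREATE TABLE chains (
            chain_id TEXT PRIMARY KEY,
            current_custodian TEXT NOT NULL,
            next_sequence_number INTEGER NOT NULL,
            state TEXT NOT NULL DEFAULT 'Open'
        );
        CREATE TABLE entries (
            chain_id TEXT NOT NULL,
            sequence_number INTEGER NOT NULL,
            event_type TEXT NOT NULL,
            custodian_ref TEXT,
            from_custodian_ref TEXT,
            to_custodian_ref TEXT,
            PRIMARY KEY (chain_id, sequence_number)
        );
    """)
    return db


def transfer(db, chain_id, to_custodian_ref, crash_between_writes=False):
    db.execute("BEGIN IMMEDIATE")  # write lock first: serializes concurrent callers
    try:
        row = db.execute(
            "SELECT current_custodian, next_sequence_number, state "
            "FROM chains WHERE chain_id = ?", (chain_id,)).fetchone()
        if row is None:
            raise Rejected("not-known")
        holder, seq, state = row
        if state == "Archived":
            raise Rejected("archived")
        # Write 1: the entry, with the outgoing custodian read from chain state.
        db.execute(
            "INSERT INTO entries (chain_id, sequence_number, event_type, "
            "from_custodian_ref, to_custodian_ref) VALUES (?, ?, 'transferred', ?, ?)",
            (chain_id, seq, holder, to_custodian_ref))
        if crash_between_writes:  # test hook simulating a failure mid-transition
            raise OSError("simulated storage failure")
        # Write 2: the chain state, including the persistent sequence counter.
        db.execute(
            "UPDATE chains SET current_custodian = ?, next_sequence_number = ? "
            "WHERE chain_id = ?", (to_custodian_ref, seq + 1, chain_id))
        db.execute("COMMIT")
        return seq
    except Exception:
        db.execute("ROLLBACK")  # neither write becomes observable
        raise


db = open_store()
db.execute("INSERT INTO chains (chain_id, current_custodian, next_sequence_number) "
           "VALUES ('chain-0041', 'manuf-lab-7', 2)")
db.execute("INSERT INTO entries (chain_id, sequence_number, event_type, custodian_ref) "
           "VALUES ('chain-0041', 1, 'originated', 'manuf-lab-7')")

try:
    transfer(db, "chain-0041", "dist-region-3", crash_between_writes=True)
except OSError:
    pass  # the chain is still in its prior state

print(transfer(db, "chain-0041", "dist-region-3"))
```

Keeping next_sequence_number in the same row as current_custodian means the counter is rolled back with the failed transfer and persisted with the successful one, so a restart does not reset it.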
Composition notes
Provenance is freestanding and is the single-artifact custody primitive that several composing patterns build on:
- Chain of Custody (C12) — the primary composition naming Provenance. Chain of Custody wires Provenance + Actor Identity + Tamper Evidence + Retention Window to produce the full attribution-verified, cryptographically-sealed, retention-governed chain-of-custody surface. Provenance is the structural core; the three compliance atoms supply the per-entry attribution attestation, the chain tamper seal, and the retention clock. This composition is the canonical implementation of pharmaceutical chain of custody (21 CFR Part 211 / DEA 21 CFR Part 1304), regulated evidence custody (Federal Rules of Evidence 901(b)(9)), and financial instrument custody records (SEC Rule 17a-4). Chain of Custody (C12) is grounded (2026-06-04); its grounding resolved the forthcoming-link formerly carried in this Composition notes section.
- Immutable Transaction Ledger (C6) — grounded (2026-06-08). Provenance enriches C6 for ledger entries that represent tracked artifacts; chain-of-custody guarantees on ledger entries compose naturally where the artifact_ref references a financial instrument. C6 names this enrichment in its Single-artifact financial-instrument custody edge case.
- Data Subject Rights Fulfillment (C7) — a composing peer (not a constituent): where artifacts carry personal data, Provenance’s chain-of-custody record is a composing input for demonstrating lawful handling under GDPR (EU General Data Protection Regulation — Europe’s data-privacy law) Article 5 data-minimization and Article 30 records-of-processing-activity requirements.
- KYC / Customer Onboarding (C8) — chain-of-custody guarantees on identity-verification documents (passports, utility bills, biometric records) are a natural Provenance use case for KYC (Know Your Customer — the verification process for establishing customer identity under anti-money-laundering regulations) workflows; Provenance optionally enriches C8’s record surface with document-custody chains.
- Actor Identity — supplies non-repudiable attestation for custodian_ref values. Without Actor Identity, the chain records structural continuity but not verifiable custodian identity; with it, each custodian_ref has a binding proof.
- Tamper Evidence — cryptographically seals the entry chain so any rewrite is detectable from the records alone.
- Retention Window — governs the minimum and maximum retention period for the chain under applicable regulatory obligations.
- Selective Disclosure — records the scope and legal authority of each disclosure. Provenance’s disclose event marks where on the custody timeline a disclosure occurred and to whom; Selective Disclosure records what scope was shared and under what authority. The two compose for full disclosure accounting without either duplicating the other.
Standards references
Provenance is an infrastructure primitive with regulatory anchoring across pharmaceutical, legal, and financial domains:
- ISO 23081 (Information and documentation — Managing metadata for records) — the International Organization for Standardization’s standard on records-management metadata. Provenance is a required element in ISO 23081-compliant records; the atom’s chain-of-custody entries map directly to the origin, transfer, and transformation metadata elements ISO 23081 specifies.
- W3C PROV (Provenance Data Model — W3C’s RDF-based standard for representing provenance) — the atom models the linear single-artifact custody slice of PROV’s entity/activity/agent framework. PROV expresses a DAG of provenance relationships; this atom is the linear spine of a single-entity PROV graph — the wasGeneratedBy, used, and wasAttributedTo relationships along a single entity’s chain. wasDerivedFrom (DAG derivation) and artifact splitting (one entity split into several) are deliberate non-goals, both out of scope relative to the full PROV model.
- FDA 21 CFR Part 211 (Current Good Manufacturing Practice — Finished Pharmaceuticals) — US pharmaceutical manufacturing regulations requiring a chain-of-custody record for drug substances and products from manufacture through distribution. The atom’s genesis + transfer + transformation + archive lifecycle is the operational form of Part 211’s custodial recording requirements.
- DEA 21 CFR Part 1304 (Controlled Substance Inventory Records) — US Drug Enforcement Administration (DEA) regulations requiring complete, accurate records of the disposition of controlled substances, including every change of custody. Invariant 4 (custody continuity) is the structural implementation of this requirement.
- SEC Rule 17a-4 (Records to be preserved by certain exchange members, brokers, and dealers) — US Securities and Exchange Commission rule requiring records to be preserved as originally created, in a non-rewriteable, non-erasable format. The atom’s append-only, entry-immutable chain (Invariants 1 and 2) is the structural form of the preservation-as-originally-created requirement. Tamper Evidence composes to supply the WORM (write-once, read-many) store seal.
- Federal Rules of Evidence 901(b)(9) (Authenticating or Identifying Evidence — Process or System) — the US evidentiary rule for authenticating physical or electronic evidence via chain-of-custody records. The atom’s custody-continuity invariant is the structural basis for authenticating evidence under 901(b)(9): a chain whose from_custodian_ref values match the prior current_custodian at every transfer step produces the unbroken sequence courts require for authentication.
The cross-domain structural identity is the atom’s core thesis: the pharmaceutical chain of custody, the legal evidence chain, the financial instrument custody record, and the DEA controlled-substance custody log are all instances of the same primitive — one artifact, one current custodian, append-only entries, custody never gaps. This atom is the core of the Chain of Custody composition (C12), grounded 2026-06-04.
It inherits from:
- Daniel Jackson, The Essence of Software — the freestanding-atom posture; the discipline of composing identity attestation, tamper-evidence, and retention as separate concepts rather than absorbing them.
- Eiffel’s design-by-contract — preconditions on every action, every rejection reason named.
- Linear temporal logic — custody continuity (Invariant 4) and archived-is-terminal (Invariant 6) expressed as temporal properties holding across every reachable state.
Status
grounded on Final Critique 4 — 2026-06-04 (formal layer complete 2026-06-04 — Alloy model provenance.als + buggy twin verified in tools/harness/; see Lineage §Formal model). Sonnet-drafted against an Opus plan, then Opus-gated through Pass 1 (GRID structural), Pass 2 (EOS conceptual independence — the load-bearing boundary against Event Log, plus the disclose-vs-Selective-Disclosure boundary), Pass 3 (Linus adversarial), and a Final Critique round: two foundational findings and four refining findings, all closed in-pattern (see Lineage notes). Regulated-pattern conventions (Regulated adversarial scenarios; Generation acceptance) baked in from the first draft, inherited from the methodology directly per pressure-testing.md §Regulated-pattern conventions. The formal-layer vote was YES; the derived Alloy model (custody continuity / single-origin / archived-absorbing on a linear chain, mirroring clinical-observation.als) verifies green — twelve checks hold, five non-vacuity runs satisfiable — with a buggy twin the checker rejects on five checks, clearing the “model present” bar. The English cleared the 92%-good threshold (foundational findings at zero) and the formal layer is discharged, so the pattern is unqualified grounded. Under the unified methodology (3×3 baseline rounds with per-round Pass 1/2/3 numbering + Final Critique starting at Round 4), this pattern’s Opus-led gating review (Pass 1/2/3 + Final Critique round) is retro-labeled Final Critique 4; the original round-naming in the Lineage notes below is preserved as historical record.
Lineage notes
EOS Pass-2 boundary — conceptual independence against Event Log. The key conceptual-independence record for this atom: Provenance is NOT a configured or specialized Event Log. Event Log (see atoms/event-log.md) is a content-agnostic stream with no subject, no custodian, no continuity guarantee, and an explicitly-permitted sequence gap (a storage-failure consumes a sequence number; the next successful append receives a strictly higher number). The load-bearing distinction: Provenance maintains a current_custodian in chain state, updated atomically with every transfer entry, and every transform, disclose, and archive action is guarded against the then-current custodian. This state and these guards are properties of a subject-scoped chain; they cannot be expressed as invariants of a content-agnostic stream. The transfer action’s reading of from_custodian_ref from chain state (rather than accepting it as a caller-supplied argument) is the structural mechanism that makes custody continuity an invariant rather than a convention — a convention Event Log could encourage but not enforce. The EOS test: does this concern have its own state machine? Yes: Open → Archived, with current_custodian as a state variable that changes only on transfer. Does it recur across many domains? Yes: pharmaceuticals, legal evidence, financial instruments, controlled substances, digital files. Could the host concept be specified without this concern, with the concern composed in? No — custody continuity is definitional to Provenance, not an optional additive. Verdict: freestanding atom, not a wrapper around Event Log.
Seven further concerns named as composing patterns from the first draft. None absorbed in-pattern:
- Non-repudiable custodian identity → Actor Identity
- Cryptographic tamper-evidence on the chain → Tamper Evidence
- Retention and defensible disposal → Retention Window / Defensible Retention
- The full attribution+tamper+retention chain-of-custody surface → Chain of Custody (C12) (grounded 2026-06-04)
- DAG-style derivation / artifact splitting → explicit non-goal (linear chain by design)
- Multi-party simultaneous custody → explicit non-goal (single current custodian by design)
- Trusted timestamping for custodial timestamps → Trusted Timestamping (forthcoming)
Conventions inherited from prior work. The Regulated adversarial scenarios and Generation acceptance sections are inherited from the methodology directly per pressure-testing.md §Regulated-pattern conventions and spec-format.md §Regulated overlay. Not re-derived from predecessor atoms.
Formal-layer vote — 2026-06-04: YES (model pending). Invariant 4 (custody continuity — every transfer’s from_custodian_ref equals the prior current_custodian; every non-transfer action attributed to the then-current holder), Invariant 3 (single origin — exactly one genesis entry at sequence_number = 1), and Invariant 6 (archived-is-terminal — no entry accepted after the archived marker) are structural/relational claims on a chain with ordered entries and a tracked current-custodian cursor. These are Alloy-class properties: the load-bearing claims are relational (the from_custodian_ref of a transfer equals the to_custodian_ref of the prior transfer, or the genesis custodian_ref; the state after a sequence of entries has a uniquely determined current_custodian), and a linear-chain model with a custodian-cursor and hand-to-hand transfer semantics is well-matched to Alloy’s bounded exhaustive search over structural states. The model mirrors clinical-observation.als (linear chain, successor/predecessor relation, linear-chain fact). Until the model is authored and verifies, the pattern is draft (does not yet carry grounded (English) — formal layer pending because the prose passes themselves are also pending). Vote per pressure-testing.md §Formal models — The formal-layer vote. (Superseded 2026-06-04: the prose passes cleared and the Alloy model landed and verifies — see the two entries immediately below; this entry is preserved as the record of the vote as cast.)
Opus-led gating review — 2026-06-04 (Pass 1 GRID / Pass 2 EOS / Pass 3 Linus + Final Critique). Sonnet drafted against the Opus plan; Opus gated. Two foundational findings and four refining findings, all closed in-pattern (compact format: F-id — name — class → fix):
- F1 — disclose vs. Selective Disclosure boundary — foundational (Pass 2). The draft’s disclose event overlapped the existing Selective Disclosure atom (which already owns “what scope disclosed, to whom, under what authority”) with no boundary drawn — an EOS over-absorption risk. → Scoped Provenance’s disclose to the custody-timeline fact only (custodian, recipient, sequence position); named Selective Disclosure as the composing pattern for disclosure scope/authority in Edge cases and Composition notes. Provenance records no scope/authority field.
- F2 — current_custodian authoritative source — foundational (Pass 3). The draft treated current_custodian as primary chain state while also calling it derivable, leaving the authoritative source ambiguous on disagreement (store corruption). → current_custodian is now stated as a derived projection (cache) of the entry chain; the entry chain is authoritative; on disagreement the replayed value governs and the discrepancy is a conformance failure (Generation acceptance check 3). Every invariant is stated over the entry chain, not the cache.
- F3 — genesis_type redundancy — refining (Pass 3). The draft carried a stored genesis_type entry field duplicating event_type for genesis entries. → Removed as a stored field; genesis_type is an originate argument only, selecting the genesis entry’s event_type. State and Invariant 8 updated.
- F4 — invalid-ref misused for transformation_descriptor — refining (Pass 3). A whitespace transformation_descriptor returned invalid-ref, conflating a content field with a reference. → Added a dedicated invalid-descriptor rejection on transform; signature, decision point, rejection priority, and the edge case updated.
- F5 — store-instance dimension undescribed — refining (Pass 1 GRID System node). The draft did not describe the named-store-instance multiplicity (multiple chains per store; cross-store uniqueness a composing concern). → Added a Store instances paragraph to the Identity model, mirroring Event Log’s discipline.
- F6 — dangling composition links — refining (Pass 1 reference graph). Composition notes linked C6 (Immutable Transaction Ledger) and C7 (Data Subject Rights Fulfillment) as live links, but those files do not exist yet. → Demoted both to *(forthcoming)* (no live link); the KYC (C8) and Defensible Retention links remain live (those files exist).
No EOS over-absorption survives: the load-bearing Event Log boundary holds (Invariant 4 is not expressible on a content-agnostic stream), and the disclose/Selective-Disclosure boundary (F1) — the one absorption risk — was extracted. Pass 1 GRID otherwise clean; all nine nodes resolved. The English clears the 92%-good threshold (foundational findings at zero); the Alloy formal model is the remaining grounding prerequisite per the YES vote.
Formal model — 2026-06-04: Alloy authored and verified; pattern promoted to grounded. Derived model provenance.als + buggy twin provenance-buggy.als, checked via tools/harness/check.mjs (Alloy headless). What it checks: the custody chain modeled as a linear entry sequence with a per-entry holder (the current custodian in effect after the entry) and hand-to-hand transfer fields (fromC/toC). Twelve checks, all UNSAT: the linear backbone (Invariants 2/5 — at-most-one successor/predecessor, no branching, acyclic), single-origin typing (Invariant 3 — genesis iff genesis-type), the load-bearing custody continuity cluster (Invariant 4 — A_Inv4_CustodyUnbroken: incoming custody equals the predecessor’s outgoing at every link; A_Inv4_TransferHandToHand: a transfer’s fromC equals the prior holder — the false-predecessor attack surface; A_Inv4_TransferSetsHolder; A_Inv4_OnlyHolderActs: only the current holder may transform/disclose/archive; A_Inv4_UniqueHolder: custody is never held by nobody), archived-is-absorbing (Invariant 6), and custodian presence (Invariant 7). Five non-vacuity runs all SAT (single genesis, received genesis, transfer chain, the full Originated→Transferred→Transformed→Disclosed→Archived lifecycle, two independent chains). Buggy twin: drops the CustodyContinuity and ArchivedTerminal facts — re-introducing the custody-gap / false-predecessor hazard and the append-after-archive hazard. The checker rejects it: A_Inv4_CustodyUnbroken, A_Inv4_TransferHandToHand, A_Inv4_TransferSetsHolder, A_Inv4_OnlyHolderActs, and A_Inv6_ArchivedTerminal all find counterexamples, confirming those checks have teeth. Scope/saturation: scope 8 (mirrors clinical-observation.als); custody continuity is a local per-link relational property, insensitive to scope beyond a few entries — the longest chain in scope 8 is 8 entries, more than enough to exercise transfer→transfer→archive topologies. 
Conflict-protocol outcome: none — the model corroborates the English; canonical English unchanged. Reproduce: cd tools/harness && node check.mjs ../../atoms/provenance.als (and … provenance-buggy.als --buggy).