1. Crash edge case: If an agent executes a side-effect and dies before signing the receipt, is that action orphaned? Any WAL-style intent/completion model?
2. Multi-step workflows: Do receipts chain natively (parent pointers/Merkle) or via external linking? (I see storage/ledgers are out of scope, but curious about the linkage design.)
The negative proof angle (proving AI didn't touch prod) is compelling for compliance.
Threat model / scope:
This design assumes the signer’s private key is trusted at issuance time; it does not attempt to prove semantic correctness of the agent’s reasoning or inputs. The signature covers only the canonicalized signed_block; any mutation invalidates verification. Receipts are portable and verifiable offline but do not prevent a malicious issuer from signing false data (integrity primitive, not a truth oracle). Replay is detectable (e.g. via hash chaining or external indexing) but not prevented by the receipt alone. Confidentiality is out of scope; receipts are integrity-only artifacts. The goal is to make post-hoc tampering and log forgery detectable, not to replace policy enforcement or access control.
1. Crash edge case: If an agent executes a side-effect and dies before signing the receipt, is that action orphaned? Any WAL-style intent/completion model?
2. Multi-step workflows: Do receipts chain natively (parent pointers/Merkle) or via external linking? (I see storage/ledgers are out of scope, but curious about the linkage design.)
The negative proof angle (proving AI didn't touch prod) is compelling for compliance.