diff --git a/zkvm/docs/zkvm-mempool.md b/zkvm/docs/zkvm-mempool.md
new file mode 100644
index 000000000..e3385c94a
--- /dev/null
+++ b/zkvm/docs/zkvm-mempool.md
@@ -0,0 +1,241 @@
+# ZkVM mempool
+
+## Background
+
+A **memory pool** is a data structure maintained by each peer for managing _unconfirmed transactions_. It decides which transactions to accept from other peers and relay further.
+
+Generally, transactions are sorted by _feerate_: the amount of fees paid per byte. Nodes choose reasonable limits for their mempool sizes. As the mempool becomes full, the lowest-paying transactions are **evicted** from it. When a new block is created, it includes the highest-paying transactions. When nodes see a new block, they **clear** their mempools, removing the confirmed transactions.
+
+What if a transaction does not pay a high enough fee? At best, it is not going to be relayed anywhere.
+At worst, it is going to be relayed and dropped by some nodes, relayed again by others, and so on.
+
+This situation poses two problems:
+
+1. **Denial of service risk:** low-fee transactions that barely make it into the mempool can get re-relayed many times over, consuming network bandwidth, while the same fee is amortized over all the relay cycles, lowering the cost of the attack.
+2. **Stuck transactions:** since nodes reject double-spend attempts, the user may have to wait indefinitely until their low-fee transaction is either completely forgotten or finally published in a block.
+
+There are two ways to address stuck transactions:
+
+1. Replace the transaction with another one that spends the same outputs but pays a higher fee. This is known as **replace-by-fee** (RBF). This method has a practical downside for the user: the blinding factors must be re-communicated with the recipient when making the alternative tx. So in this implementation we do not support RBF at all.
+2. Create a chained transaction that pays a fee high enough to cover both itself and its parent. This is known as **child pays for parent** (CPFP). This is what is implemented here.
+
+The DoS risk is primarily limited by requiring that transactions pay not only for themselves, but also for
+the cost of relaying the transactions that are being evicted. The evicted transaction is now unlikely to be confirmed, so the cost of relaying it must be covered by some other transaction.
+
+There is an additional problem, though. After the mempool is partially cleared by a newly published block, a previously evicted transaction may come back and be relayed once again.
+At first glance this is not a problem, because someone's transaction that caused the eviction has already paid for the first relay. However, for the creator of the transaction a potentially unlimited number of relays comes at a constant (low) cost. This means the network may have to relay twice as much traffic due to such bouncing transactions, and the actual users of the network may need to pay twice as much.
+
+To address this issue, we need to efficiently **remember the evicted transaction**. Then, to accept it again, we require it to have an _effective feerate_ = _minimum feerate_ + _flat feerate_. If the transaction pays for itself, it is fine to accept it again. The only transaction likely to return again and again is one paying a very low fee, so the bump by the flat feerate forces it to be paid for via CPFP (parked until a higher-paying child arrives).
+
+## Definitions
+
+### Fee
+
+Amount paid by the transaction using the [`fee`](zkvm-spec.md#fee) instruction.
+Fee is denominated in [Values](zkvm-spec.md#value-type) with flavor ID = 0.
+
+### Feerate
+
+A **feerate** is the ratio of the [fee](#fee) to the size of the tx in bytes (`encoded_length()`).
+
+A feerate is stored as a pair of integers `fee / size` so that feerates can be accurately [combined](#combine-feerates).
+
+### Self feerate
+
+The sum of all fees paid by a transaction, as reflected in the [transaction log](zkvm-spec.md#fee-entry), divided by the size of the transaction.
+
+### Combine feerates
+
+Operation over multiple feerates that produces an average [feerate](#feerate), preserving the total size of the transactions involved:
+
+`Combine(feerate1, feerate2) = (fee1 + fee2) / (size1 + size2)`
+
+### Discount feerate
+
+Operation over a single feerate that reduces its weight when [combined](#combine-feerates) with the [parent transaction](#parent):
+
+`Discount(feerate, n) = floor(fee/n) / floor(size/n)`
+
+### Parent
+
+Transaction that produced an output spent by a given transaction; the given transaction is the parent's [child](#child).
+
+### Child
+
+Transaction that spends an output produced by a given transaction, which is its [parent](#parent).
+
+### Orphan
+
+Transaction that spends an output that does not exist.
+
+Orphans may be received because requests for transactions are spread evenly among the peers and may arrive in random order.
+This offers better use of bandwidth and simpler synchronization logic, but requires the node
+to track orphan transactions separately in [peerpools](#peerpool).
+
+### RBF
+
+"Replace by Fee". A policy that permits replacing one transaction with a conflicting one (spending one or more of the same outputs) if the replacement pays a higher [feerate](#feerate).
+This mempool implementation does not support any variant of RBF.
+
+### CPFP
+
+"Child Pays For Parent". A policy that prioritizes transactions by [effective feerate](#effective-feerate).
+
+### Total feerate
+
+A [feerate](#feerate) computed as a [combination](#combine-feerates) of the feerates of a transaction, all its [children](#child) and their children, recursively.
+
+### Effective feerate
+
+The maximum of the [self feerate](#self-feerate) and the [total feerate](#total-feerate).
+
+### Flat feerate
+
+The minimum [feerate](#feerate) that every transaction must pay to be included in the mempool. Configured per node.
+
+### Depth
+
+A transaction has a depth equal to the maximum depth of the outputs it spends.
+
+Confirmed outputs have depth 0.
+
+Unconfirmed outputs have the same depth as the transaction that created them.
+
+```
+0 ___ tx __ 1
+0 ___/ \__ 1 __ tx __ 2
+0 ______________/ \__ 2
+```
+
+### Maximum depth
+
+Maximum [depth](#depth) of unconfirmed transactions allowed in the mempool. Configured per node.
+
+### Minimum feerate
+
+If the mempool is full: the maximum of the [flat feerate](#flat-feerate) and the lowest [effective feerate](#effective-feerate) in the [mempool](#mempool).
+If the mempool is not full: the [flat feerate](#flat-feerate).
+
+### Required feerate
+
+For a given transaction with feerate `R`, the required feerate is computed as follows:
+
+1. Compute the [minimum feerate](#minimum-feerate) `M`.
+2. If the transaction is present in the [eviction filter](#eviction-filter), increase `M` by an extra [flat feerate](#flat-feerate) without changing `M.size`: `M = (M.fee + M.size*flat_feerate) / M.size`.
+3. The required absolute [fee](#fee) (not the _feerate_) is: `M * (M.size + R.size)` (see the worked sketch below).
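+
+The sketch below illustrates this arithmetic. It is a stand-alone example, not the actual `FeeRate` type added in this change: the `Rate` type, its method names, the rounding choice, and the concrete numbers are made up for illustration.
+
+```rust
+// Sketch only: a simplified (fee, size) pair used to illustrate the required-fee arithmetic.
+#[derive(Clone, Copy, Debug, PartialEq)]
+struct Rate {
+    fee: u64,
+    size: u64,
+}
+
+impl Rate {
+    /// Combine(f1, f2) = (fee1 + fee2) / (size1 + size2)
+    fn combine(self, other: Rate) -> Rate {
+        Rate { fee: self.fee + other.fee, size: self.size + other.size }
+    }
+
+    /// Increase the feerate by a flat feerate (expressed per byte), keeping the size unchanged.
+    fn increase_by_flat(self, flat_per_byte: u64) -> Rate {
+        Rate { fee: self.fee + self.size * flat_per_byte, size: self.size }
+    }
+
+    /// Required absolute fee for a new tx of `new_size` bytes: M * (M.size + new_size),
+    /// computed with integer arithmetic, rounding up.
+    fn required_fee(self, new_size: u64) -> u64 {
+        (self.fee * (self.size + new_size) + self.size - 1) / self.size
+    }
+}
+
+fn main() {
+    // Combining feerates preserves the total size: 100/200 combined with 300/400 is 400/600.
+    let combined = Rate { fee: 100, size: 200 }.combine(Rate { fee: 300, size: 400 });
+    assert_eq!((combined.fee, combined.size), (400, 600));
+
+    // Hypothetical numbers: the lowest effective feerate in a full mempool is 2000/1000,
+    // the flat feerate is 1 fee unit per byte, and the incoming tx is 500 bytes.
+    let min_feerate = Rate { fee: 2000, size: 1000 };
+
+    // The tx is present in the eviction filter, so bump M by the flat feerate
+    // without changing M.size: M becomes 3000/1000.
+    let m = min_feerate.increase_by_flat(1);
+
+    // The required fee covers both the evicted size and the new tx size at feerate M:
+    // 3 * (1000 + 500) = 4500.
+    assert_eq!(m.required_fee(500), 4500);
+}
+```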
+
+### Peerpool
+
+A small buffer of transactions maintained per peer, used to park transactions with insufficient feerate (waiting for higher-paying [children](#child)) or [orphans](#orphan) waiting for their [parents](#parent).
+
+Transactions in the peerpool are not relayed, and are dropped when the peer disconnects.
+
+### Mempool
+
+A data structure that keeps a collection of transactions that are valid for inclusion in a block,
+with reference to the current _tip_ and the corresponding Utreexo state.
+
+The mempool verifies incoming transactions and evicts low-priority transactions.
+The mempool always keeps transactions sorted in topological order.
+
+Mempools are synchronized among peers by sending the missing transactions to each other.
+Duplicates are silently rejected.
+
+### Eviction filter
+
+Bloom filter that records the IDs of evicted transactions and the IDs of the outputs they were spending.
+
+Given a valid transaction with ID `T` that spends a set of outputs with IDs `{C}`:
+
+1. If `T` is in the filter: the transaction is treated as previously evicted and an additional [flat feerate](#flat-feerate) is [required](#required-feerate).
+2. If `T` is not in the filter, but one of the output IDs `{C}` is in the filter: the transaction is treated as a double spend and rejected (see also [RBF](#rbf)).
+3. If neither `T` nor any of `{C}` are in the filter: the transaction is treated as a new one.
+
+If a false positive occurs at step 1, then either:
+a. an ordinary transaction is required to pay a higher fee than others, or
+b. a double-spend attempt after eviction is accidentally accepted by this node.
+
+If a false positive occurs at step 2, an ordinary transaction is rejected from this mempool.
+Other nodes have differently randomized bloom filter states, so they are likely to relay it.
+
+The filter is reset every 24 hours in order to keep the false positive rate low.
+
+## Procedures
+
+### Relaying transactions
+
+A node periodically announces a set of its transactions to all its neighbours by transmitting a list of recently received transaction IDs.
+
+When a list of IDs is received from a peer, the node detects the IDs that are missing from its mempool and remembers them (per peer).
+
+Periodically, the node sends out requests for transactions. It goes over the peers in round-robin order, collecting per-peer request lists
+and avoiding requesting a transaction it has already assigned to another peer. Then the requests are sent out to all peers.
+
+Transactions arrive in random order, so some of them may be [orphans](#orphan).
+Orphans are parked separately and indexed by the input Contract ID.
+When an appropriate output is added, the corresponding orphan transaction is attempted again.
+If it is still missing another parent, it is parked as an orphan again.
+
+### Accept to mempool
+
+**Transaction is validated statelessly per ZkVM rules.** The peer may be deprioritized or banned if it relays a statelessly invalid transaction.
+
+**Time bounds are checked against the tip block timestamp.**
+Transactions must use generous time bounds to account for clock differences.
+This simplifies validation logic, as we do not need to allow windowing or check for self-consistency of unconfirmed tx chains.
+
+**Transaction is checked against the [eviction filter](#eviction-filter).**
+If it is a double spend, it is rejected.
+If it is coming back after eviction, the [required feerate](#required-feerate) is increased by the [flat feerate](#flat-feerate). A sketch of this check is shown below.
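+
+The following toy sketch shows the shape of this check with the pair of filters described above (a "tx filter" for evicted transaction IDs and a "spends filter" for the outputs they spent). It is not the node's actual filter implementation: the `Bloom` type with two hash probes, the `FilterVerdict` enum, and the sizes and seeds are all illustrative assumptions.
+
+```rust
+// Sketch only: a toy stand-in for the eviction filter, not the real implementation.
+use std::collections::hash_map::DefaultHasher;
+use std::hash::{Hash, Hasher};
+
+/// A tiny Bloom-like filter: a bit vector with two seeded hash probes.
+struct Bloom {
+    bits: Vec<bool>,
+    seed: u64,
+}
+
+impl Bloom {
+    fn new(size: usize, seed: u64) -> Self {
+        Bloom { bits: vec![false; size], seed }
+    }
+    fn probes(&self, item: &[u8]) -> [usize; 2] {
+        let mut idx = [0usize; 2];
+        for k in 0..2u64 {
+            let mut h = DefaultHasher::new();
+            (self.seed, k, item).hash(&mut h);
+            idx[k as usize] = (h.finish() as usize) % self.bits.len();
+        }
+        idx
+    }
+    fn insert(&mut self, item: &[u8]) {
+        for &i in self.probes(item).iter() {
+            self.bits[i] = true;
+        }
+    }
+    fn maybe_contains(&self, item: &[u8]) -> bool {
+        self.probes(item).iter().all(|&i| self.bits[i])
+    }
+}
+
+/// Outcome of checking an incoming tx against the eviction filter.
+#[derive(Debug)]
+enum FilterVerdict {
+    New,                  // neither the tx nor the outputs it spends were seen before
+    PreviouslyEvicted,    // require an extra flat feerate
+    DoubleSpendOfEvicted, // reject outright
+}
+
+fn check(txfilter: &Bloom, spends: &Bloom, txid: &[u8], spent_ids: &[&[u8]]) -> FilterVerdict {
+    if txfilter.maybe_contains(txid) {
+        FilterVerdict::PreviouslyEvicted
+    } else if spent_ids.iter().any(|cid| spends.maybe_contains(cid)) {
+        FilterVerdict::DoubleSpendOfEvicted
+    } else {
+        FilterVerdict::New
+    }
+}
+
+fn main() {
+    let mut txfilter = Bloom::new(1 << 16, 42);
+    let mut spends = Bloom::new(1 << 16, 43);
+
+    // Record an eviction: remember the evicted tx ID and the output IDs it was spending.
+    let output_1: &[u8] = b"output-1";
+    txfilter.insert(b"tx-A");
+    spends.insert(output_1);
+
+    // The same tx coming back is treated as previously evicted (extra flat feerate required).
+    println!("{:?}", check(&txfilter, &spends, b"tx-A", &[output_1]));
+    // A different tx spending the same output is treated as a double spend and rejected.
+    println!("{:?}", check(&txfilter, &spends, b"tx-B", &[output_1]));
+}
+```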
+
+**If the transaction expires soon** (`tx.maxtime` is less than the tip timestamp + 24 hours), an additional [flat feerate](#flat-feerate) is required on top of the above.
+This is because such a transaction is more likely to expire and become invalid (unlike unbounded ones), while the network will have spent bandwidth on relaying it.
+
+**Transaction feerate is checked** against the [required feerate](#required-feerate).
+If it is insufficient, the transaction is [accepted to the peerpool](#accept-to-peerpool) or discarded.
+
+**Transaction is applied to the mempool state.**
+If any output it spends is already spent, the transaction is discarded.
+If any output it spends is missing, the transaction is [accepted to the peerpool](#accept-to-peerpool) or discarded.
+If the transaction's depth is higher than the [maximum depth](#maximum-depth), the transaction is rejected.
+
+Once the transaction is added to the mempool state, the [effective feerates](#effective-feerate) of its [ancestors](#parent) are recomputed.
+
+**If the mempool size exceeds the maximum size**, the transaction with the lowest effective feerate is evicted, together with all its [descendants](#child).
+The procedure repeats until the mempool size is lower than the maximum size.
+
+**Add to the [eviction filter](#eviction-filter)** the IDs of the evicted transactions and the IDs of the outputs they were spending.
+
+### Accept to peerpool
+
+If the transaction's depth is higher than the [maximum depth](#maximum-depth), reject the transaction.
+
+Check that the transaction spends its inputs correctly. If any output it spends is already spent or does not exist, reject the transaction.
+
+Recompute the effective feerates of the ancestors of the newly inserted transaction.
+If any of them passes the required feerate (considering the eviction filter and maxtime),
+move it, together with all its descendants whose effective feerate is higher than the parent's, to the mempool.
+
+While the peerpool size exceeds the maximum, remove the oldest (FIFO) transaction and all its descendants.
+
+## Notes
+
+The above design contains several decisions worth pointing out:
+
+1. **Double spends are not allowed at any level.** This is, obviously, a hard rule for the blockchain, but it also means that replace-by-fee (RBF) is not allowed in mempools. The rationale is that child-pays-for-parent (CPFP) needs to be supported anyway, and replacing confidential transactions requires updating all the blinding factors, which normally means another round of communication between the wallets. Also, handling fees when RBF happens across an eviction, and preventing the resulting subtle DoS scenarios, is trickier than simply disallowing RBF. **Do not** consider this design choice an endorsement of 0-confirmation transactions: those do not become more secure, because this policy is strictly focused on protecting the node itself and does not offer any security to other nodes.
+2. **Single-mode relay with peerpools.** Transactions are assumed to be simply relayed in topological order, one by one. There is no separate "package relay" for CPFP. Txs with insufficient fees are parked in a per-peer buffer until a higher-paying child arrives.
+3. **Discounted child feerate.** To simplify an [NP-complete task](https://freedom-to-tinker.com/2014/10/27/bitcoin-mining-is-np-hard/) of calculating an optimal subset of the tx graph, the effective feerate of a parent is computed by simply combining the feerates of its children. In case a child has several parents, we prevent overcounting by splitting its feerate among all the parents. In most cases this does not treat txs unfairly, and it allows adding up feerates in a straightforward manner (see the worked sketch after this list).
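+
+To make item 3 concrete, here is a small worked sketch of the discount rule. The numbers and helper names are made up; only the `Combine`/`Discount` arithmetic follows the definitions above.
+
+```rust
+// Sketch: a child's feerate is discounted across its two parents before combining,
+// so its fee is never counted twice when computing the parents' total feerates.
+type Rate = (u64, u64); // (fee, size)
+
+/// Combine(a, b) = (fee_a + fee_b) / (size_a + size_b)
+fn combine(a: Rate, b: Rate) -> Rate {
+    (a.0 + b.0, a.1 + b.1)
+}
+
+/// Discount(feerate, n) = floor(fee/n) / floor(size/n)
+fn discount(r: Rate, parts: u64) -> Rate {
+    (r.0 / parts, r.1 / parts)
+}
+
+fn per_byte(r: Rate) -> f64 {
+    r.0 as f64 / r.1 as f64
+}
+
+fn main() {
+    // Two low-fee parents and one high-fee child spending an output of each parent.
+    let parent_a: Rate = (100, 1000);
+    let parent_b: Rate = (200, 1000);
+    let child: Rate = (3000, 1000);
+
+    // The child has two unconfirmed parents, so its feerate is split in half
+    // before being combined into each parent's total feerate.
+    let child_share = discount(child, 2); // 1500 / 500
+
+    let total_a = combine(parent_a, child_share); // 1600 / 1500
+    let total_b = combine(parent_b, child_share); // 1700 / 1500
+
+    // CPFP in action: both parents' effective feerates rise above their self feerates.
+    println!("parent A: self {:.2}, effective {:.2}", per_byte(parent_a), per_byte(total_a));
+    println!("parent B: self {:.2}, effective {:.2}", per_byte(parent_b), per_byte(total_b));
+    assert!(per_byte(total_a) > per_byte(parent_a));
+    assert!(per_byte(total_b) > per_byte(parent_b));
+}
+```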
+ + + + diff --git a/zkvm/src/blockchain/errors.rs b/zkvm/src/blockchain/errors.rs index 393d3df30..0b117547c 100644 --- a/zkvm/src/blockchain/errors.rs +++ b/zkvm/src/blockchain/errors.rs @@ -32,3 +32,5 @@ pub enum BlockchainError { #[fail(display = "Utreexo operation failed.")] UtreexoError(UtreexoError), } + +// TODO: add mempool error enum. diff --git a/zkvm/src/blockchain/mempool.rs b/zkvm/src/blockchain/mempool.rs new file mode 100644 index 000000000..fcde48f3d --- /dev/null +++ b/zkvm/src/blockchain/mempool.rs @@ -0,0 +1,773 @@ +//! "Memory pool" is a data structure for managing _unconfirmed transactions_. +//! It decides which transactions to accept from other peers and relay further. +//! +//! Generally, transactions are sorted by _feerate_: the amount of fees paid per byte. +//! What if transaction does not pay high enough fee? At best it’s not going to be relayed anywhere. +//! At worst, it’s going to be relayed and dropped by some nodes, and relayed again by others, etc. +//! +//! This situation poses two problems: +//! 1. Denial of service risk: low-fee transactions that barely make it to the mempool +//! can get re-relayed many times over, consuming bandwidth of the network, +//! while the same fee is amortized over all the relay cycles, lowering the cost of attack. +//! 2. Stuck transactions: as nodes reject double-spend attempts, user may have to wait indefinitely +//! until his low-fee transaction is either completely forgotten or finally published in a block. +//! +//! There are two ways to address stuck transactions: +//! +//! 1. Replace the transaction with another one, with a higher fee. This is known as "replace-by-fee" (RBF). +//! This has a practical downside: one need to re-communicate blinding factors with the recipient when making an alternative tx. +//! So in this implementation we do not support RBF at all. +//! 2. Create a chained transaction that pays a higher fee to cover for itself and for the parent. +//! This is known as "child pays for parent" (CPFP). This is implemented here. +//! +//! The DoS risk is primarily limited by requiring transactions pay not only for themselves, but also for +//! the cost of relaying the transactions that are being evicted. The evicted transaction is now unlikely to be mined, +//! so the cost of relaying it must be covered by some other transaction. +//! +//! There is an additional problem, though. After the mempool is partially cleared by a newly published block, +//! the previously evicted transaction may come back and will be relayed once again. +//! At first glance, it is not a problem because someone's transaction that cause the eviction has already paid for the first relay. +//! However, for the creator of the transaction potentially unlimited number of relays comes at a constant (low) cost. +//! This means, the network may have to relay twice as much traffic due to such bouncing transactions, +//! and the actual users of the network may need to pay twice as much. +//! +//! To address this issue, we need to efficiently remember the evicted transaction. Then, to accept it again, +//! we require it to have the effective feerate = minimum feerate + flat feerate. If the transaction pays by itself, +//! it is fine to accept it again. The only transaction likely to return again and again is the one paying a very low fee, +//! so the bump by flat feerate would force it to be paid via CPFP (parked and wait for a higher-paying child). +//! +//! How do we "efficiently remember" evicted transactions? 
We will use a pair of bloom filters: one to +//! remember all the previously evicted tx IDs ("tx filter"), another one for all the outputs +//! that were spent by the evicted tx ("spends filter"). +//! When a new transaction attempts to spend an output marked in the filter: +//! 1. If the transaction also exists in the tx filter, then it is the resurrection of a previously evicted transaction, +//! and the usual rule with extra flat fee applies (low probablity squared that it's a false positive and we punish a legitimate tx). +//! 2. If the transaction does not exist in the tx filter, it is likely a double spend of a previously evicted tx, +//! and we outright reject it. There is a low chance (<1%) of false positive reported by the spends filter, but +//! if this node does not relay a legitimate transaction, other >99% nodes will since +//! all nodes initialize filters with random keys. +//! Both filters are reset every 24h. + +use core::cell::{Cell, RefCell}; +use core::cmp::{max, Ordering}; +use core::hash::Hash; +use core::mem; +use core::ops::{Deref, DerefMut}; + +use std::collections::HashMap; +use std::rc::{Rc, Weak}; +use std::time::Instant; + +use super::errors::BlockchainError; +use super::state::{check_tx_header, BlockchainState}; +use crate::merkle::Hasher; +use crate::tx::{Tx,TxHeader,TxLog,TxID}; +use crate::utreexo::{self, UtreexoError}; +use crate::ContractID; +use crate::FeeRate; +use crate::VerifiedTx; + +/// Mempool error conditions. +#[derive(Debug, Fail)] +pub enum MempoolError { + /// Occurs when a blockchain check failed. + #[fail(display = "Blockchain check failed.")] + BlockchainError(BlockchainError), + + /// Occurs when utreexo operation failed. + #[fail(display = "Utreexo operation failed.")] + UtreexoError(UtreexoError), + + /// Occurs when a transaction attempts to spend a non-existent unconfirmed output. + #[fail(display = "Transaction attempts to spend a non-existent unconfirmed output.")] + InvalidUnconfirmedOutput, + + /// Occurs when a transaction does not have a competitive fee and cannot be included in mempool. + #[fail( + display = "Transaction has low fee relative to all the other transactions in the mempool." + )] + LowFee, + + /// Occurs when a transaction spends too long chain of unconfirmed outputs, making it expensive to handle. + #[fail(display = "Transaction spends too long chain of unconfirmed outputs.")] + TooDeep, +} + +/// Trait for the items in the mempool. +#[derive(Clone,Debug)] +struct MempoolTx { + id: TxID, + rawtx: Tx, + utreexo_proofs: Vec, + txlog: TxLog, + feerate: FeeRate, +} + +impl MempoolTx { + fn header(&self) -> &TxHeader { + &self.rawtx.header + } +} + +/// Configuration of the mempool. +#[derive(Clone, Debug)] +pub struct Config { + /// Maximum size of mempool in bytes + pub max_size: usize, + + /// Maximum size of peerpool in bytes (to fit <100 transactions) + pub max_peerpool_size: usize, + + /// Maximum depth of unconfirmed transactions allowed. + /// 0 means node only allows spending confirmed outputs. + pub max_depth: usize, + + /// Minimum feerate required when the mempool is empty. + /// Transactions paying less than this are not relayed. + pub flat_feerate: FeeRate, +} + +/// Main API to the memory pool. +pub struct Mempool2 +where + PeerID: Hash + Eq + Clone, +{ + /// Current blockchain state. + state: BlockchainState, + + /// State of available outputs. 
+ utxos: UtxoMap, + + /// Sorted in topological order + txs: Vec>, + peerpools: HashMap, + current_size: usize, + config: Config, + hasher: Hasher, +} + +struct Peerpool { + utxos: UtxoMap, + lru: Vec>, + current_size: usize, +} + +/// Node in the tx graph. +#[derive(Debug)] +struct Node { + // Actual transaction object managed by the mempool. + tx: RefCell>, + // The first time the tx was seen + seen_at: Instant, + // Cached total feerate. None when it needs to be recomputed. + cached_total_feerate: Cell>, + // List of input statuses corresponding to tx inputs. + inputs: Vec>, + // List of output statuses corresponding to tx outputs. + outputs: Vec>, +} + +#[derive(Debug)] +enum Input { + /// Input is marked as confirmed - we don't really care where in utreexo it is. + /// This is also used by peerpool when spending an output from the main pool, to avoid mutating updates. + Confirmed, + /// Parent tx and an index in parent.outputs list. + Unconfirmed(Rc, Index, Depth), +} + +#[derive(Debug)] +enum Output { + /// Currently unoccupied output. + Unspent, + + /// Child transaction and an index in child.inputs list. + /// Normally, the weakref is dropped at the same time as the strong ref, during eviction. + Spent(Weak, Index), +} + +/// Map of the utxo statuses from the contract ID to the spent/unspent status +/// of utxo and a reference to the relevant tx in the mempool. +type UtxoMap = HashMap; +type Depth = usize; +type Index = usize; + +/// Status of the utxo cached by the mempool +enum UtxoStatus { + /// Output is unspent and exists in the utreexo accumulator + Confirmed, + + /// Output is unspent and is located in the i'th output in the given unconfirmed tx. + Unconfirmed(Rc, Index, Depth), + + /// Output is marked as spent + Spent, +} + +impl Mempool2 +where + PeerID: Hash + Eq + Clone, +{ + /// Creates a new mempool with the given size limit and the current timestamp. + pub fn new(state: BlockchainState, mut config: Config) -> Self { + config.flat_feerate = config.flat_feerate.normalize(); + Mempool2 { + state, + utxos: HashMap::new(), + ordered_txs: Vec::with_capacity(config.max_size / 2000), + peerpools: HashMap::new(), + current_size: 0, + config, + hasher: utreexo::utreexo_hasher(), + } + } + + /// The fee paid by an incoming tx must cover with the minimum feerate both + /// the size of the incoming tx and the size of the evicted tx: + /// + /// new_fee ≥ min_feerate * (evicted_size + new_size). + /// + /// This method returns the effective feerate of the lowest-priority tx, + /// which also contains the total size that must be accounted for. + pub fn min_feerate(&self) -> FeeRate { + let actual_min_feerate = self + .ordered_txs + .first() + .and_then(|r| r.borrow().as_ref().map(|x| x.effective_feerate())) + .unwrap_or_default(); + + if self.is_full() { + max(actual_min_feerate, self.config.flat_feerate) + } else { + self.config.flat_feerate + } + } + + /// The fee paid by an incoming tx must cover with the minimum feerate both + /// the size of the incoming tx and the size of the evicted tx: + /// + /// `new_fee > min_feerate * (evicted_size + new_size)` + /// + /// This method returns the effective feerate of the lowest-priority tx, + /// which also contains the total size that must be accounted for. 
+ /// + /// This is equivalent to: + /// + /// `new_fee*evicted_size > min_fee * (evicted_size + new_size)` + /// + pub fn is_feerate_sufficient(feerate: FeeRate, min_feerate: FeeRate) -> bool { + let mut evicted_size = min_feerate.size() as u64; + if evicted_size == 1 { + // special case when we have a normalized fee. + evicted_size = 0; + } + feerate.fee() * evicted_size >= min_feerate.fee() * (evicted_size + (feerate.size() as u64)) + } + + /// Adds a tx and evicts others, if needed. + pub fn try_append( + &mut self, + tx: MempoolTx, + peer_id: PeerID, + evicted_txs: &mut impl core::iter::Extend, + ) -> Result<(), MempoolError> { + // TODO: check if the tx must be applied to a peerpool, + // then add it there - it will otherwise fail in the main pool. + + if !Self::is_feerate_sufficient(tx.feerate(), self.min_feerate()) { + // TODO: try to add to peerpool + return Err(MempoolError::LowFee); + } + self.append(tx)?; + self.compact(evicted_txs); + Ok(()) + } + + /// Forgets peer and removes all associated parked transactions. + pub fn forget_peer(&mut self, peer_id: PeerID) { + self.peerpools.remove(&peer_id); + } + + /// Add a transaction to mempool. + /// Fails if the transaction attempts to spend a non-existent output. + /// Does not check the feerate and does not compact the mempool. + fn park_for_peer(&mut self, tx: MempoolTx, peer_id: PeerID) -> Result<(), MempoolError> { + check_tx_header( + &tx.verified_tx().header, + self.state.tip.timestamp_ms, + self.state.tip.version, + )?; + + let max_depth = self.config.max_depth; + let newtx = self + .peerpool_view(&peer_id) + .apply_tx(tx, max_depth, Instant::now())?; + + let pool = self.peerpools.entry(peer_id.clone()).or_default(); + + // Park the tx + pool.lru.push(newtx); + + // Find txs that become eligible for upgrade into the mempool + // and move them there. + + return Err(MempoolError::LowFee); + } + + /// Add a transaction to mempool. + /// Fails if the transaction attempts to spend a non-existent output. + /// Does not check the feerate and does not compact the mempool. + fn append(&mut self, tx: MempoolTx) -> Result<(), MempoolError> { + check_tx_header( + &tx.verified_tx().header, + self.state.tip.timestamp_ms, + self.state.tip.version, + )?; + + let tx_size = tx.feerate().size(); + let max_depth = self.config.max_depth; + let newtx = self + .mempool_view() + .apply_tx(tx, max_depth, Instant::now())?; + + self.ordered_txs.push(newtx); + self.order_transactions(); + + self.current_size += tx_size; + + Ok(()) + } + + /// Removes the lowest-feerate transactions to reduce the size of the mempool to the maximum allowed. + /// User may provide a buffer that implements Extend to collect and inspect all evicted transactions. + fn compact(&mut self, evicted_txs: &mut impl core::iter::Extend) { + while self.is_full() { + self.evict_lowest(evicted_txs); + } + } + + fn is_full(&self) -> bool { + self.current_size > self.config.max_size + } + + fn order_transactions(&mut self) { + self.ordered_txs + .sort_unstable_by(|a, b| Node::optional_cmp(&a.borrow(), &b.borrow())); + } + + /// Evicts the lowest tx and returns true if the mempool needs to be re-sorted. + /// If we evict a single tx or a simple chain of parents and children, then this returns false. + /// However, if there is a non-trivial graph, some adjacent tx may need their feerates recomputed, + /// so we need to re-sort the list. 
+ fn evict_lowest(&mut self, evicted_txs: &mut impl core::iter::Extend) { + if self.ordered_txs.len() == 0 { + return; + } + + let lowest = self.ordered_txs.remove(0); + let (needs_reorder, total_evicted) = self.mempool_view().evict_tx(&lowest, evicted_txs); + self.current_size -= total_evicted; + + if needs_reorder { + self.order_transactions(); + } + } + + fn mempool_view(&mut self) -> MempoolView<'_> { + MempoolView { + map: &mut self.utxos, + utreexo: &self.state.utreexo, + hasher: &self.hasher, + } + } + + fn peerpool_view(&mut self, peer_id: &PeerID) -> PeerView<'_> { + let pool = self.peerpools.entry(peer_id.clone()).or_default(); + PeerView { + peermap: &mut pool.utxos, + mainmap: &self.utxos, + utreexo: &self.state.utreexo, + hasher: &self.hasher, + } + } +} + +impl Default for Peerpool { + fn default() -> Self { + Peerpool { + utxos: UtxoMap::new(), + lru: Vec::new(), + current_size: 0, + } + } +} + +impl Node { + + fn self_feerate(&self) -> FeeRate { + self.tx.feerate() + } + + fn effective_feerate(&self) -> FeeRate { + core::cmp::max(self.self_feerate(), self.total_feerate()) + } + + fn total_feerate(&self) -> FeeRate { + self.cached_total_feerate.get().unwrap_or_else(|| { + let fr = self.compute_total_feerate(); + self.cached_total_feerate.set(Some(fr)); + fr + }) + } + + /// The discount is a simplification that allows us to recursively add up descendants' feerates + /// without opening a risk of overcounting in case of diamond-shaped graphs. + /// This comes with a slight unfairness to users where a child of two parents + /// is not contributing fully to the lower-fee parent. + /// However, in common case this discount has no effect since the child spends only one parent. + fn discounted_effective_feerate(&self) -> FeeRate { + let unconfirmed_parents = self + .inputs + .iter() + .filter(|i| { + if let Input::Unconfirmed(_, _, _) = i { + true + } else { + false + } + }) + .count(); + self.effective_feerate().discount(unconfirmed_parents) + } + + fn compute_total_feerate(&self) -> FeeRate { + // go through all children and get their effective feerate and divide it by the number of parents + let mut result_feerate = self.self_feerate(); + for output in self.outputs.iter() { + if let Output::Spent(childref, _) = output { + if let Some(maybe_child) = childref.upgrade() { + if let Some(childtx) = maybe_child.borrow().as_ref() { + result_feerate = + result_feerate.combine(childtx.discounted_effective_feerate()); + } + } + } + } + result_feerate + } + + fn invalidate_cached_feerate(&self) { + self.cached_total_feerate.set(None); + for inp in self.inputs.iter() { + if let Input::Unconfirmed(srcref, _, _) = inp { + if let Some(srctx) = srcref.borrow().as_ref() { + srctx.invalidate_cached_feerate(); + } + } + } + } + + // Compares tx priorities. The Ordering::Less indicates that the transaction has lower priority. + fn cmp(&self, other: &Self) -> Ordering { + self.effective_feerate() + .cmp(&other.effective_feerate()) + .then_with(|| { + // newer txs -> lower priority + self.seen_at.cmp(&other.seen_at).reverse() + }) + } + + // Comparing optional nodes to account for eviction. + // Evicted nodes naturally have lower priority. + fn optional_cmp(a: &Option, b: &Option) -> Ordering { + match (a, b) { + (None, None) => Ordering::Equal, + (Some(_), None) => Ordering::Greater, + (None, Some(_)) => Ordering::Less, + (Some(a), Some(b)) => a.cmp(b), + } + } +} + +trait UtxoViewTrait { + /// Returns the status of the utxo for the given contract ID and a utreexo proof. 
+ /// If the utxo status is not cached within the view, + /// utreexo proof is used to retrieve it from utreexo. + fn get( + &self, + contract_id: &ContractID, + proof: &utreexo::Proof, + ) -> Result; + + /// Stores the status of the utxo in the view. + fn set(&mut self, contract_id: ContractID, status: UtxoStatus); + + /// Removes the stored status + fn remove(&mut self, contract_id: &ContractID); + + /// Attempts to apply transaction changes + fn apply_tx( + &mut self, + tx: Tx, + max_depth: Depth, + seen_at: Instant, + ) -> Result, MempoolError> { + let mut utreexo_proofs = tx.utreexo_proofs().iter(); + + // Start by collecting the inputs statuses and failing early if any output is spent or does not exist. + // Important: do not perform any mutations until we check all of them. + let inputs = tx + .txlog() + .inputs() + .map(|cid| { + let utxoproof = utreexo_proofs.next().ok_or(UtreexoError::InvalidProof)?; + + match self.get(cid, utxoproof)? { + UtxoStatus::Confirmed => Ok(Input::Confirmed), + UtxoStatus::Unconfirmed(srctx, i, depth) => { + Ok(Input::Unconfirmed(srctx, i, depth)) + } + UtxoStatus::Spent => Err(MempoolError::InvalidUnconfirmedOutput), + } + }) + .collect::, _>>()?; + + // If this is 0, then we only spend confirmed outputs. + // unconfirmed ones start with 1. + let max_spent_depth = inputs + .iter() + .map(|inp| { + if let Input::Unconfirmed(_src, _i, depth) = inp { + *depth + } else { + 0 + } + }) + .max() + .unwrap_or(0); + + if max_spent_depth > max_depth { + return Err(MempoolError::TooDeep); + } + + let outputs = tx + .txlog() + .outputs() + .map(|_| Output::Unspent) + .collect::>(); + + let new_ref = Rc::new(Node { + seen_at, + cached_total_feerate: Cell::new(Some(tx.feerate)), + inputs, + outputs, + tx, + }); + + { + // we cannot have &Node before we pack it into a Ref, + // so we borrow it afterwards. + let _dummy = new_ref.borrow(); + let new_node = _dummy + .as_ref() + .expect("we just created it above, so it's safe to unwrap"); + + // At this point the spending was checked, so we can do mutating changes. + + // 1. link parents to the children (if the weakref to the parent is not nil) + // 2. mark spent utxos as spent + for (input_index, (input_status, cid)) in new_node + .inputs + .iter() + .zip(new_node.tx.txlog().inputs()) + .enumerate() + { + if let Input::Unconfirmed(srcref, output_index, _depth) = input_status { + if let Some(srctx) = srcref.borrow_mut().as_mut() { + srctx.outputs[*output_index] = + Output::Spent(Rc::downgrade(&new_ref), input_index); + srctx.invalidate_cached_feerate(); + } + } + self.set(*cid, UtxoStatus::Spent); + } + + // 3. add outputs as unspent. + for (i, cid) in new_node.tx.txlog().outputs().map(|c| c.id()).enumerate() { + self.set( + cid, + UtxoStatus::Unconfirmed(new_ref.clone(), i, max_spent_depth + 1), + ); + } + } + + Ok(new_ref) + } + + /// Evicts tx and its subchildren recursively, updating the utxomap accordingly. + /// Returns a flag indicating if we need to reorder txs, and the total number of bytes evicted. + fn evict_tx( + &mut self, + txref: &Rc, + evicted_txs: &mut impl core::iter::Extend, + ) -> (bool, usize) { + // 1. immediately mark the node as evicted, taking its Tx out of it. + // 2. for each input: restore utxos as unspent. + // 3. for each input: if unconfirmed and non-evicted, invalidate feerate and set the reorder flag. + // 4. recursively evict children. + // 5. for each output: remove utxo records. 
+ + // TODO: if we evict a tx that's depended upon by some child parked in the peerpool - + // maybe put it there, or update the peerpool? + + let node: Node = match txref.borrow_mut().take() { + Some(node) => node, + None => return (false, 0), // node is already evicted. + }; + + let mut should_reorder = false; + + for (inp, cid) in node.inputs.into_iter().zip(node.tx.txlog().inputs()) { + match inp { + Input::Confirmed => { + // remove the Spent status in the view that shadowed the Utreexo state + self.remove(cid); + } + Input::Unconfirmed(srcref, i, depth) => { + if let Some(src) = srcref.borrow_mut().as_mut() { + should_reorder = true; + src.invalidate_cached_feerate(); + src.outputs[i] = Output::Unspent; + } + self.set(*cid, UtxoStatus::Unconfirmed(srcref, i, depth)); + } + } + } + + let mut evicted_size = node.tx.feerate().size(); + + for (out, cid) in node + .outputs + .into_iter() + .zip(node.tx.txlog().outputs().map(|c| c.id())) + { + if let Output::Spent(childweakref, _) = out { + if let Some(childref) = childweakref.upgrade() { + let (reorder, size) = self.evict_tx(&childref, evicted_txs); + should_reorder = should_reorder || reorder; + evicted_size += size; + } + } + // the output was marked as unspent during eviction of the child, and we simply remove it here. + self.remove(&cid); + } + evicted_txs.extend(Some(node.tx)); + (should_reorder, evicted_size) + } +} + +/// View into the state of utxos. +struct MempoolView<'a> { + map: &'a mut UtxoMap, + utreexo: &'a utreexo::Forest, + hasher: &'a Hasher, +} + +/// Peer's view has its own R/W map backed by the readonly main map. +/// The peer's map shadows the main mempool map. +struct PeerView<'a> { + peermap: &'a mut UtxoMap, + mainmap: &'a UtxoMap, + utreexo: &'a utreexo::Forest, + hasher: &'a Hasher, +} + +impl<'a> UtxoViewTrait for MempoolView<'a> { + fn get( + &self, + contract_id: &ContractID, + proof: &utreexo::Proof, + ) -> Result { + if let Some(status) = self.map.get(contract_id) { + Ok(status.clone()) + } else if let utreexo::Proof::Committed(path) = proof { + self.utreexo.verify(contract_id, path, &self.hasher)?; + Ok(UtxoStatus::Confirmed) + } else { + Err(MempoolError::InvalidUnconfirmedOutput) + } + } + + fn remove(&mut self, contract_id: &ContractID) { + self.map.remove(contract_id); + } + + /// Stores the status of the utxo in the view. + fn set(&mut self, contract_id: ContractID, status: UtxoStatus) { + // if we mark the unconfirmed output as spent, simply remove it from the map to avoid wasting space. + // this way we'll only store spent flags for confirmed and unspent flags for unconfirmed, while + // forgetting all intermediately consumed outputs. 
+ if let UtxoStatus::Spent = status { + if let Some(UtxoStatus::Unconfirmed(_, _, _)) = self.map.get(&contract_id) { + self.map.remove(&contract_id); + return; + } + } + self.map.insert(contract_id, status); + } +} + +impl<'a> UtxoViewTrait for PeerView<'a> { + fn get( + &self, + contract_id: &ContractID, + proof: &utreexo::Proof, + ) -> Result { + if let Some(status) = self.peermap.get(contract_id) { + Ok(status.clone()) + } else if let Some(status) = self.mainmap.get(contract_id) { + // treat mainpool outputs as confirmed so we don't modify them + Ok(match status { + UtxoStatus::Confirmed => UtxoStatus::Confirmed, + UtxoStatus::Spent => UtxoStatus::Spent, + UtxoStatus::Unconfirmed(_txref, _i, _d) => UtxoStatus::Confirmed, + }) + } else if let utreexo::Proof::Committed(path) = proof { + self.utreexo.verify(contract_id, path, &self.hasher)?; + Ok(UtxoStatus::Confirmed) + } else { + Err(MempoolError::InvalidUnconfirmedOutput) + } + } + + fn remove(&mut self, contract_id: &ContractID) { + self.peermap.remove(contract_id); + } + + fn set(&mut self, contract_id: ContractID, status: UtxoStatus) { + self.peermap.insert(contract_id, status); + } +} + +// We are implementing the Clone manually because `#[derive(Clone)]` adds Clone bounds on `Tx` +impl Clone for UtxoStatus { + fn clone(&self) -> Self { + match self { + UtxoStatus::Confirmed => UtxoStatus::Confirmed, + UtxoStatus::Spent => UtxoStatus::Spent, + UtxoStatus::Unconfirmed(txref, i, d) => UtxoStatus::Unconfirmed(txref.clone(), *i, *d), + } + } +} + +impl From for MempoolError { + fn from(err: BlockchainError) -> Self { + MempoolError::BlockchainError(err) + } +} + +impl From for MempoolError { + fn from(err: UtreexoError) -> Self { + MempoolError::UtreexoError(err) + } +} diff --git a/zkvm/src/blockchain/mod.rs b/zkvm/src/blockchain/mod.rs index e47154424..9753968e7 100644 --- a/zkvm/src/blockchain/mod.rs +++ b/zkvm/src/blockchain/mod.rs @@ -2,6 +2,7 @@ mod block; mod errors; +mod mempool; mod state; #[cfg(test)] @@ -9,4 +10,5 @@ mod tests; pub use self::block::*; pub use self::errors::*; +pub use self::mempool::*; pub use self::state::*; diff --git a/zkvm/src/blockchain/state.rs b/zkvm/src/blockchain/state.rs index 833e30376..ad41d490b 100644 --- a/zkvm/src/blockchain/state.rs +++ b/zkvm/src/blockchain/state.rs @@ -263,7 +263,7 @@ where } /// Checks the tx header for consistency with the block header. -fn check_tx_header( +pub fn check_tx_header( tx_header: &TxHeader, timestamp_ms: u64, block_version: u64, diff --git a/zkvm/src/fees.rs b/zkvm/src/fees.rs index 43cba31ac..9a53110f7 100644 --- a/zkvm/src/fees.rs +++ b/zkvm/src/fees.rs @@ -12,7 +12,7 @@ pub struct CheckedFee { } /// Fee rate is a ratio of the transaction fee to its size. -#[derive(Copy, Clone, Debug, Serialize, Deserialize)] +#[derive(Copy, Clone, Default, Debug, Serialize, Deserialize)] pub struct FeeRate { fee: u64, size: u64, @@ -24,6 +24,11 @@ pub fn fee_flavor() -> Scalar { } impl FeeRate { + /// Creates a new zero feerate + pub fn zero() -> Self { + FeeRate::default() + } + /// Creates a new fee rate from a given fee and size. pub fn new(fee: CheckedFee, size: usize) -> Self { FeeRate { @@ -33,15 +38,47 @@ impl FeeRate { } /// Combines the fee rate with another fee rate, adding up the fees and sizes. - pub fn combine(&self, other: FeeRate) -> Self { + pub fn combine(self, other: FeeRate) -> Self { FeeRate { fee: self.fee + other.fee, size: self.size + other.size, } } + /// Normalizes feerate by dividing the fee by size rounding it down. 
+ /// Yields a fee amount per 1 byte of size. + pub fn normalize(mut self) -> Self { + self.fee /= self.size; + self.size = 1; + self + } + + /// Increases the feerate by the given feerate, without changing the underlying size. + /// (Meaning the feerate added in normalized form, as amount of fee per 1 byte.) + pub fn increase_by(mut self, other: Self) -> Self { + self.fee += (other.fee * self.size) / other.size; + self + } + + /// Multiplies the feerate and returns a normalized feerate (with size=1). + pub fn mul(mut self, f: f64) -> Self { + self.fee = ((self.fee as f64 * f) / self.size as f64).round() as u64; + self.size = 1; + self + } + + /// Discounts the fee and the size by a given factor. + /// E.g. feerate 100/1200 discounted by 2 gives 50/600. + /// Same ratio, but lower weight when combined with other feerates. + pub fn discount(mut self, parts: usize) -> Self { + let parts = parts as u64; + self.fee /= parts; + self.size /= parts; + self + } + /// Converts the fee rate to a floating point number. - pub fn to_f64(&self) -> f64 { + pub fn to_f64(self) -> f64 { (self.fee as f64) / (self.size as f64) } diff --git a/zkvm/src/lib.rs b/zkvm/src/lib.rs index 87eb0e86e..49b510e42 100644 --- a/zkvm/src/lib.rs +++ b/zkvm/src/lib.rs @@ -36,6 +36,7 @@ pub use self::constraints::{Commitment, CommitmentWitness, Constraint, Expressio pub use self::contract::{Anchor, Contract, ContractID, PortableItem}; pub use self::encoding::Encodable; pub use self::errors::VMError; +pub use self::fees::FeeRate; pub use self::merkle::{Hash, MerkleItem, MerkleTree}; pub use self::ops::{Instruction, Opcode}; pub use self::predicate::{Predicate, PredicateTree}; diff --git a/zkvm/src/merkle.rs b/zkvm/src/merkle.rs index 2a45459a7..2c220ccb1 100644 --- a/zkvm/src/merkle.rs +++ b/zkvm/src/merkle.rs @@ -392,7 +392,7 @@ impl Encodable for Path { } } -/// Simialr to Path, but does not contain neighbors - only left/right directions +/// Similar to Path, but does not contain neighbors - only left/right directions /// as indicated by the bits in the `position`. #[derive(Copy, Clone, PartialEq, Debug)] pub struct Directions { diff --git a/zkvm/src/program.rs b/zkvm/src/program.rs index 62dab0cc0..c3f198262 100644 --- a/zkvm/src/program.rs +++ b/zkvm/src/program.rs @@ -175,8 +175,9 @@ impl Program { def_op!(issue, Issue, "issue"); def_op!(borrow, Borrow, "borrow"); def_op!(retire, Retire, "retire"); - def_op!(cloak, Cloak, usize, usize, "cloak:m:n"); + def_op!(fee, Fee, "fee"); + def_op!(input, Input, "input"); def_op!(output, Output, usize, "output:k"); def_op!(contract, Contract, usize, "contract:k"); diff --git a/zkvm/src/tx.rs b/zkvm/src/tx.rs index d96cade87..333ddb60c 100644 --- a/zkvm/src/tx.rs +++ b/zkvm/src/tx.rs @@ -83,7 +83,7 @@ pub struct UnsignedTx { } /// Instance of a transaction that contains all necessary data to validate it. 
-#[derive(Clone, Serialize, Deserialize)] +#[derive(Clone, Debug, Serialize, Deserialize)] pub struct Tx { /// Header metadata pub header: TxHeader, @@ -268,6 +268,22 @@ impl TxLog { pub fn push(&mut self, item: TxEntry) { self.0.push(item); } + + /// Iterator over the input entries + pub fn inputs(&self) -> impl Iterator { + self.0.iter().filter_map(|entry| match entry { + TxEntry::Input(contract_id) => Some(contract_id), + _ => None, + }) + } + + /// Iterator over the output entries + pub fn outputs(&self) -> impl Iterator { + self.0.iter().filter_map(|entry| match entry { + TxEntry::Output(contract) => Some(contract), + _ => None, + }) + } } impl From> for TxLog { diff --git a/zkvm/src/utreexo/forest.rs b/zkvm/src/utreexo/forest.rs index 3c375b5b9..db63cf9df 100644 --- a/zkvm/src/utreexo/forest.rs +++ b/zkvm/src/utreexo/forest.rs @@ -82,6 +82,25 @@ impl Forest { .fold(0u64, |total, (level, _)| total + (1 << level)) } + /// Verifies that the given item and a path belong to the forest. + pub fn verify( + &self, + item: &M, + path: &Path, + hasher: &Hasher, + ) -> Result<(), UtreexoError> { + let computed_root = path.compute_root(item, hasher); + if let Some((_i, level)) = + find_root(self.roots_iter().map(|(level, _)| level), path.position) + { + // unwrap won't fail because `find_root` returns level for the actually existing root. + if self.roots[level].unwrap() == computed_root { + return Ok(()); + } + } + Err(UtreexoError::InvalidProof) + } + /// Lets use modify the utreexo and yields a new state of the utreexo, /// along with a catchup structure. pub fn work_forest(&self) -> WorkForest {