Discussion:
[bitcoin-dev] A Better MMR Definition
Peter Todd via bitcoin-dev
2017-02-23 01:15:06 UTC
Permalink
Reposting something that came up recently in a private discussion with some
academics:

Concretely, let's define a prunable MMR with the following grammar. This
definition is an improvement on what's in the python-proofmarshal by committing
to the number of items in the tree implicitly; an obvious max-log2(n)-sized
proof-of-tree-size can be obtained by following the right-most nodes:

Maybe(T) := UNPRUNED <T> | PRUNED <Commitment(T)>

FullNode(0) := <Value>
FullNode(n) := <Maybe(FullNode(n-1))> <Maybe(FullNode(n-1))>

PartialNode(0) := SOME <FullNode(0)> | NONE
PartialNode(n) := <Maybe(FullNode(n-1))> <Maybe(PartialNode(n-1))>

MMR := FULL <N> <FullNode(n)> | PARTIAL <N> <PartialNode(n)>

Basically we define it in four parts. First we define Maybe(T) to represent
pruned (hash-only) and unpruned data. Secondly we define full nodes within 2^n
sized trees. Third we define partial nodes. And finally we define the MMR
itself as being either a full or partial node.
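
For concreteness, here is one way the grammar above might be transcribed into
Python; the class and field names are illustrative, not taken from
python-proofmarshal:

from dataclasses import dataclass
from typing import Optional, Union

# Maybe(T) := UNPRUNED <T> | PRUNED <Commitment(T)>
@dataclass
class Unpruned:
    node: "Node"

@dataclass
class Pruned:
    commitment: bytes              # hash of the pruned-away subtree

Maybe = Union[Unpruned, Pruned]

# FullNode(0) := <Value>
# FullNode(n) := <Maybe(FullNode(n-1))> <Maybe(FullNode(n-1))>
@dataclass
class FullLeaf:
    value: bytes

@dataclass
class FullInner:
    left: Maybe
    right: Maybe

# PartialNode(0) := SOME <FullNode(0)> | NONE
# PartialNode(n) := <Maybe(FullNode(n-1))> <Maybe(PartialNode(n-1))>
@dataclass
class PartialLeaf:
    value: Optional[bytes]         # SOME <value> or NONE

@dataclass
class PartialInner:
    left: Maybe                    # complete 2^(n-1)-sized subtree
    right: Maybe                   # ragged right edge continues down here

Node = Union[FullLeaf, FullInner, PartialLeaf, PartialInner]

# MMR := FULL <N> <FullNode(n)> | PARTIAL <N> <PartialNode(n)>
@dataclass
class MMR:
    size: int                      # N, the number of items committed to
    root: Node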

First of all, with pruning we can define a rule that if any operation (other
than checking commitment hashes) attempts to access pruned data, it should
immediately fail. In particular, no operation should be able to determine if
data is or isn't pruned. Equally, note how an implementation can keep track of
what data was accessed during any given operation, and prune the rest, which
means a proof is just the parts of the data structure accessed during one or
more operations.
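
A toy illustration of that access-tracking rule (my own sketch, not code from
python-proofmarshal): flag every node an operation touches, then collapse
everything untouched down to its bare hash; what remains is exactly the proof
for the operations performed.

import hashlib
from dataclasses import dataclass, field
from typing import Optional

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

@dataclass
class Node:
    digest: bytes
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: Optional[bytes] = None
    accessed: bool = field(default=False, repr=False)

def leaf(value: bytes) -> Node:
    return Node(digest=H(b"leaf", value), value=value)

def inner(l: Node, r: Node) -> Node:
    return Node(digest=H(b"inner", l.digest, r.digest), left=l, right=r)

class PrunedError(Exception):
    """Any operation (other than checking hashes) that hits pruned data fails."""

def get(node: Node, index: int, depth: int) -> bytes:
    """Fetch leaf `index` from a tree of height `depth`, flagging nodes touched."""
    node.accessed = True
    if depth == 0:
        if node.value is None:
            raise PrunedError("leaf is pruned")
        return node.value
    child = node.right if (index >> (depth - 1)) & 1 else node.left
    if child is None:
        raise PrunedError("subtree is pruned")
    return get(child, index, depth - 1)

def prune_unaccessed(node: Node) -> Node:
    """Collapse everything that wasn't accessed to its hash; the result is the
    proof for whatever operations were performed on this copy of the tree."""
    if not node.accessed:
        return Node(digest=node.digest)
    return Node(digest=node.digest,
                left=prune_unaccessed(node.left) if node.left else None,
                right=prune_unaccessed(node.right) if node.right else None,
                value=node.value)

# Usage: read one leaf, then prune; the pruned tree proves that read.
tree = inner(inner(leaf(b"a"), leaf(b"b")), inner(leaf(b"c"), leaf(b"d")))
assert get(tree, 2, 2) == b"c"
proof = prune_unaccessed(tree)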

With that, notice how proving the soundness of the proofs becomes trivial: if
validation is deterministic, it is obviously impossible to construct two
different proofs that prove contradictory statements, because a proof is simply
part of the data structure itself. Contradiction would imply that the two
proofs are different, but that's easily rejected by simply checking the hash of
the data.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-23 03:07:08 UTC
Permalink
On Wed, Feb 22, 2017 at 5:15 PM, Peter Todd via bitcoin-dev <
Post by Peter Todd via bitcoin-dev
With that, notice how proving the soundness of the proofs becomes trivial: if
validation is deterministic, it is obviously impossible to construct two
different proofs that prove contradictory statements, because a proof is simply
part of the data structure itself. Contradiction would imply that the two
proofs are different, but that's easily rejected by simply checking the hash of
the data.
My code works this way. Proofs are serialization of a subset of the tree,
and to validate a proof you ask a single function whether a particular
value is included in that tree subset, and it answers yes or no, so
obviously it's impossible for a single value to both validate and not
validate. The proof code was quite terrifying before I made this change
(which I did on your suggestion), and it's much cleaner and simpler now. It
also in principle supports compact proofs of multiple inclusions and
exclusions in the same serialization of a subset of the tree because the
upper branches won't have to be repeated. I haven't written code for
generating those, but the validation code will happily accept them.

I'm not sure what you mean by MMRs though. Are you talking about MMRs where
each mountain is a set of diffs to the old things and are periodically
consolidated? Or do later mountains refer to internals of earlier ones? Or
do they have 'maybe' values which mean that the earlier mountain should be
referred to? Are these patricia tries or something flatter and more fixed
depth?

My code doesn't keep track of tree size, by the way. It would be trivial to
add that functionality to the library, but including it in the hashing
creates complexity and doesn't seem to have any benefit over sending that
data in a side channel.
Peter Todd via bitcoin-dev
2017-02-23 07:41:37 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
On Wed, Feb 22, 2017 at 5:15 PM, Peter Todd via bitcoin-dev <
Post by Peter Todd via bitcoin-dev
With that, notice how proving the soundness of the proofs becomes trivial: if
validation is deterministic, it is obviously impossible to construct two
different proofs that prove contradictory statements, because a proof is simply
part of the data structure itself. Contradiction would imply that the two
proofs are different, but that's easily rejected by simply checking the hash of
the data.
My code works this way. Proofs are serialization of a subset of the tree,
and to validate a proof you ask a single function whether a particular
value is included in that tree subset, and it answers yes or no, so
obviously it's impossible for a single value to both validate and not
validate. The proof code was quite terrifying before I made this change
(which I did on your suggestion), and it's much cleaner and simpler now. It
also in principle supports compact proofs of multiple inclusions and
exclusions in the same serialization of a subset of the tree because the
upper branches won't have to be repeated. I haven't written code for
generating those, but the validation code will happily accept them.
That's an improvement, but I think we can do even better if we think of missing
pruned data as analogous to virtual memory: pruned data is the same as a page
that has been swapped to disk, with the magical property that hashing allows us
to swap it back in from an untrusted source.

Thus a proof should actually be whatever data we expect our counterparty to
have flushed, ranging from none at all, to 100% (modulo a root hash). An
implementation should then do operations as normal, using parts of the proof on
an as-needed basis where pruned data is encountered.

Thus if you have a key-value map and do a get() operation, you'd expect the
proof to *not* be what the get operates on, but rather be a *context* argument
to the get() operation. The other way around is actually an example of doing
computations on untrusted data, and bad API design!
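
A minimal sketch of that "proof as context" API shape (hypothetical, not either
implementation's actual interface): get() walks our own commitment, consults
the untrusted context only where pruned data is encountered, and checks
everything it is handed against the hashes it already trusts.

import hashlib
from typing import Dict, Tuple

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def leaf_hash(value: bytes) -> bytes:
    return H(b"leaf", value)

def inner_hash(left: bytes, right: bytes) -> bytes:
    return H(b"inner", left, right)

def get(root: bytes, index: int, depth: int,
        inner_ctx: Dict[bytes, Tuple[bytes, bytes]],
        leaf_ctx: Dict[bytes, bytes]) -> bytes:
    """Look up leaf `index` given only our own commitment `root`; the proof is
    the context (inner_ctx/leaf_ctx), verified as it is consumed."""
    node = root
    for level in reversed(range(depth)):
        if node not in inner_ctx:
            raise KeyError("operation touched data the proof did not supply")
        left, right = inner_ctx[node]
        assert inner_hash(left, right) == node    # untrusted data must match
        node = right if (index >> level) & 1 else left
    value = leaf_ctx[node]
    assert leaf_hash(value) == node
    return value

# Usage: the prover serializes just the path; the verifier's state is `root`.
lh = [leaf_hash(v) for v in (b"a", b"b", b"c", b"d")]
n01, n23 = inner_hash(lh[0], lh[1]), inner_hash(lh[2], lh[3])
root = inner_hash(n01, n23)
assert get(root, 2, 2, {root: (n01, n23), n23: (lh[2], lh[3])}, {lh[2]: b"c"}) == b"c"
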
Post by Bram Cohen via bitcoin-dev
I'm not sure what you mean by MMRs though. Are you talking about MMRs where
each mountain is a set of diffs to the old things and are periodically
consolidated? Or do later mountains refer to internals of earlier ones? Or
do they have 'maybe' values which mean that the earlier mountain should be
referred to? Are these patricia tries or something flatter and more fixed
depth?
I'm talking about these MMR's: https://github.com/proofchains/python-proofmarshal/blob/master/proofmarshal/mmr.py

Notably I'm talking about an insertion ordered list, indexed by position, that
supports append and update operations, but *not* insertions; this is different
than what you've recently published re: UTXO commitments. That's a full
key-value map, something MMR's are deliberately not doing.

Draw out a MMR based on the formal definition you're replying to and you'll
see the new structure.
Post by Bram Cohen via bitcoin-dev
My code doesn't keep track of tree size, by the way. It would be trivial to
add that functionality to the library, and including it in the hashing
creates complexity and doesn't seem to have any benefit over sending that
data in a side channel.
Like I say above, you're solving a different problem than MMR's solve.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Chris Priest via bitcoin-dev
2017-02-23 17:53:58 UTC
Permalink
On 2/22/17, Peter Todd via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Reposting something that came up recently in a private discussion with some
Concretely, let's define a prunable MMR with the following grammar. This
definition is an improvement on whats in the python-proofmarshal by committing
to the number of items in the tree implicitly; an obvious max-log2(n)-sized
Maybe(T) := UNPRUNED <T> | PRUNED <Commitment(T)>
FullNode(0) := <Value>
FullNode(n) := <Maybe(FullNode(n-1)> <Maybe(FullNode(n-1))>
PartialNode(0) := SOME <FullNode(0)> | NONE
PartialNode(n) := <Maybe(FullNode(n-1))> <Maybe(PartialNode(n-1))>
MMR := FULL <N> <FullNode(n)> | PARTIAL <N> <PartialNode(n)>
Basically we define it in four parts. First we define Maybe(T) to represent
pruned and unpruned (hash only) data. Secondly we define full nodes within 2^n
sized trees. Third we define partial nodes. And finally we define the MMR
itself as being either a full or partial node.
First of all, with pruning we can define a rule that if any operation (other
than checking commitment hashes) attempts to access pruned data, it should
immediately fail. In particular, no operation should be able to determine if
data is or isn't pruned. Equally, note how an implementation can keep track of
what data was accessed during any given operation, and prune the rest, which
means a proof is just the parts of the data structure accessed during one or
more operations.
With that, notice how proving the soundness of the proofs becomes trivial: if
validation is deterministic, it is obviously impossible to construct two
different proofs that prove contradictory statements, because a proof is simply
part of the data structure itself. Contradiction would imply that the two
proofs are different, but that's easily rejected by simply checking the hash of
the data.
--
What problem does this try to solve, and what does it have to do with bitcoin?
Peter Todd via bitcoin-dev
2017-02-23 18:19:29 UTC
Permalink
Post by Chris Priest via bitcoin-dev
On 2/22/17, Peter Todd via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Reposting something that came up recently in a private discussion with some
Concretely, let's define a prunable MMR with the following grammar. This
definition is an improvement on whats in the python-proofmarshal by committing
to the number of items in the tree implicitly; an obvious max-log2(n)-sized
What problem does this try to solve, and what does it have to do with bitcoin?
See the discussion on TXO commitments for how MMR's could be used; a better MMR
makes for a better TXO commitment.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
G. Andrew Stone via bitcoin-dev
2017-02-23 18:28:18 UTC
Permalink
Can an insertion ordered MMR allow an efficient nonexistence proof?
Peter Todd via bitcoin-dev
2017-02-23 18:31:40 UTC
Permalink
Post by G. Andrew Stone via bitcoin-dev
Can an insertion ordered MMR allow an efficient nonexistence proof?
Why do you want a non-existence proof?

It supports an efficient *spentness* proof, which is sufficient for what we
need in Bitcoin, and much more scalable.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-23 23:13:43 UTC
Permalink
On Thu, Feb 23, 2017 at 9:53 AM, Chris Priest via bitcoin-dev <
Post by Chris Priest via bitcoin-dev
What problem does this try to solve, and what does it have to do with bitcoin?
I can't speak to MMRs (they look a bit redundant with the actual blockchain
history to my eye) but circling back to utxo commitments, the benefits are
that it enables actual proofs of non-fraud: You can prove the validity of a
block based on just the previous block (and maybe some previous headers
because of mining rewards) and can prove to a light node that a utxo hasn't
been spent yet.

A major factor standing in the way of getting utxo commitments into blocks is
performance. The txo set is of course vastly larger and more unwieldy. If
you make the utxo commitments trail by a small fixed number of blocks
(between 2 and 5) their latency problems shouldn't be a big deal as long as
the overall performance is good enough. My thesis is that with appropriate
format and implementation tricks it's possible to get performance good
enough to no longer be a gating factor to deployment.

Disappointingly there hasn't been any feedback about my implementation,
just discussion about merkle sets generally.
Peter Todd via bitcoin-dev
2017-02-23 23:51:05 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
On Thu, Feb 23, 2017 at 9:53 AM, Chris Priest via bitcoin-dev <
Post by Chris Priest via bitcoin-dev
What problem does this try to solve, and what does it have to do with bitcoin?
I can't speak to MMRs (they look a bit redundant with the actual blockchain
history to my eye) but circling back to utxo commitments, the benefits are
In what way do you see MMRs as redundant?

Remember that with UTXO commitments, because access patterns are uniform, you'll
over time have a lot more "redundancy" in the form of lost coins evenly spread
out across the whole keyspace.
Post by Bram Cohen via bitcoin-dev
that it enables actual proofs of non-fraud: You can prove the validity of a
block based on just the previous block (and maybe some previous headers
because of mining rewards) and can prove to a light node that a utxo hasn't
been spent yet.
A major factor in the way of getting utxo commitments in blocks is
performance. The txo set is of course vastly larger and more unwieldy. If
That statement is incorrect with pruning: you can maintain a commitment to the
TXO set, without actually storing the entire TXO set, because you don't need to
store anything for nodes that have already been spent.

Concretely, this can be done with nothing more than adding a FullySpent node
type to the MMR definition I published earlier, with the rule that only the
left or the right child of an inner node may be a FullySpent node, not both; if both
sides are spent, the inner node itself becomes FullySpent. Equally, I think you
can re-use the Empty node for this, but I need to think a little about the
implications re: partial inner nodes.

Regardless, with a generalized commitment scheme, the serialization/commitment
to an Empty node is simply '0', and the encoding of an unspent txout surrounded by
spent txouts will be similar in size to a position integer followed by the
txout...
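
A sketch of how that collapse rule might look (my own illustration; the
Leaf/Inner/FullySpent names are hypothetical, not from the proofchains code):
marking a txout spent propagates upward, and an inner node whose children are
both fully spent is itself replaced by a FullySpent node.

from dataclasses import dataclass
from typing import Union

@dataclass
class Leaf:
    outpoint: bytes
    spent: bool = False

@dataclass
class FullySpent:
    """Stand-in for an entire subtree in which every txout is spent."""
    height: int

@dataclass
class Inner:
    left: "TxoNode"
    right: "TxoNode"

TxoNode = Union[Leaf, FullySpent, Inner]

def is_fully_spent(node: TxoNode) -> bool:
    return isinstance(node, FullySpent) or (isinstance(node, Leaf) and node.spent)

def mark_spent(node: TxoNode, index: int, height: int) -> TxoNode:
    """Mark txout `index` spent in a subtree of 2**height leaves, collapsing
    any inner node whose children are now both fully spent."""
    if height == 0:
        assert isinstance(node, Leaf)
        return Leaf(node.outpoint, spent=True)
    assert isinstance(node, Inner), "operation hit pruned/fully-spent data"
    half = 1 << (height - 1)
    if index < half:
        node = Inner(mark_spent(node.left, index, height - 1), node.right)
    else:
        node = Inner(node.left, mark_spent(node.right, index - half, height - 1))
    if is_fully_spent(node.left) and is_fully_spent(node.right):
        return FullySpent(height)      # only one child may be FullySpent, not both
    return node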


A subtlety of this construction is that you can only prove that a specific
txout # is unspent, but that's actually sufficient, as you can also prove what
# a txout txid corresponds to with a previous version of the MMR.
Post by Bram Cohen via bitcoin-dev
you make the utxo commitments trail by a small fixed number of blocks
(between 2 and 5) their latency problems shouldn't be a big deal as long as
the overall performance is good enough. My thesis is that with appropriate
format and implementation tricks it's possible to get performance good
enough to no longer be a gating factor to deployment.
Disappointingly there hasn't been any feedback about my implementation,
just discussion about merkle sets generally.
Well, I think at this point there's still discussion over whether or not a UTXO
set commitment is the right approach to begin with; if it's not, your
implementation isn't relevant.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-24 00:49:01 UTC
Permalink
Post by Peter Todd via bitcoin-dev
Post by Bram Cohen via bitcoin-dev
I can't speak to MMRs (they look a bit redundant with the actual blockchain
history to my eye) but circling back to utxo commitments, the benefits are
In what way do you see MMRs as redundant?
You can readily prove something is in the TXO or STXO set using the actual
blockchain, and the proofs will be nice and compact because even light
nodes are expected to already have all the historical headers.

What you can't do with MMRs or the blockchain is make a compact proof that
something is still in the utxo set, which is the whole point of utxo
commitments.

It's totally reasonable for full nodes to independently update and
recalculate the utxo set as part of their validation process. The same
can't be done for a balanced version of the txo set because it's too big.
Relying on proofs as a crutch for using the full txo set would badly
exacerbate the already extant problem of miners doing spv mining, and
increase the bandwidth a full validating node had to use by a multiple.

This whole conversation is badly sidetracked. If people have comments on my
merkle set I'd like to engage further with them, but mmrs need to be argued
independently on their own merits before being used as a counterpoint to
utxo commitments.
Peter Todd via bitcoin-dev
2017-02-24 01:09:43 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Post by Bram Cohen via bitcoin-dev
I can't speak to MMRs (they look a bit redundant with the actual blockchain
history to my eye) but circling back to utxo commitments, the benefits are
In what way do you see MMRs as redundant?
You can readily prove something is in the TXO or STXO set using the actual
blockchain, and the proofs will be nice and compact because even light
nodes are expected to already have all the historical headers.
What you can't do with MMRs or the blockchain is make a compact proof that
something is still in the utxo set, which is the whole point of utxo
commitments.
I think you've misunderstood what TXO commitments are. From my article:

"A merkle tree committing to the state of all transaction outputs, both spent
and unspent, can provide a method of compactly proving the current state of an
output."
-https://petertodd.org/2016/delayed-txo-commitments#txo-commitments:

I'm proposing that we commit to not just the set of transaction outputs, but
also the current *state* of those outputs, with the same commitment structure.

Concretely, each leaf node in the TXO commitment tree needs to commit to - at
minimum - the outpoint (txid:n) and spent/unspent status (possibly structurally
as mentioned elsewhere in this thread). It's probably also valuable to commit
to the scriptPubKey and nValue as well, though technically that's redundant as
the txid already commits to those (there are some implementation options here).
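
For instance, such a leaf commitment might look like the following sketch
(field choices and encoding are purely illustrative, not a proposed format):

import hashlib

def txo_leaf_digest(txid: bytes, n: int, spent: bool,
                    value_sat: int = 0, script_pubkey: bytes = b"") -> bytes:
    """Illustrative leaf commitment: the outpoint plus its spent/unspent state.
    nValue and scriptPubKey are optional extras; the txid already commits to them."""
    return hashlib.sha256(
        txid + n.to_bytes(4, "little") + bytes([spent]) +
        value_sat.to_bytes(8, "little") + script_pubkey
    ).digest()
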
Post by Bram Cohen via bitcoin-dev
It's totally reasonable for full nodes to independently update and
recalculate the utxo set as part of their validation process. The same
can't be done for a balanced version of the txo set because it's too big.
Why would you commit to a balanced version of the TXO set? I'm proposing
committing to an insertion-ordered list, indexed by txout #.
Post by Bram Cohen via bitcoin-dev
Relying on proofs as a crutch for using the full txo set would badly
exacerbate the already extant problem of miners doing spv mining, and
increase the bandwidth a full validating node had to use by a multiple.
This whole conversation is badly sidetracked. If people have comments on my
merkle set I'd like to engage further with them, but mmrs need to be argued
independently on their own merits before being used as a counterpoint to
utxo commitments.
Hmm? That's exactly what I'm doing. Also, as per the above, I think you've
misunderstood what my TXO commitment proposal is.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-24 02:50:10 UTC
Permalink
Post by Peter Todd via bitcoin-dev
"A merkle tree committing to the state of all transaction outputs, both spent
and unspent, can provide a method of compactly proving the current state of an
output."
The proposal on that page is of a tree which does require random-access
updates; it just positions entries in the order they happened to be added
instead of sorting by their hash. Once you start updating it to indicate
spent status, all the exact same issues of TXO size and cache coherence on
updates show up again, but now you're using a more complex bespoke data
structure instead of a basic fundamental one.
Peter Todd via bitcoin-dev
2017-02-24 02:58:11 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
"A merkle tree committing to the state of all transaction outputs, both spent
and unspent, can provide a method of compactly proving the current state of an
output."
The proposal on that page is of a tree which does require random access
updates, it just positions entries in the order they happened to be added
instead of sorting by their hash. Once you start updating it to indicate
spent status all the exact same issues of TXO size and cache coherence on
updates show up again, but now you're using a more complex bespoke data
structure instead of a basic fundamental one.
What you can't do with MMRs or the blockchain is make a compact proof that
something is still in the utxo set, which is the whole point of utxo
commitments.
So to be clear, do you agree or disagree with me that you *can* extract a
compact proof from a MMR that a given output is unspent?

I just want to make sure we're on the same page here before we discuss
performance characteristics.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-24 03:02:36 UTC
Permalink
Post by Peter Todd via bitcoin-dev
So to be clear, do you agree or disagree with me that you *can* extract a
compact proof from a MMR that a given output is unspent?
After wading through your logic on how updates are done, I agree that that
can be done, but apples to apples compact proofs can also be done in a utxo
commitment, and proofs of the validity of updates can be done in a utxo
commitment, so there isn't any performance advantage to all that extra
complexity.
Peter Todd via bitcoin-dev
2017-02-24 03:15:31 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
So to be clear, do you agree or disagree with me that you *can* extract a
compact proof from a MMR that a given output is unspent?
After wading through your logic on how updates are done, I agree that that
can be done, but apples to apples compact proofs can also be done in a utxo
commitment, and proofs of the validity of updates can be done in a utxo
commitment, so there isn't any performance advantage to all that extra
complexity.
Glad we're on the same page with regard to what's possible in TXO commitments.

Secondly, am I correct in saying your UTXO commitments scheme requires random
access? While you describe it as a "merkle set", obviously to be merkelized
it'll have to have an ordering of some kind. What do you propose that ordering
to be?

Maybe more specifically, what exact values do you propose to be in the set?
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-24 03:32:43 UTC
Permalink
Post by Peter Todd via bitcoin-dev
Glad we're on the same page with regard to what's possible in TXO commitments.
Secondly, am I correct in saying your UTXO commitments scheme requires random
access? While you describe it as a "merkle set", obviously to be merkelized
it'll have to have an ordering of some kind. What do you propose that ordering
to be?
The ordering is by the bits in the hash. Technically it's a Patricia Trie.
I'm using 'merkle tree' to refer to basically anything with a hash root.
Post by Peter Todd via bitcoin-dev
Maybe more specifically, what exact values do you propose to be in the set?
That is unspecified in the implementation; it just takes a 256-bit value
which is presumably a hash of something. The intention is to nail down a
simple format and demonstrate good performance and leave those semantics to
a higher layer. The simplest thing would be to hash together the txid and
output number.
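
As a sketch of that simplest option (encoding purely illustrative):

import hashlib

def utxo_set_entry(txid: bytes, n: int) -> bytes:
    # The 256-bit value inserted into the merkle set is just a hash of the outpoint.
    return hashlib.sha256(txid + n.to_bytes(4, "little")).digest()
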
Peter Todd via bitcoin-dev
2017-02-24 04:36:13 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Glad we're on the same page with regard to what's possible in TXO commitments.
Secondly, am I correct in saying your UTXO commitments scheme requires random
access? While you describe it as a "merkle set", obviously to be merkelized
it'll have to have an ordering of some kind. What do you propose that ordering
to be?
The ordering is by the bits in the hash. Technically it's a Patricia Trie.
I'm using 'merkle tree' to refer to basically anything with a hash root.
The hash of what? The values in the set?
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Maybe more specifically, what exact values do you propose to be in the set?
That is unspecified in the implementation, it just takes a 256 bit value
which is presumably a hash of something. The intention is to nail down a
simple format and demonstrate good performance and leave those semantics to
a higher layer. The simplest thing would be to hash together the txid and
output number.
Ok, so let's assume the values in the set are the unspent outpoints.

Since we're ordering by the hash of the values in the set, outpoints will be
distributed uniformly in the set, and thus the access pattern of data in the
set is uniform.

Now let's fast-forward 10 years. For the sake of argument, assume that for
every 1 UTXO in the set that corresponds to funds in someone's wallet that are
likely to be spent, there are 2^12 = 4096 UTXO's that have been permanently
lost (and/or created in spam attacks) and thus will never be spent.

Since lost UTXO's are *also* uniformly distributed, if I'm processing a new
block that spends 2^12 = 4096 UTXO's, on average for each UTXO spent, I'll
have to update log2(4096) = 12 more digests than I would have had those "dead"
UTXO's not existed.

Concretely, imagine our UTXO set had just 8 values in it, and we were updating
two of them:

               #
              / \
             /   \
            /     \
           /       \
          /         \
         #           #
        / \         / \
       /   \       /   \
      #     .     .     #
     / \   / \   / \   / \
    .   X .   . .   . X   .

To mark two coins as spent, we've had to update 5 inner nodes.


Now let's look at what happens in an insertion-ordered TXO commitment scheme.
For sake of argument, let's assume the best possible case, where every UTXO
spent in that same block was recently created. Since the UTXO's are recently
created, chances are almost every single one of those "dead" UTXO's will have
been created in the past. Thus, since this is an insertion-ordered data
structure, those UTXO's exist in an older part of the data structure that our
new block doesn't need to modify at all.

Concretely, again let's imagine a TXO commitment with 8 values in it, and two
of them being spent:

               #
              / \
             /   \
            /     \
           /       \
          /         \
         .           #
        / \         / \
       /   \       /   \
      .     .     .     #
     / \   / \   / \   / \
    .   . .   . .   . X   X

To mark two coins as spent, we've only had to update 3 inner nodes; while our
tree is higher with those lost coins, those extra inner nodes are amortised
across all the coins we have to update.


The situation gets even better when we look at the *new* UTXO's that our block
creates. Suppose our UTXO set has size n. To mark a single coin as spent, we
have to update log2(n) inner nodes. We do get to amortise this a bit at the top
levels in the tree, but even if we assume the amortisation is totally free,
we're updating at least log2(n) - log2(m) inner nodes (where m is the number
of new UTXO's the block adds) "under" the amortised
nodes at the top of the tree for *each* new node.

Meanwhile with an insertion-ordered TXO commitment, each new UTXO added to the
data set goes in the same place - the end. So almost none of the existing data
needs to be touched to add the new UTXOs. Equally, the hashing required for the
new UTXO's can be done in an incremental fashion that's very L1/L2 cache
friendly.
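
That append-only behaviour is visible in the usual peaks representation of an
MMR: appending touches only the right edge, merging equal-height peaks as it
goes. A generic sketch (not Peter's mmr.py):

import hashlib
from typing import List, Tuple

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def mmr_append(peaks: List[Tuple[int, bytes]], value: bytes) -> List[Tuple[int, bytes]]:
    """Append one value to an MMR kept as a list of (height, digest) peaks.
    Only the right edge is touched; older peaks are left completely alone."""
    peaks = peaks + [(0, H(b"leaf", value))]
    while len(peaks) >= 2 and peaks[-1][0] == peaks[-2][0]:
        (h, right), (_, left) = peaks.pop(), peaks.pop()
        peaks.append((h + 1, H(b"inner", left, right)))
    return peaks

def mmr_root(peaks: List[Tuple[int, bytes]]) -> bytes:
    """Bag the peaks into a single commitment."""
    return H(b"root", *(digest for _, digest in peaks))

# Usage: appending N items does O(1) amortized hashing, all at the right edge.
peaks: List[Tuple[int, bytes]] = []
for i in range(10):
    peaks = mmr_append(peaks, i.to_bytes(8, "little"))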


tl;dr: Precisely because access patterns in TXO commitments are *not* uniform,
I think we'll find that from a L1/L2/etc cache perspective alone, TXO
commitments will result in better performance than UTXO commitments.


Now it is true that Bitcoin's current design means we'll need a map of
confirmed outpoints to TXO insertion order indexes. But it's not particularly
hard to add that "metadata" to transactions on the P2P layer in the same way
that segwit added witnesses to transactions without modifying how txids were
calculated; if you only connect to peers who provide you with TXO index
information in blocks and transactions, you don't need to keep that map
yourself.

Finally, note how this makes transactions *smaller* in many circumstances: it's
just an 8-byte max index rather than a 36-byte outpoint.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-24 22:20:19 UTC
Permalink
So your idea is to cluster entries by entry time because newer things are
more likely to leave and updating multiple things near each other is
cheaper?

That can be done with my tool. Instead of using hashes for the values being
stored, you use position entries. The first entry gets a value of all
zeros, the next one a one followed by all zeros, then the next two
correspond to the first two with the second bit flipped to one, then the
next four the first four with the third bit flipped to one, etc. It
probably performs a little bit better to do it two bits at a time instead
of one so that the entries are 00, 01, 10, 11, 0001, 0010, 0011, 0101,
0110, 0111, 1001, etc. If you were to really use this you'd probably want
to add some optimizations to use the fact that the terminals fit in 64
bits instead of 256, but it mostly works unchanged, and gets whatever
benefits there are to this clustering plus the high performance
implementation tricks I've built which I keep complaining that nobody's
giving feedback on.
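
Read one bit at a time, the scheme above amounts to giving entry i the
fixed-width bit reversal of i as its key; a small sketch of that reading,
illustrative only:

def position_key(i: int, width: int = 256) -> int:
    """Entry 0 -> all zeros, entry 1 -> a one followed by zeros, entries 2 and 3
    -> entries 0 and 1 with the second bit set, and so on: the fixed-width bit
    reversal of the insertion index."""
    out = 0
    for bit in range(width):
        if (i >> bit) & 1:
            out |= 1 << (width - 1 - bit)
    return out

# First few keys, shown with width 4 for readability:
assert [format(position_key(i, 4), "04b") for i in range(6)] == \
       ["0000", "1000", "0100", "1100", "0010", "1010"]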

I'm not sold on this being a win: The empirical access patterns are
unknown, it requires an extra cache miss per lookup to find the entry
number, it may be that everything is optimized well enough without it for
there to be no meaningful gains, and it's a bunch of extra complexity. What
should be done is that a plain vanilla UTXO set solution is optimized as
well as it can be first, and then the insertion ordering trick is tried as
an optimization to see if it's an improvement. Without that baseline
there's no meaningful basis for comparison, and I'm quite confident that a
naive implementation which just allocates individual nodes will
underperform the thing I've come up with, even without adding optimizations
related to fitting in 64 bits.
Peter Todd via bitcoin-dev
2017-02-25 04:12:02 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
So your idea is to cluster entries by entry time because newer things are
more likely to leave and updating multiple things near each other is
cheaper?
Yes, exactly.
Post by Bram Cohen via bitcoin-dev
That can be done with my tool. Instead of using hashes for the values being
stored, you use position entries. The first entry gets a value of all
zeros, the next one a one followed by all zeros, then the next two
correspond to the first two with the second bit flipped to one, then the
next four the first four with the third bit flipped to one, etc. It
probably performs a little bit better to do it two bits at a time instead
of one so that the entries are 00, 01, 10, 11, 0001, 0010, 0011, 0101,
0110, 0111, 1001, etc. If you were to really use this you'd probably want
to to add some optimizations to use the fact that the terminals fit in 64
bits instead of 256, but it mostly works unchanged, and gets whatever
So to be clear, what you're proposing there is to use the insertion order as
the index - once you go that far you've almost entirely re-invented my
proposal!

In fact, when I was working on my proofchains/proofmarshal libraries I put some
thought into whether or not I could leave out the MMR merkelized list
implementation and use only the key-value map I also wrote. I decided to
include both as they aren't quite the same datastructure - using a list for a
list has advantages.
Post by Bram Cohen via bitcoin-dev
benefits there are to this clustering plus the high performance
implementation tricks I've built which I keep complaining that nobody's
giving feedback on.
Your merkle-set implementation is 1500 lines of densely written Python with
almost no comments, and less than 100 lines of (also uncommented) tests. By
comparison, my Python MMR implementation is 300 lines of very readable Python
with lots of comments, a 200 line explanation at the top, and 200 lines of
(commented) tests. Yet no-one is taking the (still considerable) effort to
understand and comment on my implementation. :)

Fact is, what you've written is really daunting to review, and given it's not
in the final language, it's unclear what basis to review it on anyway. I
suspect you'd get more feedback if the codebase were better commented, in a
production language, and you had actual real-world benchmarks and performance
figures.

In particular, while at the top of merkle_set.py you have a list of advantages,
and a bunch of TODO's, you don't explain *why* the code has any of these
advantages. To figure that out, I'd have to read and understand 1500 lines of
densely written Python. Without a human-readable pitch, not many people are
going to do that, myself included.
Post by Bram Cohen via bitcoin-dev
I'm not sold on this being a win: The empirical access patterns are
unknown,
Lost coins alone guarantee that access patterns will be biased towards new
coins being more likely to be spent. That basis alone is sufficient to justify
an insertion-ordered data structure. Additionally, people have done graphs of
the average age of UTXO's when spent, and that data clearly shows that newer
coins are more likely to be spent than older coins.
Post by Bram Cohen via bitcoin-dev
unknown, it requires an extra cache miss per lookup to find the entry
number,
Like I mentioned in the email you're replying to, that extra lookup can be
easily avoided with a change to how transactions/blocks are serialized; if all
your peers support TXO commitments you can even discard the lookup database
entirely, as it's only a backwards compatibility measure.
Post by Bram Cohen via bitcoin-dev
it may be that everything is optimized well enough without it for
there to be no meaningful gains, and it's a bunch of extra complexity. What
Optimization is itself extra complexity. If your data structure has worse
inherent performance, and you have to make up the difference with a highly
optimized implementation, that's likely to lead to more overall complexity than
using a data structure with better inherent performance.

Your current merkle-set implementation definitely _is_ very complex. An
apples-to-apples comparison is with my merkelized key:value tree(1), also a
patricia tree, which like the MMR is only about 300 lines of well-commented and
straight-forward code.

1) https://github.com/proofchains/python-proofmarshal/blob/master/proofmarshal/merbinnertree.py
Post by Bram Cohen via bitcoin-dev
should be done is that a plain vanilla UTXO set solution is optimized as
well as it can be first, and then the insertion ordering trick is tried as
an optimization to see if it's an improvement. Without that baseline
there's no meaningful basis for comparison, and I'm quite confident that a
naive implementation which just allocates individual nodes will
underperform the thing I've come up with, even without adding optimizations
related to fitting in 64 bits.
To be clear, "insertion ordering" isn't a simple trick, it's a fundamental
change to what the data structure is. Once you do that, you're talking about my
proposal.
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-02-25 06:23:20 UTC
Permalink
Post by Peter Todd via bitcoin-dev
So to be clear, what you're proposing there is to use the insertion order as
the index - once you go that far you've almost entirely re-invented my
proposal!
I'm not 'proposing' this; I'm saying it could be done simply, but I'm
skeptical of the utility. Probably the most compelling argument for it is
that the insertion-indexed values are much smaller, so they can be compacted
down a lot, resulting in less memory use, more locality, and fewer
hashes, but your implementation doesn't take advantage of that.
Post by Peter Todd via bitcoin-dev
Your merkle-set implementation is 1500 lines of densely written Python
The reference implementation, which is included in those 1500 lines, is less
than 300 lines and fairly straightforward. The non-reference implementation
always behaves semantically identically to the reference implementation; it
just does so faster and uses less memory.
Post by Peter Todd via bitcoin-dev
with almost no comments,
The comments at the top explain both the proof format and the in-memory
data structures very precisely. The whole codebase was reviewed by a
coworker of mine and comments were added explaining the subtleties which
tripped him up.
Post by Peter Todd via bitcoin-dev
and less than a 100 lines of (also uncommented) tests.
Those tests get 98% code coverage and extensively hit not only the lines of
code but the semantic edge cases as well. The lines which aren't hit are
convenience functions and error conditions of the parsing code for when
it's passed bad data.
Post by Peter Todd via bitcoin-dev
By comparison, my Python MMR implementation is 300 lines of very readable Python
with lots of comments, a 200 line explanation at the top, and 200 lines of
(commented) tests. Yet no-one is taking the (still considerable) effort to
understand and comment on my implementation. :)
Given that maaku's Merkle prefix trees were shelved because of performance
problems despite being written in C and operating in basically the same way
as your code and my reference code, it's clear that non-optimized Python
won't be touching the bitcoin codebase any time soon.
Post by Peter Todd via bitcoin-dev
Fact is, what you've written is really daunting to review, and given it's not
in the final language anyway, it's unclear what basis to review it on anyway.
It should be reviewed based on semantic correctness and performance.
Performance can only be accurately and convincingly determined by porting
to C and optimizing it, which mostly involves experimenting with different
values for the two passed-in magic numbers.
Post by Peter Todd via bitcoin-dev
I suspect you'd get more feedback if the codebase was better commented, in a
production language, and you have actual real-world benchmarks and performance
figures.
Porting to C should be straightforward. Several people have already
expressed interest in doing so, and it's written in intentionally C-ish
Python, resulting in some rather odd idioms, which is a big part of why you
think it looks 'dense'. A lot of that weird offset math should be much more
readable in C because it's all structs and x.y notation can be used instead
of adding offsets.
Post by Peter Todd via bitcoin-dev
In particular, while at the top of merkle_set.py you have a list of advantages,
and a bunch of TODO's, you don't explain *why* the code has any of these
advantages. To figure that out, I'd have to read and understand 1500 lines of
densely written Python. Without a human-readable pitch, not many people are
going to do that, myself included.
It's all about cache coherence. When doing operations it pulls in a bunch
of things which are near each other in memory instead of jumping all over
the place. The improvements it gets should be much greater than the ones
gained from insertion ordering, although the two could be accretive.
G. Andrew Stone via bitcoin-dev
2017-02-28 16:43:29 UTC
Permalink
I can understand how Bram's transaction double sha256 hashed UTXO set
patricia trie allows a client to quickly validate inputs because the inputs
of a transaction are specified in the same manner. So to verify that an
input is unspent the client simply traverses the patricia trie.

It also makes sense that if transaction inputs were specified by a [block
height, tx index, output index] triple we'd have a much more size-efficient
transaction format. This format would make lookup pretty simple in
Peter's pruned time-ordered TXO merkle mountain range, although you'd have
to translate the triple to an index, which means we'd have to at a minimum
keep track of the number of TXOs in each block, and then probably do a
linear search starting from the location where the block's TXOs begin in
the MMR. (The ultimate option I guess is to specify transaction inputs by
a single number which is essentially the index of the TXO in a (never
actually created) insertion-ordered TXO array...)

But since transactions' prevouts are not specified by [block height, tx
index, output index] or by TXO index, I don't understand how an insertion
ordered TXO tree can result in efficient lookups. Can you help me
understand this?



Bram Cohen via bitcoin-dev
2017-02-28 23:10:16 UTC
Permalink
Post by G. Andrew Stone via bitcoin-dev
But since transactions' prevouts are not specified by [block height, tx
index, output index] or by TXO index, I don't understand how an insertion
ordered TXO tree can result in efficient lookups. Can you help me
understand this?
You have to have a lookup table going from prevouts to txo index. Lookups
on that are relatively fast because looking up things in a hashtable is a
single cache miss, while looking up things in a tree costs a logarithmic
number of cache misses.

The purported benefit of insertion ordering is that because recent things are
spent much more than old things, there's a lot of clustering of updates. If
you update two things near each other they share the top branches of
updates in the tree, resulting in less hashing and fewer cache misses. But since
everything is log scale I suspect such benefits are small. My guess is
transaction ordering has much larger potential from compression, because you
can cram information about lots of things into a single leaf node since they
have very small diffs from each other. That said, those benefits are also
smaller than and accretive to the simple implementation tricks I already
implemented which cause things near each other in the tree to be near each
other in memory.
Pieter Wuille via bitcoin-dev
2017-02-28 23:24:28 UTC
Permalink
On Feb 28, 2017 15:10, "Bram Cohen via bitcoin-dev" <
Post by G. Andrew Stone via bitcoin-dev
But since transactions' prevouts are not specified by [block height, tx
index, output index] or by TXO index, I don't understand how an insertion
ordered TXO tree can result in efficient lookups. Can you help me
understand this?
You have to have a lookup table going from prevouts to txo index. Lookups
on that are relatively fast because looking up things in a hashtable is a
single cache miss, while looking up things in a tree is logarithmic cache
misses.


I'm wondering if there is some confusion here.

Yes, someone needs to have a lookup table from prevouts to TXO tree
positions. But because an insertion-ordered TXO tree does not rebalance,
that table can be maintained by wallets or service providers for just their
own coins, instead of by every full node and miner individually for
everyone's coins.

In the simplest committed TXO model, full nodes simply maintain the TXO
root hash, and every transaction/block comes with a proof that its inputs
are in the TXO tree, and the necessary information to update the root after
spending the inputs and adding the outputs.
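
As a toy version of that model (a fixed-size perfect tree rather than a real
TXO tree; all names are illustrative): the full node keeps only `root`, and a
proof for a leaf both authenticates its current value and lets the node
recompute the root after changing it.

import hashlib
from typing import List

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def root_from_proof(index: int, leaf: bytes, siblings: List[bytes]) -> bytes:
    """Fold a leaf and its sibling path (bottom-up) into a root digest."""
    node = H(b"leaf", leaf)
    for level, sib in enumerate(siblings):
        node = H(b"inner", sib, node) if (index >> level) & 1 else H(b"inner", node, sib)
    return node

def spend(root: bytes, index: int, old_leaf: bytes, new_leaf: bytes,
          siblings: List[bytes]) -> bytes:
    """Stateless update: check the input against our commitment, then return
    the new commitment after replacing the leaf (e.g. marking it spent)."""
    assert root_from_proof(index, old_leaf, siblings) == root, "invalid proof"
    return root_from_proof(index, new_leaf, siblings)

# Usage with a 4-leaf toy tree: the full node stores nothing but `root`.
lh = [H(b"leaf", v) for v in (b"txo0", b"txo1", b"txo2", b"txo3")]
n01, n23 = H(b"inner", lh[0], lh[1]), H(b"inner", lh[2], lh[3])
root = H(b"inner", n01, n23)
root = spend(root, 2, b"txo2", b"spent:txo2", siblings=[lh[3], n01])
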
--
Pieter
Bram Cohen via bitcoin-dev
2017-03-01 01:47:30 UTC
Permalink
Post by Pieter Wuille via bitcoin-dev
Yes, someone needs to have a lookup table from prevouts to TXO tree
positions. But because an insertion-ordered TXO tree does not rebalance,
that table can be maintained by wallets or service providers for just their
own coins, instead of by every full node and miner individually for
everyone's coins.
That falls apart if you want to support proofs of non-spend, which is the
point of the whole exercise.
Peter Todd via bitcoin-dev
2017-03-01 01:56:16 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
Post by Pieter Wuille via bitcoin-dev
Yes, someone needs to have a lookup table from prevouts to TXO tree
positions. But because an insertion-ordered TXO tree does not rebalance,
that table can be maintained by wallets or service providers for just their
own coins, instead of by every full node and miner individually for
everyone's coins.
That falls apart if you want to support proofs of non-spend, which is the
point of the whole exercise
Can you explain in more detail what you mean there?
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Peter Todd via bitcoin-dev
2017-03-01 22:31:01 UTC
Permalink
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
Your merkle-set implementation is 1500 lines of densely written Python
The reference implementation which is included in those 1500 lines is less
than 300 lines and fairly straightforward. The non-reference implementation
always behaves semantically identically to the reference implementation, it
just does so faster and using less memory.
Great!

But do you see my point here? Even though I spent some time reading through
that code, I didn't realise you had a 300 line reference implementation
embedded within those 1500 lines. This makes it less likely for you to get any
review on either.

A better way to present your work would have been to at least explain that at
the top of the file, and perhaps even better, split the reference
implementation and optimized implementation into two separate files. If you did
this, you'd be more likely to get others to review your work.
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
with almost no comments,
The comments at the top explain both the proof format and the in-memory
data structures very precisely. The whole codebase was reviewed by a
coworker of mine and comments were added explaining the subtleties which
tripped him up.
Yes, and it's good that you have those comments. But the codebase itself could
definitely use more, and adding those comments would help get more people
reviewing your work.
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
and less than a 100 lines of (also uncommented) tests.
Those tests get 98% code coverage and extensively hit not only the lines of
code but the semantic edge cases as well. The lines which aren't hit are
convenience functions and error conditions of the parsing code for when
it's passed bad data
Great! But you see how without comments, it'll take a tremendous amount of work
for an external reviewer like myself to determine what is being tested, and
what edge cases you're targeting.

In fact, I'd suggest that for things like edge cases, you test edge cases in
separate unit tests that explain what edge cases you're trying to catch.
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
By comparison, my Python MMR implementation is 300 lines of very readable Python
with lots of comments, a 200 line explanation at the top, and 200 lines of
(commented) tests. Yet no-one is taking the (still considerable) effort to
understand and comment on my implementation. :)
Given that maaku's Merkle prefix trees were shelved because of performance
problems despite being written in C and operating in basically the same way
as your code and my reference code, it's clear that non-optimized Python
won't be touching the bitcoin codebase any time soon.
To be clear, I gave my implementation as an example of how hard it is to get
external review, not to suggest it's going to be a part of Bitcoin; I've
pointed a lot of people to it when they asked for a MMR implementation, and I'm
sure if some of those people had reviewed it carefully they would have
suggested changes. Yet they haven't, because doing good review is a lot of
work!
Post by Bram Cohen via bitcoin-dev
Post by Peter Todd via bitcoin-dev
In particular, while at the top of merkle_set.py you have a list of advantages,
and a bunch of TODO's, you don't explain *why* the code has any of these
advantages. To figure that out, I'd have to read and understand 1500 lines of
densely written Python. Without a human-readable pitch, not many people are
going to do that, myself included.
It's all about cache coherence. When doing operations it pulls in a bunch
of things which are near each other in memory instead of jumping all over
the place. The improvements it gets should be much greater than the ones
gained from insertion ordering, although the two could be accretive.
That's good, but that paragraph should be part of your MerkleSet git repo,
preferably in the README, where reviewers will immediately find it and get
excited about reviewing your code.
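For what it's worth, here's a rough sketch of that idea as I understand it.
This isn't your actual layout, just an illustration of packing a small
subtree's hashes into one contiguous buffer so a rehash walks neighbouring
memory instead of chasing pointers between separately allocated node objects:

import hashlib

HASH_LEN = 32
BLOCK_DEPTH = 4                              # illustrative: 2**4 leaves per block
NODES_PER_BLOCK = 2 ** (BLOCK_DEPTH + 1) - 1

class SubtreeBlock:
    """All nodes of one small subtree, heap-ordered in a single buffer."""
    def __init__(self):
        self.buf = bytearray(HASH_LEN * NODES_PER_BLOCK)

    def get(self, i):
        return bytes(self.buf[i * HASH_LEN:(i + 1) * HASH_LEN])

    def set(self, i, h):
        self.buf[i * HASH_LEN:(i + 1) * HASH_LEN] = h

    def rehash_path(self, leaf_index):
        # Recompute hashes from a leaf up to the block root; every access
        # stays inside self.buf, so the whole path tends to be cache-resident.
        i = NODES_PER_BLOCK // 2 + leaf_index    # heap index of the leaf
        while i > 0:
            parent = (i - 1) // 2
            left, right = 2 * parent + 1, 2 * parent + 2
            self.set(parent, hashlib.sha256(self.get(left) + self.get(right)).digest())
            i = parent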
--
https://petertodd.org 'peter'[:-1]@petertodd.org
Bram Cohen via bitcoin-dev
2017-03-31 20:38:16 UTC
Permalink
Post by Peter Todd via bitcoin-dev
A better way to present your work would have been to at least explain that at
the top of the file, and perhaps even better, split the reference
implementation and optimized implementation into two separate files. If you did
this, you'd be more likely to get others to review your work.
I've now added explanation to the README, reorganized the files, and added
some comments:

https://github.com/bramcohen/MerkleSet

Post by Peter Todd via bitcoin-dev
In fact, I'd suggest that for things like edge cases, you test edge cases in
separate unit tests that explain what edge cases you're trying to catch.
The tests work by exercising the code heavily on pseudorandom data, an
approach which does a good job of hitting all the lines of code and the
semantic edge cases while requiring very little updating as the
implementation changes, at the expense of the tests taking a while to run.
The advantage of hand-written unit tests for specific cases is that they run
almost instantly, at the cost of requiring painstaking maintenance and
missing more. I've come to favor the pseudorandom approach in my old age.

The proportion of code devoted to tests is more than it looks like at first
blush, because all the audit methods are just for testing.
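To give a flavour of that style, here's a stripped-down sketch (the
make_set, add, remove and is_included names are placeholders rather than the
real method names): drive the structure with a deterministic pseudorandom
stream of operations and cross-check every answer against a trivially
correct reference.

from hashlib import sha256

def exercise(make_set, rounds=10000):
    impl = make_set()                 # implementation under test (placeholder API)
    reference = set()                 # trivially correct reference
    r = b'fixed seed'                 # deterministic stream, so reruns hit the same cases
    for _ in range(rounds):
        r = sha256(r).digest()
        item = sha256(b'item' + r[:2]).digest()   # small key space forces repeats and removals of members
        if r[2] % 2 == 0:
            impl.add(item)
            reference.add(item)
        else:
            impl.remove(item)         # assumes removing a non-member is a no-op
            reference.discard(item)
        # the implementation must agree with the reference after every step
        assert impl.is_included(item) == (item in reference)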
praxeology_guy via bitcoin-dev
2017-04-01 10:18:12 UTC
Permalink
Peter Todd,

This MMR structure looks good to me. I really like how wallets keep their MMR proof and txo index instead of requiring the entire network to maintain an index on txids w/ plain old utxo snapshots.

Re: "only left or right child of inner node be a fully spent node"... that sounds fine to me, but the software should virtually consider that the previous dissapearing leaf nodes still exist. It would instead say be a special case handled by the meta hashing function. Would save a good amount of time from unneccesary hashing. Might also do the rule: if a parent node has a single fully spent child node, its hash is equal to its other child's hash.

Below are questions about txo/utxo MMR commitments after reading: "https://petertodd.org/2016/delayed-txo-commitments".

I'm mainly concerned about the performance of recalculating all of the node hashes on old spends. But probably with a long enough delay policy, it shouldn't be an issue.

Then there are the issues of people keeping their MMR proofs up to date and discovering received txos before they get pruned. It sure would be nice if a wallet didn't have to keep updating its MMR proof. Hopefully spends would refer to old txos by their MMR index.

How are you ordering MMR additions? Are you only adding old utxos to the MMR? Or every old txo? I think you are doing all old txos (most of which would be spent nodes), but why not just old utxos? Are you doing it to make the MMR index equal the blockchain txo index, so such an index can be used in all tx inputs except for non-confirmed transactions? Potentially a tx could use an MMR.ix, an all-blocks' txo.ix (if we want to maintain that index), or tx.id & vout.ix, depending on how old the tx is.

What is the process for removing old utxos from the utxo set, and placing them into the MMR? Are you going to keep height in the utxo db, and then iterate through the whole thing?

Are you still proposing a 1 year txo commitment delay? Do you have any data/performance studies judging the cost of a larger utxo set and longer delay vs a smaller utxo set and shorter delay? I would figure a longer delay would be better as long as the utxo set doesn't get too big. A longer delay means less MMR maintenance.

Have you considered not making a new commitment on every block, and instead making one every 6*24 blocks or so? This could reduce the number of times the same nodes are re-hashed, while not having much impact on security.

What about re-orgs? You'd have to restore the previous leaf & inner nodes. You'd need both the txos and MMR proofs, right? It looks like you clear this info from the TXO journal when the delay duration threshold is met. I guess this info might also be stored with the block data, and could be recovered from there.

What are your current thoughts on block size weighting for MMR proofs? Just count the extra byte length that full nodes need to relay, similarly to SegWit witness data?

Cheers,
Praxeology Guy
praxeology_guy via bitcoin-dev
2017-04-01 19:46:12 UTC
Permalink
gmaxwell told me that most nodes would keep a full copy of the top of the MMR tree.

Here I am exploring how this could be turned into policy to solve two problems:
- MMR proofs change over time
- How to coordinate nodes to get them to keep different portions of the MMR, so that everyone can prune most of the structure, but the entire network still retains multiple copies of the full MMR.

Define deltaLeafHeight as the number of tree layers between a node and the leaves in the MMR data structure. We make it a policy that nodes are expected to keep all MMR nodes with deltaLeafHeight >= DLH_REQUIRED, but are free to prune any node with deltaLeafHeight < DLH_REQUIRED. Of course a node could also prune at DLH_REQUIRED or higher, but what I am proposing is that messages and proofs by default would only include nodes with deltaLeafHeight < DLH_REQUIRED.
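Stated as a tiny sketch, with DLH_REQUIRED = 3 only as an example value:

DLH_REQUIRED = 3

def must_retain(delta_leaf_height):
    # Every node on the network is expected to keep MMR nodes at or above
    # DLH_REQUIRED layers from the leaves.
    return delta_leaf_height >= DLH_REQUIRED

def in_default_proof(delta_leaf_height):
    # Messages and proofs by default carry only the region peers may have pruned.
    return delta_leaf_height < DLH_REQUIRED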

Given the above, if a wallet didn't want to be continuously concerned about updating the MMR proof for its coins, then for each coin:
- Store the set of utxo digests that are leaf descendants of the "root nearest" node at deltaLeafHeight = DLH_REQUIRED. Call such a set of digests the "pruned relatives".
- Pruned relative count = 2^DLH_REQUIRED - 1
- Guessing the spentness status of the pruned relatives would, in the worst case, take 2^(pruned relative count) guesses.
- In the case where the MMR holds all txos (not just utxos at addition time), the wallet should also keep a record of which of the pruned relatives were utxos.
- Any future information discovered about whether a pruned relative is spent would reduce the worst-case guess count by a factor of 2.

As an example, in the case where DLH_REQUIRED = 3:
- pruned relative count = 7
- worst case spentness guess count = 128
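A quick sketch checking that arithmetic:

def pruned_relative_count(dlh_required):
    # leaves under the nearest ancestor everyone retains, minus the wallet's own txo
    return 2 ** dlh_required - 1

def worst_case_guess_count(dlh_required):
    # each pruned relative's spentness is one unknown bit
    return 2 ** pruned_relative_count(dlh_required)

assert pruned_relative_count(3) == 7
assert worst_case_guess_count(3) == 128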

Wallets storing the digests of pruned relatives could also help the entire network discover otherwise lost portions of the MMR. If wallets stored not just the pruned relatives' digests but also their corresponding utxos, they could help other nodes find lost coins.

Cheers,
Praxeology Guy
