[bitcoin-dev] Using a storage engine without UTXO-index

On 04/06/2017 03:12 PM, Tomas via bitcoin-dev wrote:

Hi Tomas,

This is the approach that genjix used in libbitcoin version2. With the
exception of de-linking (not deleted) in the case of reorgs, the
entire store is append only, implemented in a small set of memory
mapped files. The downsides to the approach are:

(1) higher than necessary storage space requirement due to storing the
indexing data required for correlate the spends, and

(2) higher than necessary validation complexity and cost in terms of
computing the spent-ness (including spender height) of an output.

His implementation used a hash table, so performance-wise it did quite
well and would theoretically outperform a tree, O(1) vs. O(log2(N)).

Post by Tomas via bitcoin-dev
This allows for much better concurrency, as not only blocks, but
also individual inputs can be verified fully in parallel.

I was successful in parallelizing input validation (across the inputs
of an unconfirmed tx and across the set of all inputs in a block)
using the v2 store. However, it is not the case that the spends
approach is necessary for concurrency.

To resolve the above two problems the version3 store does not use a
spends table/index. Nor does it store any table of UTXOs. Yet
validation is highly parallelized. Instead of additional indexes it
uses the tx hash table, augmented with 32 bits per output for spender
height. So there is a O(1) cost of finding the tx and a O(N) cost of
finding the spender height where N is the number of outputs in the tx.
But because the number of outputs in a tx is bounded (by block size)
this is constant time in the number of transactions.

This works out much faster than the spends table, and without the
storage cost or complexity disadvantages. It also scales with
available hardware, as the memory mapped files become in-memory hash
tables. For low memory machines we found it was important to implement
an opaque UTXO cache to limit paging, but for higher end systems zero
cache is optimal.

Post by Tomas via bitcoin-dev
I am sharing this not only to ask for your feedback, but also to
call for a clear separation of protocol and implementations: As
this solution, reversing the costs of outputs and inputs, seems to
have excellent performance characteristics (as shown in the test
results), updates to the protocol addressing the UTXO growth, might
not be worth considering *protocol improvements* and it might be
best to address these concerns as implementation details.

Tomas via bitcoin-dev

2017-04-07 00:17:47 UTC

Hi Eric,

Thanks, but I get the impression that the similarity is rather
superficial.

Post by Eric Voskuil via bitcoin-dev
(1) higher than necessary storage space requirement due to storing the
indexing data required for correlate the spends, and

Hmm. No. Spends are simply scanned in the spend-tree (full tree,
prunable, fully 5.6gb), or caught by the spend-index (bit index,
non-prunable, fully 180mb). Neither impose significant storage
requirements.

Post by Eric Voskuil via bitcoin-dev
2) higher than necessary validation complexity and cost in terms of
computing the spent-ness (including spender height) of an output.
With the exception of de-linking (not deleted) in the case of reorgs, the
entire store is append only, implemented in a small set of memory
mapped file

I guess this is the key difference. As the spend-tree stores the spend
information in a tree structure, no reorgs are required, and the
resulting code is actually much less complex.

Bitcrust simply scans the tree. Although earlier designs used a
skip-list, it turns out that accompanied by a spent-index lagging a few
blocks behind, raw scanning is faster then anything even though it needs
to scan ~5 blocks times ~4000 inputs before reaching the first
spent-index, the actual scan is highly cache efficient and little more
then a "REP SCASQ", reaching sub-microsecond per input on each core
*including* the lookup in the spend index.

Post by Eric Voskuil via bitcoin-dev
I don't follow this part, maybe you could clarify. A spends index
grows with the size of the spend set (forever) as it cannot be pruned,
which certainly exceeds the size of the UTXO set (unless nothing is
spent). The advantage is that you don't have to keep rewriting the
store when you use a spends set (because the store can be append only).

My point is, that the spend tree grows per *input* of a transaction
instead of per *output* of a transaction, because this is what is
scanned on order validation.

The spend tree can be pruned because the spend index (~200mb) catches
early spends.

Disregarding the baseload script validation, the peak load order
validation of bitcrust is more negatively effected by a transaction with
many inputs than by a transaction of many outputs.

I encourage you to check out the results at https://bitcrust.org

Regards,
Tomas

Post by Eric Voskuil via bitcoin-dev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Tomas,

This is the approach that genjix used in libbitcoin version2. With the
exception of de-linking (not deleted) in the case of reorgs, the
entire store is append only, implemented in a small set of memory
(1) higher than necessary storage space requirement due to storing the
indexing data required for correlate the spends, and
(2) higher than necessary validation complexity and cost in terms of
computing the spent-ness (including spender height) of an output.
His implementation used a hash table, so performance-wise it did quite
well and would theoretically outperform a tree, O(1) vs. O(log2(N)).

Post by Tomas via bitcoin-dev
This allows for much better concurrency, as not only blocks, but
also individual inputs can be verified fully in parallel.

I was successful in parallelizing input validation (across the inputs
of an unconfirmed tx and across the set of all inputs in a block)
using the v2 store. However, it is not the case that the spends
approach is necessary for concurrency.
To resolve the above two problems the version3 store does not use a
spends table/index. Nor does it store any table of UTXOs. Yet
validation is highly parallelized. Instead of additional indexes it
uses the tx hash table, augmented with 32 bits per output for spender
height. So there is a O(1) cost of finding the tx and a O(N) cost of
finding the spender height where N is the number of outputs in the tx.
But because the number of outputs in a tx is bounded (by block size)
this is constant time in the number of transactions.
This works out much faster than the spends table, and without the
storage cost or complexity disadvantages. It also scales with
available hardware, as the memory mapped files become in-memory hash
tables. For low memory machines we found it was important to implement
an opaque UTXO cache to limit paging, but for higher end systems zero
cache is optimal.

I don't follow this part, maybe you could clarify. A spends index
grows with the size of the spend set (forever) as it cannot be pruned,
which certainly exceeds the size of the UTXO set (unless nothing is
spent). The advantage is that you don't have to keep rewriting the
store when you use a spends set (because the store can be append only).
Feel free to message me if you'd like to discuss in more detail, or to
continue on the libbitcoin mailing list (copied).
e
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAEBCAAGBQJY5tFpAAoJEDzYwH8LXOFOcMgH/2mw5iOvUYNwvZ2z0KKTSUOA
Pd8d5mKoWvd94QxhQ+RyTbkEkMhHl75+zcBgRsfUTtZlBIe/Z0+OgVIN6ibEw+WD
w7k3HqgQi9gLgydEelxTAX+z3dJ24n4kCCdKAmZbBuK+Yr/7AViugbEqYemKepku
pRWZZS74MUvrYesc0xPn4Ao3DTzMjjY0K2mkuqV8jlwdfZjlAQX9pTx+iSCuMhkd
HJ8w7s8QnjVnUeOlLe29mZwaFJPyOTLJMqgDE6s2sXacAy5QQbVCatygvDQ8A/wC
ktBnKPFb2lGX3bGKu/KwABegBy/hyec+NP0wFR+0MVivCwTK1+SjeHu5MNOSVlM=
=tfVj
-----END PGP SIGNATURE-----

Eric Voskuil via bitcoin-dev

2017-04-08 22:37:50 UTC

Post by Tomas via bitcoin-dev
Thanks, but I get the impression that the similarity is rather
superficial.

My point was that "Using a storage engine without UTXO-index" has been
done, and may be a useful reference, not that implementation details
are the same.
Below you addressed two points I made regarding the downside of the
original libbitcoin implementation. These were initial learnings that
informed future implementations (also without a UTXO index). These
were not comparisons to your implementation.

Post by Eric Voskuil via bitcoin-dev
(1) higher than necessary storage space requirement due to
storing the indexing data required for correlate the spends, and

Post by Eric Voskuil via bitcoin-dev
2) higher than necessary validation complexity and cost in terms
of computing the spent-ness (including spender height) of an
output.
With the exception of de-linking (not deleted) in the case of
reorgs, the entire store is append only, implemented in a small
set of memory mapped file

I guess this is the key difference. As the spend-tree stores the
spend information in a tree structure, no reorgs are required, and
the resulting code is actually much less complex.

The references to "higher than necessary storage" and "higher than
necessary validation cost" are explicitly relative statements,
comparing earlier and later libbitcoin implementations.

It is not clear to me how you are relating both the storage cost
("Hmm. No. ... Neither impose significant storage requirements.") and
code complexity ("... resulting code is actually much less complex")
of your tx ordering software to my statements. Do you think I am wrong
and libbitcoin v3 is not actually more space and code efficient than
libbitcoin v2?

But given that you have thrown some numbers and ideas out in a request
for feedback, I'm happy to give you some based on several years of
experience working closely with these issues.

First, I remain confused on your comments pertaining to UTXO growth
and network protocol. I followed your conversation with Greg and it
remains unclear to me. From what I understand you have isolated order
(double spend) from script validation. I think we all understand that
script validation requires inputs and outputs while double spend
detection requires correlation of inputs. What I do not understand is
your choice of optimization axis.

Detection of double spend is not useful in isolation. One must also
validate scripts, which requires outputs. I can see that there is an
opportunity to reject blocks (within the same branch) faster by
validating for double spends before validating script. But unconfirmed
transactions do not exist in a branch, and are therefore not truly
conflicting, until they are mined. And even after they are mined
conflicting txs remain potentially valid in other branches. So
rejecting txs due to conflict comes down to a denial of service
policy, which ultimately must be based on fee increment (e.g. RBF).
But fees are based on the amount of the output value that remains
unspent in the transaction. So this in turn requires the retrieval of
outputs.

And yet the remaining scenario of fast rejection of invalid blocks is
not a meaningful optimization. Optimizing for the case where a block
has valid and sufficient PoW and yet is invalid (for double spend) is
counterproductive. And even so, the txs within the invalid block may
be entirely valid independent of the block, so you are back to looking
up their outputs to obtain fees in the case of a double spend or to
validate script otherwise. In all cases you need to get the outputs.

Post by Tomas via bitcoin-dev
Bitcrust simply scans the tree. Although earlier designs used a
skip-list, it turns out that accompanied by a spent-index lagging a
few blocks behind, raw scanning is faster then anything even though
it needs to scan ~5 blocks times ~4000 inputs before reaching the
first spent-index, the actual scan is highly cache efficient and
little more then a "REP SCASQ", reaching sub-microsecond per input
on each core *including* the lookup in the spend index.

I realize that you see the implementation of the ordering validation
as interesting detail, but I find it hard to justify contemplating the
implementation in isolation from the output lookup requirement. And if
one must looking up both outputs and spends for each validation, it
makes more sense to co-locate that data.

Recovering in one step all data necessary to validate a tx has real
advantages over either interleaving queries and validation or
splitting input vs. output validation queries into two steps. It is a
significantly more test-friendly approach, has better performance
characteristics, and simplifies code. I cannot see any reason to
perform the data read for double spend validation in isolation of that
for script validation.

Post by Eric Voskuil via bitcoin-dev
I don't follow this part, maybe you could clarify. A spends
index grows with the size of the spend set (forever) as it cannot
be pruned, which certainly exceeds the size of the UTXO set
(unless nothing is spent). The advantage is that you don't have
to keep rewriting the store when you use a spends set (because
the store can be append only).

My point is, that the spend tree grows per *input* of a
transaction instead of per *output* of a transaction, because this
is what is scanned on order validation.

I think the conversation with Greg resolved my questions in this area.
What I find interesting is the reliance on Core's UTXO store to
implement script validation. This is not, "a storage engine without a
UTXO-index" as it has a dependency on Core's UTXO index.

On the other hand the initial libbitcoin implementation that I
described to you is *actually* a bitcoin store with no UTXO index. The
current implementation is as well, however it is implemented
differently and is much more efficient than the original. How it
compares to your design is not really the point and impossible to
measure until you have production code.

I can say however that your assumptions about the storage (and
performance) superiority of the design, or at least its
implementation, seem unfounded. If you are storing more index data
(5.6gb) than 32 bits per output, you are using more space than
production implementations. As for complexity, I don't think you'll
get any simpler than a loop to populate spend heights from a hash
table and a loop to test their integer values.

Post by Tomas via bitcoin-dev
The spend tree can be pruned because the spend index (~200mb)
catches early spends.
Disregarding the baseload script validation, the peak load order
validation of bitcrust is more negatively effected by a transaction
with many inputs than by a transaction of many outputs.
I encourage you to check out the results at https://bitcrust.org

If by results you are referring to performance numbers, it's very hard
to draw any conclusions without a full benchmark. It's great that if
you are able to boost Core, but from my perspective the numbers aren't
especially compelling.

As for some of the site's comments, these again cause me to question
the optimization choices:

"Blocks can be verified in parallel..."

Despite the site's explanation I cannot think of any reason to ever
validate two blocks at the same time. You would always prioritize the
block with the greatest PoW. Doing otherwise just slows down the net
validation in all but the pathological case where a miner has produced
an *invalid* block with *more* PoW than another valid block which
arrived at the node within the same second. Rejecting a *valid* block
with more PoW in favor of one with *less* "processing" is a hard fork,
so you probably wouldn't want to do that either. But with compact
block validation times approaching 25ms it's hard to justify stopping
a block validation for any reason.

That's not to say parallel block validation difficult to do. If you
can validate one block's full set of inputs in parallel (which is not
novel) doing the same with additional blocks has trivial additional
complexity.

"The storage engine is optimized from ground up for
xthin/compact-block synchronization. This ensures that when the
majority of transactions are already synced, incoming blocks can be
verified at minimal resources using order-validation only."

There are two distinct considerations here. One is pre-validation of
txs and the other is compact announcements. Just to be clear, the
former does not require the latter. Libbitcoin for example fully
exploits the former, independent of compactness. With a low min fee
setting and a few peers it is typical for the node to have
pre-validated 100% of non-coinbase txs. Averages at 1 satoshi per byte
are about 99.9%, effectively amortizing all script validation cost. So
this optimization is neither novel nor limited to compactness (which
is about reducing latency).

I am also interested in your previous comments about soft forks. These
are material considerations that Greg touched on but it doesn't sound
like you fully appreciate just yet. When a tx is pre-validated the
rules applied must be the same rules as those of some future block.
Yet a tx can be included in more than one block (different branches).
Across branches and even in one branch, validation rules change, and
can change back. The changes are based on accumulated branch history.
Pre-validation can later become invalidated, and differently in
different branches. And maintaining proper context requires either
storing state that you are apparently not storing, or invalidating
optimizations. Based on your comments you do not seem to be accounting
for this in your storage assumptions or in your results. A recent post
by Greg highlights the complexity and consensus criticality of these
considerations.

By "order-validation only" I believe you are referring to a
determination of whether the txs organized into a candidate block
double spend internal to the block or in the ancestry. Assuming that
one recovers outputs at the same time (and presumably from the same
location) as spender height (which is required both for validating
spends of a coinbase and for determination of whether the spend is
above the fork point), this determination is straightforward. One
simply loops over the spender records and invalidates a tx that has a
spender height not above the fork point (while also validating
coinbase maturity using the same height). A loop over the set of
in-memory spend heights of each output a tx is certainly fast enough
to not be worthy of any further optimization. And as previously
discussed, the population of the spender heights is not even a
material additional cost over obtaining the (necessary) output scripts.

The hash table store that I described can fully navigate the block
tree and transaction DAG, since the stored tx, parent and point hashes
are also natural keys and each link is navigable in constant time. It
is also lock-free, can concurrently write any number of blocks during
initial block download and supports read/write concurrency. It has
successfully indexed and stored the entire blockchain from the P2P
network in 16 minutes (locally). It also stores both confirmed and
unconfirmed transactions in the same store, so there is nothing to
write when a block is confirmed except for the block header/hashes and
updates to spender heights for any output spent by the new block's
txs. It is similarly capable of storage in the block table of weak
chain blocks...

But one thing it does *not* do is maintain spender and fork state for
multiple branches. In other words it is optimized for one long chain,
not multiple long branches. Your approach has a limited (in terms of
double spend identification) optimization for reorganization (i.e. a
change to the strong chain identity). However, applying that
optimization to the full store and supportive of soft forks, as
opposed to just input ordering, is a much larger task than it appears
you have attempted. I know, as I created a design for that approach
and after some time scrapped it. The cost of performing the
reorganization in the above store is low enough and very long reorgs
infrequent enough, for the optimization to be counterproductive. It's
elegant in theory, but in practice it increases storage requirements,
impacts general performance and significantly increases complexity.
Bitcoin's data model pushes one away from a tree design in that it is
always pruning the tree. Having the tree is necessary, but it's not
something to optimize for.

e

Post by Tomas via bitcoin-dev
Regards, Tomas
On Fri, Apr 7, 2017, at 01:38, Eric Voskuil wrote: On 04/06/2017
Hi Tomas,

This is the approach that genjix used in libbitcoin version2. With
the exception of de-linking (not deleted) in the case of reorgs,
the entire store is append only, implemented in a small set of
(1) higher than necessary storage space requirement due to storing
the indexing data required for correlate the spends, and
(2) higher than necessary validation complexity and cost in terms
of computing the spent-ness (including spender height) of an
output.
His implementation used a hash table, so performance-wise it did
quite well and would theoretically outperform a tree, O(1) vs.
O(log2(N)).

Post by Tomas via bitcoin-dev
This allows for much better concurrency, as not only blocks,
but also individual inputs can be verified fully in
parallel.

I was successful in parallelizing input validation (across the
inputs of an unconfirmed tx and across the set of all inputs in a
block) using the v2 store. However, it is not the case that the
spends approach is necessary for concurrency.
To resolve the above two problems the version3 store does not use
a spends table/index. Nor does it store any table of UTXOs. Yet
validation is highly parallelized. Instead of additional indexes
it uses the tx hash table, augmented with 32 bits per output for
spender height. So there is a O(1) cost of finding the tx and a
O(N) cost of finding the spender height where N is the number of
outputs in the tx. But because the number of outputs in a tx is
bounded (by block size) this is constant time in the number of
transactions.
This works out much faster than the spends table, and without the
storage cost or complexity disadvantages. It also scales with
available hardware, as the memory mapped files become in-memory
hash tables. For low memory machines we found it was important to
implement an opaque UTXO cache to limit paging, but for higher end
systems zero cache is optimal.

Post by Tomas via bitcoin-dev
I am sharing this not only to ask for your feedback, but also
to call for a clear separation of protocol and
implementations: As this solution, reversing the costs of
outputs and inputs, seems to have excellent performance
characteristics (as shown in the test results), updates to
the protocol addressing the UTXO growth, might not be worth
considering *protocol improvements* and it might be best to
address these concerns as implementation details.

Tomas via bitcoin-dev

2017-04-08 23:58:02 UTC

Thank you for your elaborate response Eric,

Post by Eric Voskuil via bitcoin-dev
My point was that "Using a storage engine without UTXO-index" has been
done, and may be a useful reference, not that implementation details
are the same.

I haven't dived into libbitcoin V2/V3 enough to fully grasp it and
though your comments help, I still not fully do. I will answer below
what is related to bitcrust itself.

My post wasn't posted to claim innovation; I merely try to explain how
Bitcrust works and why it performs well.

Post by Eric Voskuil via bitcoin-dev
First, I remain confused on your comments pertaining to UTXO growth
and network protocol. I followed your conversation with Greg and it
remains unclear to me. From what I understand you have isolated order
(double spend) from script validation. I think we all understand that
script validation requires inputs and outputs while double spend
detection requires correlation of inputs. What I do not understand is
your choice of optimization axis.
Detection of double spend is not useful in isolation. One must also
validate scripts, which requires outputs. I can see that there is an
opportunity to reject blocks (within the same branch) faster by
validating for double spends before validating script. But unconfirmed
transactions do not exist in a branch, and are therefore not truly
conflicting, until they are mined. And even after they are mined
conflicting txs remain potentially valid in other branches. So
rejecting txs due to conflict comes down to a denial of service
policy, which ultimately must be based on fee increment (e.g. RBF).
But fees are based on the amount of the output value that remains
unspent in the transaction. So this in turn requires the retrieval of
outputs.
And yet the remaining scenario of fast rejection of invalid blocks is
not a meaningful optimization. Optimizing for the case where a block
has valid and sufficient PoW and yet is invalid (for double spend) is
counterproductive. And even so, the txs within the invalid block may
be entirely valid independent of the block, so you are back to looking
up their outputs to obtain fees in the case of a double spend or to
validate script otherwise. In all cases you need to get the outputs.

I realize that you see the implementation of the ordering validation
as interesting detail, but I find it hard to justify contemplating the
implementation in isolation from the output lookup requirement. And if
one must looking up both outputs and spends for each validation, it
makes more sense to co-locate that data.
Recovering in one step all data necessary to validate a tx has real
advantages over either interleaving queries and validation or
splitting input vs. output validation queries into two steps. It is a
significantly more test-friendly approach, has better performance
characteristics, and simplifies code. I cannot see any reason to
perform the data read for double spend validation in isolation of that
for script validation.

You seem to ignore here the difference between base load and peak load.
If Compact blocks/XThin with further optimizations can presync nearly
100% of the transactions, and nodes can do as much as possible when a
transaction comes in, the time spent when a block comes in can be
minimized and a lot more transactions can be handled with the same
resources.

The reason for "splitting" is that for an incoming transaction the
spent-state of the outputs being spent isn't particularly relevant as
you seem to acknowledge. When the block comes in, the actual output data
isn't relevant.

The *only* thing that needs to be checked when a block comes in is the
order, and the spend-tree approach absolves the need to access outputs
here.

As it also absolves the need for reorgs this greatly simplifies the
design. I am not sure why you say that a one-step approach is more
"test-friendly" as this seems to be unrelated.

Post by Eric Voskuil via bitcoin-dev
If by results you are referring to performance numbers, it's very hard
to draw any conclusions without a full benchmark. It's great that if
you are able to boost Core, but from my perspective the numbers aren't
especially compelling.

I fully agree and hopefully do not pretend to hide that my numbers are
premature without a full implementation. I just think they are promising
enough to convince at least myself to move on with this model.

Post by Eric Voskuil via bitcoin-dev
Despite the site's explanation I cannot think of any reason to ever
validate two blocks at the same time. You would always prioritize the
block with the greatest PoW. Doing otherwise just slows down the net
validation in all but the pathological case where a miner has produced
an *invalid* block with *more* PoW than another valid block which
arrived at the node within the same second. Rejecting a *valid* block
with more PoW in favor of one with *less* "processing" is a hard fork,
so you probably wouldn't want to do that either. But with compact
block validation times approaching 25ms it's hard to justify stopping
a block validation for any reason.

I don't get what you are saying. Why pick the greatest PoW of two
competing blocks? If two blocks come in, an implementation is free to
choose whichever block to build on. Choosing so is not a "hardfork".
Parallel validation simply makes it easier to make an optimal choice,
for if two blocks come in, the one that is validated fastest can be
build upon without the risk of validationless mining.

Post by Eric Voskuil via bitcoin-dev
That's not to say parallel block validation difficult to do. If you
can validate one block's full set of inputs in parallel (which is not
novel) doing the same with additional blocks has trivial additional
complexity.

I am not trying to claim novelty here.

Post by Eric Voskuil via bitcoin-dev
I am also interested in your previous comments about soft forks. These
are material considerations that Greg touched on but it doesn't sound
like you fully appreciate just yet. When a tx is pre-validated the
rules applied must be the same rules as those of some future block.
Yet a tx can be included in more than one block (different branches).
Across branches and even in one branch, validation rules change, and
can change back. The changes are based on accumulated branch history.
Pre-validation can later become invalidated, and differently in
different branches. And maintaining proper context requires either
storing state that you are apparently not storing, or invalidating
optimizations. Based on your comments you do not seem to be accounting
for this in your storage assumptions or in your results. A recent post
by Greg highlights the complexity and consensus criticality of these
considerations.

Frankly, I think this is a bit of an exaggeration. Soft forks are
counted on a hand, and I don't think there are many - if any -
transactions in the current chain that have changed compliance based on
height. This makes this a compliance issue and not a performance issue
and the solution I have explained, to add height-based compliance as
meta data of validation seems to
be adequate and safe.

Post by Eric Voskuil via bitcoin-dev
The hash table store that I described can fully navigate the block
tree and transaction DAG, since the stored tx, parent and point hashes
are also natural keys and each link is navigable in constant time. It
is also lock-free, can concurrently write any number of blocks during
initial block download and supports read/write concurrency. It has
successfully indexed and stored the entire blockchain from the P2P
network in 16 minutes (locally). It also stores both confirmed and
unconfirmed transactions in the same store, so there is nothing to
write when a block is confirmed except for the block header/hashes and
updates to spender heights for any output spent by the new block's
txs. It is similarly capable of storage in the block table of weak
chain blocks...

I think I get the gist of your approach and it sounds very interesting
and I will definitely dive in deeper.

It also seems sufficiently different from Bitcrust to merit competing on
(eventual) results instead of the complicated theory alone.

Best,
Tomas

Eric Voskuil via bitcoin-dev

2017-04-11 01:44:57 UTC

Post by Tomas via bitcoin-dev
You seem to ignore here the difference between base load and peak
load. If Compact blocks/XThin with further optimizations can
presync nearly 100% of the transactions, and nodes can do as much
as possible when a transaction comes in, the time spent when a
block comes in can be minimized and a lot more transactions can be
handled with the same resources.

Maybe it's an issue of terminology. I have never used the terms
base/peak load. However I've been trying to get across, poorly I
suppose, that this is actually implemented in libbitcoin. I generally
refer to it as tx pre-validation. I've also tried to relate that you
are unnecessarily relating pre-validation to compactness. These are
unrelated ideas and better considered independently. One can get
nearly all of the benefit of pre-validation while still receiving
blocks (vs. compact blocks). The advantage of compactness is reduced
latency of the block announcement. The reason for pre-validation is
amortization of the validation and/or storage cost of a block.

Post by Tomas via bitcoin-dev
The reason for "splitting" is that for an incoming transaction the
spent-state of the outputs being spent isn't particularly
relevant as you seem to acknowledge. When the block comes in, the
actual output data isn't relevant.

As I understand it you would split tx inputs and outputs and send them
independently, and that you intend this to be a P2P network
optimization - not a consensus rule change. So my comments are based
on those inferences. If we are talking about consensus changes this
conversation will end up in an entirely different place.

I don't agree with the input/output relevance statements above. When a
tx is announced the entire tx is relevant. It cannot be validated as
outputs only. If it cannot be validated it cannot be stored by the
node. Validating the outputs only would require the node store invalid
transactions.

I do accept that a double-spend detection is not an optimal criteria
by which to discard a tx. One also needs fee information. But without
double-spend knowledge the node has no rational way to defend itself
against an infinity of transactions that spend the minimal fee but
also have conflicting inputs (i.e. risking the fee only once). So tx
(pool) validation requires double-spend knowledge and at least a
summary from outputs.

Post by Tomas via bitcoin-dev
The *only* thing that needs to be checked when a block comes in is
the order, and the spend-tree approach absolves the need to access
outputs here.

Inputs that are already valid against prevouts remain valid assuming
consensus rules have not changed. But any input that spends a coinbase
must be validated for prevout height once there is a block context for
validation. Additionally the set of txs must be validated for total
size, sigops, and fee claim. So it's not true that conflict detection
alone is sufficient. Yet one can cache a tx's size, sigops, fee and
minimum height in a graph so that when a block appears that contains
that tx the input validation can be skipped.

Ignoring the (actual) requirement for the full tx on the pool
validation, the required "order" validation at (compact or other)
block arrival basically consists of traversing each tx, ensuring none
are confirmed in a block below the fork point; traversing each each of
its confirmed inputs, ensuring that none are spent in a block below
the fork point; and ensuring the block's set of transactions do not
contain missing inputs and do not double spend internal to the block.

This and the above-mentioned other required per-transaction block
validation data can be cached to an in-memory structure as a potential
optimization over navigating the store, and as you say, does not
therefore require the actual outputs (script/value). But the original
issue of needing full transactions for independent transaction
validation remains.

Post by Tomas via bitcoin-dev
As it also absolves the need for reorgs this greatly simplifies the
design.

A reorg is conceptual and cannot be engineered out. What you are
referring to is a restructuring of stored information as a consequence
of a reorg. I don't see this as related to the above. The ability to
perform reorganization via a branch pointer swap is based not on the
order or factoring of validation but instead on the amount of
information stored. It requires more information to maintain multiple
branches.

Transactions have confirmation states, validation contexts and spender
heights for potentially each branch of an unbounded number of
branches. It is this requirement to maintain that state for each
branch that makes this design goal a very costly trade-off of space
and complexity for reorg speed. As I mentioned earlier, it's the
optimization for this scenario that I find questionable.

Post by Tomas via bitcoin-dev
I am not sure why you say that a one-step approach is more
"test-friendly" as this seems to be unrelated.

Full separation of concerns allows all validation to be performed in
isolation from the store. As such validation state can be faked and
provided to a tx, block or chain, for the purpose of test. Validation
that interacts with a complex store during validation is harder to
fake and tests can be hard to verify.

It's not really the "one-step" approach that make this possible. In
fact that's not an accurate description. Validation and storage of txs
and blocks consists of four steps:

(1) context free
(2) contextual (chain-based)
(3) expensive (script eval)
(4) storage and notification

So we have:

tx.check()
tx.accept(state)
tx.connect(state)
chain.organize(tx)

block.check()
block.accept(state)
block.connect(state)
chain.organize(block)

...where "chain" is the store, from which "state" is derived. The
state for an unconfirmed tx is based on the presumption that the tx
would be mined in the next block. If that is not the case then its
pre-validation can become invalidated. So from my perspective, this
discussion is all about populating state. Anything that cannot be
placed into that pattern would complicate both the conceptual model
and testing. We've also seen that this isolation also has performance
advantages, as it facilitates optimizations that are otherwise
challenging.

Post by Eric Voskuil via bitcoin-dev
Despite the site's explanation I cannot think of any reason to
ever validate two blocks at the same time. You would always
prioritize the block with the greatest PoW. Doing otherwise just
slows down the net validation in all but the pathological case
where a miner has produced an *invalid* block with *more* PoW
than another valid block which arrived at the node within the
same second. Rejecting a *valid* block with more PoW in favor of
one with *less* "processing" is a hard fork, so you probably
wouldn't want to do that either. But with compact block
validation times approaching 25ms it's hard to justify stopping
a block validation for any reason.

I don't get what you are saying. Why pick the greatest PoW of two
competing blocks?

Because choosing the lesser amount of work is non-consensus behavior.
Under the same circumstances (i.e. having seen the same set of blocks)
two nodes will disagree on whether there is one confirmation or no
confirmations for a given tx. This disagreement will persist (i.e. why
take the weaker block only to turn around and replace it with the
stronger block that arrives a few seconds or minutes later). It stands
to reason that if one rejects a stronger block under a race condition,
one would reorg out a stronger block when a weaker block arrives a
little after the stronger block. Does this "optimization" then apply
to chains of blocks too?

Post by Tomas via bitcoin-dev
If two blocks come in, an implementation is free to choose
whichever block to build on.

Implementations are free to choose no blocks. That's not really the issu
e.

Post by Tomas via bitcoin-dev
Choosing so is not a "hardfork".

Accepting a block that all previous implementations would have
rejected under the same circumstance could be considered a hard fork,
but you may be right.

Yet the classification is not essential to my point. Nor is any
material change required to validate blocks in parallel. We can do it
using current design, but it doesn't make sense to do so.

Post by Tomas via bitcoin-dev
Parallel validation simply makes it easier to make an optimal
choice, for if two blocks come in, the one that is validated
fastest can be build upon without the risk of validationless
mining.

This is not an optimization, since it should always be optimal to
validate blocks independently. Performing multiple together inherently
slows both of them. And the advantage to not validating *either* would
remain.

Post by Eric Voskuil via bitcoin-dev
I am also interested in your previous comments about soft forks.
These are material considerations that Greg touched on but it
doesn't sound like you fully appreciate just yet. When a tx is
pre-validated the rules applied must be the same rules as those
of some future block. Yet a tx can be included in more than one
block (different branches). Across branches and even in one
branch, validation rules change, and can change back. The
changes are based on accumulated branch history. Pre-validation
can later become invalidated, and differently in different
branches. And maintaining proper context requires either storing
state that you are apparently not storing, or invalidating
optimizations. Based on your comments you do not seem to be
accounting for this in your storage assumptions or in your
results. A recent post by Greg highlights the complexity and
consensus criticality of these considerations.

Hope is a bug.

Post by Tomas via bitcoin-dev
This makes this a compliance issue and not a performance issue

You cannot have a useful performance measure without full compliance.

Post by Tomas via bitcoin-dev
and the solution I have explained, to add height-based compliance
as meta data of validation seems to be adequate and safe.

If you intend this to be useful it has to help build the chain, not
just rely on hardwiring checkpoints once rule changes are presumed to
be buried deeply enough to do so (as the result of other implementations
).

I understand this approach, it was ours at one time. There is a
significant difference, and your design is to some degree based on a
failure to fully consider this. I encourage you to not assume any
consensus-related detail is too small.

Post by Eric Voskuil via bitcoin-dev
The hash table store that I described can fully navigate the
block tree and transaction DAG, since the stored tx, parent and
point hashes are also natural keys and each link is navigable in
constant time. It is also lock-free, can concurrently write any
number of blocks during initial block download and supports
read/write concurrency. It has successfully indexed and stored
the entire blockchain from the P2P network in 16 minutes
(locally). It also stores both confirmed and unconfirmed
transactions in the same store, so there is nothing to write
when a block is confirmed except for the block header/hashes and
updates to spender heights for any output spent by the new
block's txs. It is similarly capable of storage in the block
table of weak chain blocks...

I think I get the gist of your approach and it sounds very
interesting and I will definitely dive in deeper.

It's worth noting that many of your stated objectives, including
modularity, developer platform, store isolation, consensus rule
isolation (including optional use of libbitcoinconsensus) are implemente
d.

It seems like you are doing some good work and it's not my intent to
discourage that. Libbitcoin is open source, I don't get paid and I'm
not selling anything. But if you are going down this path you should
be aware of it and may benefit from our successes as well as some of
the other stuff :). And hopefully we can get the benefit of your
insights as well.

e

Tomas via bitcoin-dev

2017-04-11 08:43:30 UTC

Post by Eric Voskuil via bitcoin-dev
As I understand it you would split tx inputs and outputs and send them
independently, and that you intend this to be a P2P network
optimization - not a consensus rule change. So my comments are based
on those inferences. If we are talking about consensus changes this
conversation will end up in an entirely different place.
I don't agree with the input/output relevance statements above. When a
tx is announced the entire tx is relevant. It cannot be validated as
outputs only. If it cannot be validated it cannot be stored by the
node. Validating the outputs only would require the node store invalid
transactions.

Splitting transactions only happens *on storage* and is just a minor
optimization compared to storing them in full. (actually a very recent
change with only marginally better results). This is simply because the
output scripts are read on script validation, and storing the outputs of
the transaction separately ensures better spatial locality of reference
(the inputs are just "in the way"). This is not relevant when using a
UTXO-index, because the outputs are then directly stored in the index,
where bitcrust has to read them from the transaction data.

It is not my intention to send them independently.

Post by Eric Voskuil via bitcoin-dev
I do accept that a double-spend detection is not an optimal criteria
by which to discard a tx. One also needs fee information. But without
double-spend knowledge the node has no rational way to defend itself
against an infinity of transactions that spend the minimal fee but
also have conflicting inputs (i.e. risking the fee only once). So tx
(pool) validation requires double-spend knowledge and at least a
summary from outputs.

Double spent information is still available to the network node and
could still be used for DoS protection, although I do believe
alternatives may exist.

Post by Eric Voskuil via bitcoin-dev
A reorg is conceptual and cannot be engineered out. What you are
referring to is a restructuring of stored information as a consequence
of a reorg. I don't see this as related to the above. The ability to
perform reorganization via a branch pointer swap is based not on the
order or factoring of validation but instead on the amount of
information stored. It requires more information to maintain multiple
branches.
Transactions have confirmation states, validation contexts and spender
heights for potentially each branch of an unbounded number of
branches. It is this requirement to maintain that state for each
branch that makes this design goal a very costly trade-off of space
and complexity for reorg speed. As I mentioned earlier, it's the
optimization for this scenario that I find questionable.

Sure, we can still call switching tips a "reorg". And it is indeed a
trade off as orphan blocks are stored, but a block in the spend tree
takes only ~12kb and contains the required state information.

I believe this trade off reduced complexity. For the earlier tree this
could be pruned.

Post by Eric Voskuil via bitcoin-dev
Because choosing the lesser amount of work is non-consensus behavior.
Under the same circumstances (i.e. having seen the same set of blocks)
two nodes will disagree on whether there is one confirmation or no
confirmations for a given tx. This disagreement will persist (i.e. why
take the weaker block only to turn around and replace it with the
stronger block that arrives a few seconds or minutes later). It stands
to reason that if one rejects a stronger block under a race condition,
one would reorg out a stronger block when a weaker block arrives a
little after the stronger block. Does this "optimization" then apply
to chains of blocks too?

The blockchain is - by design - only eventually consistent across nodes.
Even if nodes would use the same "tip-selection" rules, you cannot rely
on all blocks being propagated and hence each transaction having the
same number of confirmations across all nodes.

As a simpler example, if two miners both mine a block at approximately
the same time and send it to each other, then surely they would want to
continue mining on their own block. Otherwise they would be throwing
away their own reward.

And yes, this can also happen over multiple blocks, but the chances of
consistency are vastly increased with each confirmation.

Post by Eric Voskuil via bitcoin-dev
Accepting a block that all previous implementations would have
rejected under the same circumstance could be considered a hard fork,
but you may be right.

I am not talking about rejecting blocks, I am only talking choosing on
which tip to mine.

Post by Tomas via bitcoin-dev
Frankly, I think this is a bit of an exaggeration. Soft forks are
counted on a hand, and I don't think there are many - if any -
transactions in the current chain that have changed compliance
based on height.

Hope is a bug.
If you intend this to be useful it has to help build the chain, not
just rely on hardwiring checkpoints once rule changes are presumed to
be buried deeply enough to do so (as the result of other implementations
).
I understand this approach, it was ours at one time. There is a
significant difference, and your design is to some degree based on a
failure to fully consider this. I encourage you to not assume any
consensus-related detail is too small.

I am not failing to consider this, and I don't consider this too small .
But ensuring contextual transaction validity by "validate => valid with
rules X,Y,Z" and then checking the active rules (softfork activation) on
order validation, will give logically the same results as "validate with
X,Y,Z => valid". This is not "hardwiring checkpoints" at all.

Post by Eric Voskuil via bitcoin-dev
You cannot have a useful performance measure without full compliance.

I agree that the results are preliminary and I will post more if the
product reaches later stages.

Post by Eric Voskuil via bitcoin-dev
It's worth noting that many of your stated objectives, including
modularity, developer platform, store isolation, consensus rule
isolation (including optional use of libbitcoinconsensus) are implemente
d.
It seems like you are doing some good work and it's not my intent to
discourage that. Libbitcoin is open source, I don't get paid and I'm
not selling anything. But if you are going down this path you should
be aware of it and may benefit from our successes as well as some of
the other stuff :). And hopefully we can get the benefit of your
insights as well.

Thank you, I will definitely further dive into libbitcoin, and see what
insights I can use for Bitcrust.

Tomas

Eric Voskuil via bitcoin-dev

2017-04-11 09:41:34 UTC

Post by Tomas via bitcoin-dev
Splitting transactions only happens *on storage* and is just a
minor optimization compared to storing them in full.

Post by Tomas via bitcoin-dev
Sure, we can still call switching tips a "reorg". And it is indeed
a trade off as orphan blocks are stored, but a block in the spend
tree takes only ~12kb and contains the required state information.

It's not the headers/tx-hashes of the blocks that I'm referring to, it
is the confirmation and spend information relative to all txs and all
outputs for each branch. This reverse navigation (i.e. utxo
information) is essential, must be persistent and is branch-relative.

Post by Tomas via bitcoin-dev
The blockchain is - by design - only eventually consistent across
nodes. Even if nodes would use the same "tip-selection" rules, you
cannot rely on all blocks being propagated and hence each
transaction having the same number of confirmations across all
nodes.
As a simpler example, if two miners both mine a block at
approximately the same time and send it to each other, then surely
they would want to continue mining on their own block. Otherwise
they would be throwing away their own reward.

Post by Tomas via bitcoin-dev
And yes, this can also happen over multiple blocks, but the chances
of consistency are vastly increased with each confirmation.

Consistency is reached, despite seeing things at different times,
because people use the same rules. If the economy ran on arbitrary
block preference consistency would be elusive.

Post by Tomas via bitcoin-dev
I am not talking about rejecting blocks, I am only talking choosing
on which tip to mine.

This line of reasoning has me a bit baffled. Yet as I said, it's not
important to the question at hand. It is not likely to be optimal to
validate concurrently even if you consider selection of a weaker block
advantageous.

Post by Eric Voskuil via bitcoin-dev
If you intend this to be useful it has to help build the chain,
not just rely on hardwiring checkpoints once rule changes are
presumed to be buried deeply enough to do so (as the result of
other implementations ).
I understand this approach, it was ours at one time. There is a
significant difference, and your design is to some degree based
on a failure to fully consider this. I encourage you to not
assume any consensus-related detail is too small.

I am not failing to consider this, and I don't consider this too
small . But ensuring contextual transaction validity by "validate
=> valid with rules X,Y,Z" and then checking the active rules
(softfork activation) on order validation, will give logically the
same results as "validate with X,Y,Z => valid". This is not
"hardwiring checkpoints" at all.

Storing the validation flags with each tx is exactly what libbitcoin
does (otherwise pre-validation would be infeasible). But that was not

...height-based compliance as meta data of validation seems to
be

adequate and safe.

I read this as encoding the height at which a fork historically
activated. If you intend to track activation for each branch that will
not be "height-based" it will be history based.

e

Tomas via bitcoin-dev

2017-04-11 10:04:01 UTC

Post by Eric Voskuil via bitcoin-dev
It's not the headers/tx-hashes of the blocks that I'm referring to, it
is the confirmation and spend information relative to all txs and all
outputs for each branch. This reverse navigation (i.e. utxo
information) is essential, must be persistent and is branch-relative.

That is exactly what is stored in the spend-tree.

Post by Tomas via bitcoin-dev
As a simpler example, if two miners both mine a block at
approximately the same time and send it to each other, then surely
they would want to continue mining on their own block. Otherwise
they would be throwing away their own reward.

That's not your concurrent validation scenario. In the scenario you
described, the person chooses the weaker block of two that require
validation because it's better somehow, not because it's his own
(which does not require validation).
Consistency is reached, despite seeing things at different times,
because people use the same rules. If the economy ran on arbitrary
block preference consistency would be elusive.

No but my example shows that it is up to the miner to choose which tip
to work on. This is not using different rules, it is just optimizing its
income. This means that the economy *does* run on arbitrary "block
preference", even if it is not running on arbitrary rules.

If two blocks are competing, a miner could optimize its decision which
to mine on, not just on whether one of the blocks is his own, but also
on fees, or on excessive validation costs.

Post by Eric Voskuil via bitcoin-dev
I read this as encoding the height at which a fork historically
activated. If you intend to track activation for each branch that will
not be "height-based" it will be history based.

I understand "height-based" was not the right wording, as it is of
course branch-specific. Per tip ruleset metadata, must be matched with
per-transaction ruleset metadata.

Tomas

Gregory Maxwell via bitcoin-dev

2017-04-07 01:09:26 UTC

Bitcrust separates script validation (base load, when transaction come
in) from order validation (peak load, when blocks come in).

How do you deal with validity rules changing based on block height?

For script validation it would obviously need the ~2GB (or I think
~1.5GB) of outputs needed to validate these.

So it sounds like to work the software still needs an analog of a
(U)TXO database? I am confused by the earlier comments about thinking
the the resource consumption of the (U)TXO database is not a
consideration in your design.

For order validation it
needs ~200mb or the spent-index (for bit-lookups) and I would guess
roughly ~500mb of the spent-tree (for scanning), though I don't think
the 5.7GB full spend tree isn't worth pruning anytime soon.

If you get a transaction claiming to spend 0xDEADBEEFDEADBEEF, an
output that never existed how does your spent index reject this spend?

Tomas via bitcoin-dev

2017-04-07 01:29:07 UTC

Post by Gregory Maxwell via bitcoin-dev
How do you deal with validity rules changing based on block height?

I expected that one :). Just like the 100 blocks coinbase rule, changes
by softforks need to be added as metadata to the transaction-index, but
this is not yet in place.

As for the script validation itself using libbitcoinconsensus, this is a
bit hairy as this expects the rules to be known. Luckily, simply
gradually retrying using "lower" rules won't hurt performance, as
transaction that mismatch newer rules are rare.

Generally, bitcrust would appreciate a "is valid with X rules" instead
of a "validate with X rules" approach.

Post by Gregory Maxwell via bitcoin-dev
So it sounds like to work the software still needs an analog of a
(U)TXO database? I am confused by the earlier comments about thinking
the the resource consumption of the (U)TXO database is not a
consideration in your design.

No, but transactional access is. Bitcrust just uses a
*transaction-index*, where outputs can be looked up regardless of being
spent. As the concept of being "spent" depends on the branch, script
validation ignores this and simply looks up the outputs.

Transactions are split in two parts for better locality of reference
when accessing outputs.

The transaction index only looks similar to an "UTXO-index" after full
pruning.

Post by Gregory Maxwell via bitcoin-dev
If you get a transaction claiming to spend 0xDEADBEEFDEADBEEF, an
output that never existed how does your spent index reject this spend

The spend-tree is scanned until either DEADBEAF is found (=ERR double
spent), the transaction of DEADBEEF is found (=all ok!), or the start
of the chain is reached (=ERR spending unknown output!)

To prevent actually having to scan to genesis, the spent-index "catches"
the search after a few blocks and performs the same lookup (positive for
tx, negative for output) on a simple bit index.

Tom Harding via bitcoin-dev

2017-04-07 18:52:20 UTC

On Apr 6, 2017 6:31 PM, "Tomas via bitcoin-dev" <
bitcoin-***@lists.linuxfoundation.org> wrote:

Bitcrust just uses a *transaction-index*, where outputs can be looked up
regardless of being spent.

A network in which many nodes maintain a transaction index also enables a
class of light node applications that ask peers to prove existence and
spentness of TXO's.

Gregory Maxwell via bitcoin-dev

2017-04-07 19:42:22 UTC

On Fri, Apr 7, 2017 at 6:52 PM, Tom Harding via bitcoin-dev

Post by Tom Harding via bitcoin-dev
A network in which many nodes maintain a transaction index also enables a
class of light node applications that ask peers to prove existence and
spentness of TXO's.

Only with the additional commitment structure such as those proposed
by Peter Todd in his stxo/txo commitment designs, e.g.
https://petertodd.org/2016/delayed-txo-commitments

Tom Harding via bitcoin-dev

2017-04-08 18:27:19 UTC

On Apr 7, 2017 12:42, "Gregory Maxwell" <***@xiph.org> wrote:

On Fri, Apr 7, 2017 at 6:52 PM, Tom Harding via bitcoin-dev

Only with the additional commitment structure such as those proposed
by Peter Todd in his stxo/txo commitment designs, e.g.
https://petertodd.org/2016/delayed-txo-commitments

Light nodes are improved by detecting invalid transactions, even before
they are mined.

Tomas via bitcoin-dev

2017-04-08 19:23:40 UTC

Post by Tom Harding via bitcoin-dev

Post by Gregory Maxwell via bitcoin-dev
On Fri, Apr 7, 2017 at 6:52 PM, Tom Harding via bitcoin-dev

Only with the additional commitment structure such as those proposed
by Peter Todd in his stxo/txo commitment designs, e.g.
https://petertodd.org/2016/delayed-txo-commitments

Light nodes are improved by detecting invalid transactions, even
before they are mined.
_________________________________________________

I am not quite sure why you think this approach would help in this
regard. I may be missing part of how Core works here, but Bitcrust's
txindex is merely used to lookup transactions from hashes and currently,
and seems to fulfil the same role as Core's -txindex mode.

This can be pruned, and in the future auto-pruned as the "flat files"
used as base for all data allow for concurrent pruning. But unlike Core,
it is always needed as without UTXO index, it is needed to find outputs
during base load validation.

Marcos mayorga via bitcoin-dev

2017-04-07 07:55:48 UTC

Hi Tomas,

I've read it and think it is an excellent work, I'd like to see it
integrated into bitcoin-core as a 'kernel module'.

I see there are a lot of proof of concepts out there, IMO every one
deserve a room in the bitcoin client as a selectable feature, to make the
software more flexible and less dictatorial, an user could easily select
which features she wants to run.

Best regards,
Marcos

Post by Tomas via bitcoin-dev
I have been working on a bitcoin implementation that uses a different
approach to indexing for verifying the order of transactions. Instead of
using an index of unspent outputs, double spends are verified by using a
spend-tree where spends are scanned against spent outputs instead of
unspent outputs.
This allows for much better concurrency, as not only blocks, but also
individual inputs can be verified fully in parallel.
I explain the approach at https://bitcrust.org, source code is available
at https://github.com/tomasvdw/bitcrust
I am sharing this not only to ask for your feedback, but also to call
for a clear separation of protocol and implementations: As this
solution, reversing the costs of outputs and inputs, seems to have
excellent performance characteristics (as shown in the test results),
updates to the protocol addressing the UTXO growth, might not be worth
considering *protocol improvements* and it might be best to address
these concerns as implementation details.
Kind regards,
Tomas van der Wansem
Bitcrust
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Tomas via bitcoin-dev

2017-04-07 08:47:56 UTC

Thank you Marcos,

Though written in Rust, bitcrust-db is definitely usable as pluggable
module as its interface will be roughly some queries, add_tx and
add_block with blobs and flags. (Bitcrust internally uses a
deserialize-only model, keeping references to the blobs with the parsed
data).

However, from Core's side I believe network and storage are currently
rather tightly coupled, which will make this far from trivial.

Regardless, I am also hoping (with funding & a team) to build a Bitcrust
networking component as well to bring a strong competitor to the market.

best,
Tomas

Post by Eric Voskuil via bitcoin-dev
Hi Tomas,
I've read it and think it is an excellent work, I'd like to see it
integrated into bitcoin-core as a 'kernel module'.
I see there are a lot of proof of concepts out there, IMO every one
deserve a room in the bitcoin client as a selectable feature, to make the
software more flexible and less dictatorial, an user could easily select
which features she wants to run.
Best regards,
Marcos

Greg Sanders via bitcoin-dev

2017-04-07 14:14:31 UTC

Interesting work.

I was wondering if you could tell us what specs for the machine being used
as preliminary benchmark is here: https://bitcrust.org/results ?

I'd be interested to also see comparisons with 0.14 which has some
improvements for script validation with more cores.

On Fri, Apr 7, 2017 at 4:47 AM, Tomas via bitcoin-dev <

Post by Tomas via bitcoin-dev
Thank you Marcos,
Though written in Rust, bitcrust-db is definitely usable as pluggable
module as its interface will be roughly some queries, add_tx and
add_block with blobs and flags. (Bitcrust internally uses a
deserialize-only model, keeping references to the blobs with the parsed
data).
However, from Core's side I believe network and storage are currently
rather tightly coupled, which will make this far from trivial.
Regardless, I am also hoping (with funding & a team) to build a Bitcrust
networking component as well to bring a strong competitor to the market.
best,
Tomas

Post by Tomas via bitcoin-dev
I have been working on a bitcoin implementation that uses a different
approach to indexing for verifying the order of transactions. Instead

Post by Tomas via bitcoin-dev
using an index of unspent outputs, double spends are verified by using

Post by Tomas via bitcoin-dev
spend-tree where spends are scanned against spent outputs instead of
unspent outputs.
This allows for much better concurrency, as not only blocks, but also
individual inputs can be verified fully in parallel.
I explain the approach at https://bitcrust.org, source code is

available

Post by Tomas via bitcoin-dev
at https://github.com/tomasvdw/bitcrust
I am sharing this not only to ask for your feedback, but also to call
for a clear separation of protocol and implementations: As this
solution, reversing the costs of outputs and inputs, seems to have
excellent performance characteristics (as shown in the test results),
updates to the protocol addressing the UTXO growth, might not be worth
considering *protocol improvements* and it might be best to address
these concerns as implementation details.
Kind regards,
Tomas van der Wansem
Bitcrust
_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Tomas via bitcoin-dev

2017-04-07 16:02:35 UTC

Thank you,

The benches are running in Google Cloud Engine; currently on 8 vCPU
32gb, but I tend to switch hardware regularly.

Roughly, the results are better for Bitcrust with high end hardware and
the difference for total block validations is mostly diminished at 2
vCPU, 7,5 gb.

Note that the spend-tree optimization primarily aims to improve peak
load order validation; when a block with pre-synced transactions comes
in, but this is tricky to accurately bench with Core using this simple
method of comparison by logs.

I will upgrade to, and show the results against 0.14 in the next weeks.

Best,

Tomas

Post by Greg Sanders via bitcoin-dev
Interesting work.
I was wondering if you could tellank us what specs for the machine
https://bitcrust.org/results ?
I'd be interested to also see comparisons with 0.14 which has some
improvements for script validation with more cores.
On Fri, Apr 7, 2017 at 4:47 AM, Tomas via bitcoin-dev <bitcoin-

Post by Tomas via bitcoin-dev
Thank you Marcos,
Though written in Rust, bitcrust-db is definitely usable as
pluggable
module as its interface will be roughly some queries, add_tx and
add_block with blobs and flags. (Bitcrust internally uses a
deserialize-only model, keeping references to the blobs with the parsed
data).
However, from Core's side I believe network and storage are
currently
rather tightly coupled, which will make this far from trivial.
Regardless, I am also hoping (with funding & a team) to build a Bitcrust
networking component as well to bring a strong competitor to the market.
best,
Tomas

Post by Eric Voskuil via bitcoin-dev
Hi Tomas,
I've read it and think it is an excellent work, I'd like to see it
integrated into bitcoin-core as a 'kernel module'.
I see there are a lot of proof of concepts out there, IMO
every one
deserve a room in the bitcoin client as a selectable feature, to make the
software more flexible and less dictatorial, an user could easily select
which features she wants to run.
Best regards,
Marcos

_______________________________________________
bitcoin-dev mailing list
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev

Gregory Maxwell via bitcoin-dev

2017-04-07 18:18:32 UTC

On Thu, Apr 6, 2017 at 10:12 PM, Tomas via bitcoin-dev

Post by Tomas via bitcoin-dev
As this
solution, reversing the costs of outputs and inputs, seems to have
excellent performance characteristics (as shown in the test results),
updates to the protocol addressing the UTXO growth, might not be worth
considering *protocol improvements*

I'm still lost on this-- AFAICT your proposals long term resource
requirements are directly proportional to the amount of unspent output
data, which grows over time at some fraction of the total transaction
volume (plus the rate of spending which is more or less a constant).

Can you help out my understanding here?

Bram Cohen via bitcoin-dev

2017-04-07 18:39:18 UTC

Expanding on this question a bit, it's optimized for parallel access, but
hard drive access isn't parallel and memory accesses are very fast, so
shouldn't the target of optimization be about cramming as much as possible
in memory and minimizing disk accesses?

On Fri, Apr 7, 2017 at 11:18 AM, Gregory Maxwell via bitcoin-dev <

Post by Gregory Maxwell via bitcoin-dev
On Thu, Apr 6, 2017 at 10:12 PM, Tomas via bitcoin-dev

Eric Voskuil via bitcoin-dev

2017-04-07 19:55:58 UTC

Post by Bram Cohen via bitcoin-dev
Expanding on this question a bit, it's optimized for parallel
access, but hard drive access isn't parallel and memory accesses
are very fast, so shouldn't the target of optimization be about
cramming as much as possible in memory and minimizing disk
accesses?

While this may seem to be the case it is not generally optimal. The
question is overly broad as one may or may not be optimizing for any
combination of:

startup time (first usability)
warm-up time (priming)
shutdown time (flush)
fault tolerance (hard shutdown survivability)
top block validation (read speed)
full chain validation (read/write speed)
RAM consumption
Disk consumption
Query response
Servers (big RAM)
Desktops (small RAM)
Mining (fast validation)
Wallets (background performance)
SSD vs. HDD

But even limiting the question to input validation, all of these
considerations (at least) are present.

Ideally one wants the simplest implementation that is optimal under
all considerations. While this may be a unicorn, it is possible to
achieve a simple implementation (relative to alternatives) that allows
for the trade-offs necessary to be managed through configuration (by
the user and/or implementation).

Shoving the entire data set into RAM has the obvious problem of
limited RAM. Eventually the OS will be paging more of the data back to
disk (as virtual RAM). In other words this does not scale, as a change
in hardware disproportionately impacts performance. Ideally one wants
the trade between "disk" and "memory" to be made by the underlying
platform, as that is its purpose. Creating one data structure for disk
and another for memory not only increases complexity, but denies the
platform visibility into this trade-off. As such the platform
eventually ends up working directly against the optimization.

An on-disk structure that is not mapped into memory by the application
allows the operating system to maintain as much or as little state in
memory as it considers optimal, given the other tasks that the user
has given it. In the case of memory mapped files (which are optimized
by all operating systems as central to their virtual memory systems)
it is possible for everything from zero to the full store to be memory
resident.

Optimization for lower memory platforms then becomes a process of
reducing the need for paging. This is the purpose of a cache. The seam
between disk and memory can be filled quite nicely by a small amount
of cache. On high RAM systems any cache is actually a de-optimization
but on low RAM systems it can prevent excessive paging. This is
directly analogous to a CPU cache. There are clear optimal points in
terms of cache size, and the implementation and management of such a
cache can and should be internal to a store. Of course a cache cannot
provide perfect scale all the way to zero RAM, but it scales quite
well for actual systems.

While a particular drive may not support parallel operations one
should not assume that a disk-based store does not benefit from
parallelism. Simply refer to the model described above and you will
see that with enough memory the entire blockchain can be
memory-resident, and for high performance operations a fraction of
that is sufficient for a high degree of parallelism.

In practice a cache of about 10k transactions worth of outputs is
optimal for 8GB RAM. This requires just a few blocks for warm-up,
which can be primed in inconsequential time at startup. Fault
tolerance can be managed by flushing after all writes, which also
reduces shutdown time to zero. For higher performance systems,
flushing can be disabled entirely, increasing shutdown time but also
dramatically increasing write performance. Given that the blockchain
is a cache, this is a very reasonable trade-off in some scenarios. The
model works just as well with HDD as SSD, although certainly SSD
performs better overall.

e

Tomas via bitcoin-dev

2017-04-07 21:44:43 UTC

Hi Eric,

Post by Eric Voskuil via bitcoin-dev
Optimization for lower memory platforms then becomes a process of
reducing the need for paging. This is the purpose of a cache. The seam
between disk and memory can be filled quite nicely by a small amount
of cache. On high RAM systems any cache is actually a de-optimization
but on low RAM systems it can prevent excessive paging. This is
directly analogous to a CPU cache.

I am not entirely sure I agree with that, or understand it correctly.

If -for example - the data of some application is a set of records
which can be sorted from least frequently used to most frequently used
then doing just that sort will beat any application-layer cache.
Regardless of size of data and size of RAM, you simply allow the OS to
use disk caching or memory map caching to work its magic .

In fact, I would argue that an application-layer cache *only* makes
sense if the data model shows a *hard* distinction between often and not
often used data. If usage-frequency is a continuous line, caching is
best left to the OS by focussing on proper spatial and temporal locality
of reference of your data, because the OS has much more information to
make the right decision.

Eric Voskuil via bitcoin-dev

2017-04-07 23:51:08 UTC

Post by Tomas via bitcoin-dev
Hi Eric,

Post by Eric Voskuil via bitcoin-dev
Optimization for lower memory platforms then becomes a process
of reducing the need for paging. This is the purpose of a cache.
The seam between disk and memory can be filled quite nicely by a
small amount of cache. On high RAM systems any cache is actually
a de-optimization but on low RAM systems it can prevent excessive
paging. This is directly analogous to a CPU cache.

It's a reasonable assumption, and given that the no-explicit-cache
implementation is a subset of the optionally-cached implementation,
was of course the initial implementation.

Post by Tomas via bitcoin-dev
In fact, I would argue that an application-layer cache *only*
makes sense if the data model shows a *hard* distinction between
often and not often used data. If usage-frequency is a continuous
line, caching is best left to the OS by focussing on proper spatial
and temporal locality of reference of your data, because the OS has
much more information to make the right decision.

In practice this is not the case. The Bitcoin data model is neither
continuous nor strictly segregated by usage.

It is true that with sufficient RAM a cache is totally
counterproductive. It is also my experience that an independent UTXO
store is not a reasonable/necessary trade of disk space, memory
scalability, and/or code complexity in exchange for speed.

But on lower memory systems a explicit cache is beneficial. The
difference is clearly measurable in production code by simply changing
the cache limit and testing on various configurations.

e

Tomas via bitcoin-dev

2017-04-07 21:14:51 UTC

Post by Bram Cohen via bitcoin-dev

Answering both,

Post by Gregory Maxwell via bitcoin-dev
I'm still lost on this-- AFAICT your proposals long term resource
requirements are directly proportional to the amount of
unspent output
data, which grows over time at some fraction of the total transaction
volume (plus the rate of spending which is more or less a constant).
Can you help out my understanding here?

Expanding on this question a bit, it's optimized for parallel access,
but hard drive access isn't parallel and memory accesses are very
fast, so shouldn't the target of optimization be about cramming as
much as possible in memory and minimizing disk accesses?

The long term *minimal disk storage* requirement, can obviously not be
less then all the unspent outputs. Minimal disk requirements is not
something bitcrust attempts to address.

The storage that is accessed during peak load (block validation with
pre-synced transactions), is minimized as this only needs the
transaction index (to lookup ptrs from hashes), the tip of the spend-
tree and the tip of the spend-index (together to check double
spents/spending non-existing outputs). These not only easily fit in
RAM, but are accessed in a cache efficient way. *These* only grow with
inputs as the spend tree contains one record per input referencing the
output being spent.

Script validation is also not something bitcrust *directly* addresses;
it uses libbitcoinconsensus for the actual validation and lookups to
outputs are mostly similar. They are kept fast by trusting the OS on MRU
caching of transaction-outputs; I don't think that for this part the
UTXO index has much drawbacks,. Bitcrust seems to have a small advantage
due to the awesomeness of Rayon's parallelization and the lock-free data
structures, but a disadvantage in that keeping all spent outputs
decreases spatial locality of reference. Script validation is not the
innovative part.

Tomas

Gregory Maxwell via bitcoin-dev

2017-04-08 00:44:50 UTC

The long term *minimal disk storage* requirement, can obviously not be less
then all the unspent outputs.

Then I think you may want to retract the claim that "As this solution,
reversing the costs of outputs and inputs, [...] updates to the
protocol addressing the UTXO growth, might not be worth considering
*protocol improvements* "

As you note that the output costs still bound the resource
requirements. Short of radical protocol changes like TXO-proofs the
UTXO data remains a driving unavoidable long term resource cost, not
an implementation detail. Implementation optimizations like improving
locality further or keeping spentness in memory do not change this
fact.

The storage that is accessed during peak load (block validation with
pre-synced transactions), is minimized as this only needs the transaction
index (to lookup ptrs from hashes), the tip of the spend-tree and the tip of

Latency related costs in Bitcoin Core also do not depend on the number
of outputs in transactions in a block. When a transaction is handled
it goes into an in-memory buffer and only gets flushed later if isn't
spent before the buffer fills. A block will take more time to
validate with more inputs, same as you observer, but the aggregate
resource usage for users depends significantly on outputs (so, in fact
there is even further misaligned incentives than just the fact that
small outputs have a outsized long term cost).

Tomas via bitcoin-dev

2017-04-08 07:28:48 UTC

Post by Gregory Maxwell via bitcoin-dev
As you note that the output costs still bound the resource
requirements.

Resource cost is not just a measure of storage requirement; data that
needs to be accessed during peak load induce more cost then data only
used during base load or only rarely used.

Post by Gregory Maxwell via bitcoin-dev
Latency related costs in Bitcoin Core also do not depend on the number
of outputs in transactions in a block. When a transaction is handled
it goes into an in-memory buffer and only gets flushed later if isn't
spent before the buffer fills. A block will take more time to
validate with more inputs, same as you observer, but the aggregate
resource usage for users depends significantly on outputs (so, in fact
there is even further misaligned incentives than just the fact that
small outputs have a outsized long term cost).

In Core, when a block comes the inputs are checked against the UTXO set
(which grows with outputs) even if pre-synced, to verify order. Am I
wrong there? This is not in the case in bitcrust; it is instead checked
against the spend-tree (which grows with inputs).

How "significant" this is, I neither know nor claim, but it is an
interesting difference.

Post by Gregory Maxwell via bitcoin-dev
Then I think you may want to retract the claim that "As this solution,
reversing the costs of outputs and inputs, [...] updates to the
protocol addressing the UTXO growth, might not be worth considering
*protocol improvements* "

I think you are being a bit harsh here . I am also clearly explaining
the difference only applies to peak load, and just making a suggestion.
I simply want to stress the importance of protocol / implementation
separation as even though you are correct UTXO data is always a resource
cost for script validation (as I also state), the ratio of different
costs are not necessarily *identical* across implementation.

Note that the converse also holds: In bitcrust, if the last few blocks
contain many inputs, the peak load verification for this block is
slower. This is not the case in Core.

Tomas

Johnson Lau via bitcoin-dev

2017-04-08 19:23:29 UTC

Post by Tomas via bitcoin-dev
I think you are being a bit harsh here . I am also clearly explaining
the difference only applies to peak load, and just making a suggestion.
I simply want to stress the importance of protocol / implementation
separation as even though you are correct UTXO data is always a resource
cost for script validation (as I also state), the ratio of different
costs are not necessarily *identical* across implementation.
Note that the converse also holds: In bitcrust, if the last few blocks
contain many inputs, the peak load verification for this block is
slower. This is not the case in Core.
Tomas

I don’t fully understand your storage engine. So the following deduction is just based on common sense.

a) It is possible to make unlimited number of 1-in-100-out txs

b) The maximum number of 100-in-1-out txs is limited by the number of previous 1-in-100-out txs

c) Since bitcrust performs not good with 100-in-1-out txs, for anti-DoS purpose you should limit the number of previous 1-in-100-out txs.

d) Limit 1-in-100-out txs == Limit UTXO growth

I’m not surprised that you find an model more efficient than Core. But I don’t believe one could find a model that doesn’t become more efficient with UTXO growth limitation.

Maybe you could try an experiment with regtest? Make a lot 1-in-100-out txs with many blocks, then spend all the UTXOs with 100-in-1-out txs. Compare the performance of bitcrust with core. Then repeat with 1-in-1-out chained txs (so the UTXO set is always almost empty)

One more question: what is the absolute minimum disk and memory usage in bitcrust, compared with the pruning mode in Core?

Tomas via bitcoin-dev

2017-04-08 19:56:18 UTC

Post by Johnson Lau via bitcoin-dev
I don’t fully understand your storage engine. So the following deduction
is just based on common sense.
a) It is possible to make unlimited number of 1-in-100-out txs
b) The maximum number of 100-in-1-out txs is limited by the number of
previous 1-in-100-out txs
c) Since bitcrust performs not good with 100-in-1-out txs, for anti-DoS
purpose you should limit the number of previous 1-in-100-out txs.
d) Limit 1-in-100-out txs == Limit UTXO growth
I’m not surprised that you find an model more efficient than Core. But I
don’t believe one could find a model that doesn’t become more efficient
with UTXO growth limitation.

My efficiency claims are *only* with regards to order validation. If we
assume all transactions are already pre-synced and verified, bitcrust's
order validation is very fast, and (only slightly) negatively effected
by input-counts.

Most total time is spend during base load script validation, and UTXO
growth is the definitely the limiting factor there, as the model here
isn't all that different from Core's.

Post by Johnson Lau via bitcoin-dev
Maybe you could try an experiment with regtest? Make a lot 1-in-100-out
txs with many blocks, then spend all the UTXOs with 100-in-1-out txs.
Compare the performance of bitcrust with core. Then repeat with
1-in-1-out chained txs (so the UTXO set is always almost empty)

Again, this really depends on whether we focus on full block validation,
in which case the 100-1, 1-100 distinction will be the similar to Core,
or only regard order validation, in which case Bitcrust will have this
odd reversal.

Post by Johnson Lau via bitcoin-dev
One more question: what is the absolute minimum disk and memory usage in
bitcrust, compared with the pruning mode in Core?

As bitcrust doesn't support this yet, I cannot give accurate numbers,
but I've provided some numbers estimates earlier in the thread.

Rereading my post and these comments, I may have stepped on some toes
with regards to SegWit's model. I like SegWit (though I may have a
slight preference for BIP140), and I understand the reasons for the
"discount", so this was not my intention. I just think that the reversal
of costs during peak load order validation is a rather interesting
feature of using spend-tree based validation.

Tomas

Johnson Lau via bitcoin-dev

2017-04-08 20:21:04 UTC

pre-synced means already in mempool and verified? Then it sounds like we just need some mempool optimisation? The tx order in a block is not important, unless they are dependent

Post by Johnson Lau via bitcoin-dev
One more question: what is the absolute minimum disk and memory usage in
bitcrust, compared with the pruning mode in Core?

As bitcrust doesn't support this yet, I cannot give accurate numbers,
but I've provided some numbers estimates earlier in the thread.
Rereading my post and these comments, I may have stepped on some toes
with regards to SegWit's model. I like SegWit (though I may have a
slight preference for BIP140), and I understand the reasons for the
"discount", so this was not my intention. I just think that the reversal
of costs during peak load order validation is a rather interesting
feature of using spend-tree based validation.
Tomas

Please no conspiracy theory like stepping on someone’s toes. I believe it’s always nice to challenge the established model. However, as I’m trying to make some hardfork design, I intend to have a stricter UTXO growth limit. As you said "protocol addressing the UTXO growth, might not be worth considering protocol improvements*, it sounds like UTXO growth limit wouldn’t be very helpful for your model, which I doubt.

Tomas via bitcoin-dev

2017-04-08 20:42:57 UTC

Post by Johnson Lau via bitcoin-dev
Please no conspiracy theory like stepping on someone’s toes. I believe
it’s always nice to challenge the established model. However, as I’m
trying to make some hardfork design, I intend to have a stricter UTXO
growth limit. As you said "protocol addressing the UTXO growth, might not
be worth considering protocol improvements*, it sounds like UTXO growth
limit wouldn’t be very helpful for your model, which I doubt.

Thank you. I realize that this particular phrase implies that in my
design, outputs are less costly then inputs, *in total resource costs*,
which I can not defend without completely ignoring base load script
verification. I rephrased it.

Tomas

Gregory Maxwell via bitcoin-dev

2017-04-08 22:12:09 UTC

Post by Johnson Lau via bitcoin-dev
pre-synced means already in mempool and verified? Then it sounds like we just need some mempool optimisation? The tx order in a block is not important, unless they are dependent

In Bitcoin Core the software _explicitly_ and intentionally does not
exploit mempool pre-validation because doing that very easily leads to
hard to detect consensus faults and makes all mempool code consensus
critical when it otherwise is not. There have been bugs in the past
which would have split the network if this optimization had been used.

(in particular, I believe I recall one related to correctly removing
coinbase spends from the mempool during reorganization that made them
immature; and with the optimization and without the CNB post-test
would have resulted in nodes that saw the reorg creating and accepting
an invalid block, while nodes that didn't rejecting it; but because of
prudent design it was largely harmless).

Because signature validation is cached, and takes the majority of the
block validation time the speed up from the risky optimization isn't
that considerable, and there are other lower hanging fruity with
bigger payouts like Pieter's change to the per-txout management model
and the new non-atomic flushing logic.... and these things don't make
more of the system consensus critical.

Tomas via bitcoin-dev

2017-04-08 22:34:11 UTC

Post by Gregory Maxwell via bitcoin-dev
In Bitcoin Core the software _explicitly_ and intentionally does not
exploit mempool pre-validation because doing that very easily leads to
hard to detect consensus faults and makes all mempool code consensus
critical when it otherwise is not. There have been bugs in the past
which would have split the network if this optimization had been used.
(in particular, I believe I recall one related to correctly removing
coinbase spends from the mempool during reorganization that made them
immature; and with the optimization and without the CNB post-test
would have resulted in nodes that saw the reorg creating and accepting
an invalid block, while nodes that didn't rejecting it; but because of
prudent design it was largely harmless).

Although I don't quite follow the details (CNB post-test? Connect block
I assume?), the risks you are describing seem to be rather specific to
Core's implementation. For one, bitcrust does not or use need reorgs at
all.

Do you argue (or can you further explain) that the idea of splitting
script validation (or what you call mempool pre-validation), and order
validation is introducing risks inherent to the protocol?

Thanks,
Tomas

Troy Benjegerdes via bitcoin-dev

2017-04-08 21:22:11 UTC