Post by Kelly Dean
My point remains: Fossil unnecessarily uses an extremely slow hash if
you're expected to trust the data, and uses one that's too close to being
broken to warrant trust if you don't trust the data.
Even on 10-year-old hardware, Fossil's performance is tolerable. On
5-year-old hardware, it is acceptable. (Of course, if you switch, for
whatever reason, from very recent hardware to even 5-year-old hardware, it
is going to seem very slow, but that is the case for many applications.)
Post by Kelly Dean
Joerg seemed to be suggesting not to use Fossil if you don't trust the
data, but you're suggesting (and I agree) Fossil should be usable even if
you don't trust the data.
I don't know what "threat model" Fossil might have been designed to resist.
"very forgery resistant" doesn't tell me much.
I think Joerg's (and others') point about untrusted data was more about the
trustworthiness of contributors with push privilege to a project's
repository. No matter how secure the hash function is, the trustworthiness
of contributors has to be adequately evaluated before granting them push
privilege.
As for resistance against attackers, there is a balance dependent upon the
value of a repository's content. If that is forensic evidence from a
criminal investigation, then it might be highly valuable. Likewise,
classified documents.
I don't think that kind of content was in mind when Fossil was designed.
[...]
Post by Kelly Dean
The only options Fossil gives in this case are to not use Fossil, or use a
custom, incompatible derivative of Fossil, or do compare-by-content. The
last option is most practical for now.
It is possible to enhance Fossil to do an automatic, post-commit content
compare between the local repository and the local file system. I have also
shown a very basic "wrapper" for Fossil that achieves this (in one of my
previous posts).
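
As a rough illustration of such a post-commit check (not the wrapper from
my earlier post), here is a minimal Python sketch. The names verify_commit
and fossil_cat are hypothetical. The fossil_cat helper assumes that
"fossil cat FILE -r VERSION" prints the committed bytes of a file; check
that against your Fossil version before relying on it.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable


def fossil_cat(workspace: Path, version: str) -> Callable[[str], bytes]:
    """Build a fetcher that reads committed bytes via the fossil CLI.

    Assumption: `fossil cat FILE -r VERSION` writes the file content to stdout.
    """
    def fetch(rel: str) -> bytes:
        result = subprocess.run(
            ["fossil", "cat", rel, "-r", version],
            cwd=workspace, capture_output=True, check=True,
        )
        return result.stdout
    return fetch


def verify_commit(workspace: Path, paths: Iterable[str],
                  fetch_committed: Callable[[str], bytes]) -> list[str]:
    """Return the paths whose committed bytes differ from the workspace copy."""
    mismatched = []
    for rel in paths:
        local = workspace / rel
        # A file missing from the workspace counts as a mismatch.
        if not local.is_file() or local.read_bytes() != fetch_committed(rel):
            mismatched.append(rel)
    return mismatched
```

A wrapper would run the commit, then call something like
verify_commit(ws, files, fossil_cat(ws, "tip")) and complain loudly if the
returned list is not empty. The comparison is byte-for-byte, so it does not
depend on the strength of the hash at all.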
Comparing against a remote server is more involved: request a tar (or zip)
file of the commit just made, unpack it, then diff it against the
originating workspace. Alternatively, individual files can be requested, then
compared. This, too, could be automated by a wrapper around Fossil. An
enhanced Fossil could, of course, do this more efficiently.
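
The tarball comparison could look roughly like the Python sketch below. It
assumes the archive has already been downloaded from the server by whatever
means you prefer, and that every entry sits under a single top-level
directory (hence strip_components=1). The function name diff_tarball is
made up for this example.

```python
import tarfile
from pathlib import Path, PurePosixPath


def diff_tarball(archive: Path, workspace: Path,
                 strip_components: int = 1) -> list[str]:
    """Return archive paths whose bytes differ from (or are absent in) workspace."""
    mismatched = []
    with tarfile.open(archive) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other special entries
            # Drop the leading directory component(s) added by the archiver.
            parts = PurePosixPath(member.name).parts[strip_components:]
            if not parts:
                continue
            local = workspace.joinpath(*parts)
            # Read straight from the archive instead of extracting to disk.
            data = tar.extractfile(member).read()
            if not local.is_file() or local.read_bytes() != data:
                mismatched.append("/".join(parts))
    return sorted(mismatched)
```

Note that this only checks what the server sent back against the workspace.
Files that exist in the workspace but not in the archive (unmanaged files,
for example) are not reported.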
Post by Kelly Dean
Of course, making the hash function a parameter makes things more
complicated, because your UUIDs have to change (or at least be extended),
and in general all historical uses of them (e.g. in manifests) have to be
securely timestamped (using a current still-secure hash) to enable
detection of forgery of historical records.
The least impractical way to change existing hashes would be to dump the
repository, then replay the commits into a new repository using the old
hashes as tags on the new commits.
Applying the new hash as a tag to the existing commits would be less work,
but the details would have to be carefully evaluated and planned. But...
Switching hash algorithms within an existing repository would introduce a
lot of extra logic to distinguish and process "old" and "new" commits.
From a practical standpoint, existing repositories would probably continue
to use the old hash.
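
To make the dump-and-replay idea concrete, here is a toy Python sketch. It
is not Fossil's artifact format: a "commit" here is just a parent ID plus
some content, and the ID is a hash over both. The function names and the
"sha1-" tag prefix are invented for the illustration.

```python
import hashlib


def commit_id(algo: str, parent: str, content: bytes) -> str:
    """Toy commit ID: hash of the parent ID followed by the content."""
    return hashlib.new(algo, parent.encode() + content).hexdigest()


def replay(old_commits):
    """Replay (old_id, old_parent, content) tuples, parents first.

    Each new commit is re-hashed with SHA-256 and carries the old
    SHA-1 ID as a tag, so old references can still be looked up.
    """
    old_to_new = {"": ""}  # the root commit has an empty parent
    new_commits = []
    for old_id, old_parent, content in old_commits:
        new_parent = old_to_new[old_parent]
        new_id = commit_id("sha256", new_parent, content)
        old_to_new[old_id] = new_id
        new_commits.append({
            "id": new_id,
            "parent": new_parent,
            "content": content,
            "tags": ["sha1-" + old_id],
        })
    return new_commits
```

The point of the sketch is that every new ID depends on the new parent ID,
so the whole history has to be replayed in order. That is why tagging
existing commits in place is less work, and also why it needs more care.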
In theory, supporting new hash algorithms could be handled by incrementing
the "clone protocol" version to allow support of a new attribute that
specifies the hash algorithm used. New "clients" attempting to talk to old
servers could fall back to an earlier protocol version and select SHA1 to
use with that repository.
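
A sketch of that fallback in Python, purely hypothetical: Fossil has no such
attribute today as far as I know, and the version numbers, the table of
algorithms, and the negotiate function are all invented here.

```python
# Hypothetical table: which hash algorithms each protocol version allows.
HASHES_BY_PROTOCOL = {
    1: ("sha1",),            # legacy protocol: SHA1 only
    2: ("sha1", "sha256"),   # imagined newer protocol with a hash attribute
}


def negotiate(client_max: int, server_max: int, preferred: str) -> tuple[int, str]:
    """Pick the highest protocol version both sides speak, then a hash.

    Falls back to SHA1 when the agreed version does not allow the
    client's preferred algorithm.
    """
    version = min(client_max, server_max)
    allowed = HASHES_BY_PROTOCOL[version]
    algo = preferred if preferred in allowed else "sha1"
    return version, algo
```

With this table, a new client talking to an old server, as in
negotiate(2, 1, "sha256"), ends up at protocol version 1 with "sha1", while
two new peers agree on version 2 with "sha256".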
Ultimately, for any given tool, there is a risk vs cost analysis. For the
projects I work on, Fossil meets my needs. If I were going to store
documents or data that could incur a significant liability, I would use a
tool from a respected vendor so I could demonstrate due diligence in a
legal proceeding. (Note: For all I know, Fossil could be better than the
best tool from the most respected vendor, but it's far easier for opposing
counsel to shoot down a "free" tool than one purchased from a respected
vendor.)