"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From: ori@eigenstate.org
Subject: Re: deltify addblk() fixes
To: gameoftrees@openbsd.org, naddy@mips.inka.de
Date: Wed, 09 Jun 2021 12:46:17 -0400

Quoth Christian Weisgerber <naddy@mips.inka.de>:
> 
> If the hash can only detect inequality, shouldn't we still check it
> 
>                 if (len == dt->blocks[i].len && h == dt->blocks[i].hash) {
> 
> to skip expensive compares?

Yes, that would be correct -- I think this is my fault,
as a result of how the code initially evolved. The first
iteration used a worse algorithm with SHA-1 hashes, and
it turned out to be faster to just compare rather than
hashing.

I fixed the algorithm, but didn't change the lookup. Adding
the hash check doesn't seem to matter much in practice, but
it's not harmful.
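
For illustration, a minimal sketch of a lookup that uses the
length and hash as a cheap filter before the byte compare. The
struct layout and the names blk and find_blk are assumptions
made for this example, not the actual got code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block record; the field names are assumptions. */
struct blk {
	size_t			 len;
	uint32_t		 hash;
	const unsigned char	*data;
};

/*
 * Return the index of the first block whose content equals buf,
 * or -1 if there is none. A differing length or hash proves the
 * blocks differ, so memcmp() only runs on likely matches. Equal
 * hashes prove nothing, so the compare is still required.
 */
int
find_blk(const struct blk *blocks, int nblocks,
    const unsigned char *buf, size_t len, uint32_t h)
{
	int i;

	for (i = 0; i < nblocks; i++) {
		if (len != blocks[i].len || h != blocks[i].hash)
			continue;	/* cheap reject */
		if (memcmp(buf, blocks[i].data, len) == 0)
			return i;
	}
	return -1;
}
```

Whether the filter pays off depends on how often candidates
share a length and on the cost of the compare.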

As a side note, it may be worth citing the chunking algorithm,
which is used with modification: FastCDC, from USENIX ATC 2016:

	https://www.usenix.org/conference/atc16/technical-sessions/presentation/xia

The block stretching is an adaptation: FastCDC is concerned
with deduping and appending into a persistent store, while
we're just interested in deltifying two objects against each
other.
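
As a rough sketch of the kind of chunking FastCDC describes, the
code below finds a cut point with a gear-based rolling hash and
skips hashing until a minimum chunk size is reached. It is a
simplification written for this note, not the got code: it
leaves out FastCDC's normalized chunking and the block
stretching mentioned above, and the sizes, the mask, the gear
table seed, and the names gear_init and next_cut are all
assumptions:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed parameters; real implementations use larger sizes. */
enum { MINCHUNK = 32, MAXCHUNK = 1024 };
#define CUTMASK	0x0000003f00000000ULL	/* 6 bits: cut about 1 in 64 */

static uint64_t gear[256];

/* Fill the gear table with fixed pseudo-random values (splitmix64). */
void
gear_init(void)
{
	uint64_t x = 0x9e3779b97f4a7c15ULL, z;
	int i;

	for (i = 0; i < 256; i++) {
		x += 0x9e3779b97f4a7c15ULL;
		z = x;
		z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
		z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
		gear[i] = z ^ (z >> 31);
	}
}

/*
 * Return the length of the next chunk at the start of buf. The
 * cut depends only on nearby content, so identical data tends
 * to produce identical chunks even when its offset changes.
 */
size_t
next_cut(const unsigned char *buf, size_t len)
{
	uint64_t fp = 0;
	size_t i;

	if (len <= MINCHUNK)
		return len;
	if (len > MAXCHUNK)
		len = MAXCHUNK;
	/* Bytes before MINCHUNK are never hashed. */
	for (i = MINCHUNK; i < len; i++) {
		fp = (fp << 1) + gear[buf[i]];
		if ((fp & CUTMASK) == 0)
			return i + 1;
	}
	return len;
}
```

Calling next_cut() repeatedly, advancing by the returned length
each time, splits a buffer into chunks; gear_init() has to run
once first.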