From: Stefan Sperling <stsp@stsp.name>
Subject: Re: attempt at speeding up the deltification
To: Omar Polo <op@omarpolo.com>
Date: Sun, 25 Feb 2024 11:09:23 +0100

On Sun, Feb 25, 2024 at 02:30:28AM +0100, Omar Polo wrote:
> To get an idea of the improvements, I've extracted the code for the
> deltification as a standalone executable.  For example, running
> got_deltify_init() on the alpine iso (207M):
> suzaku% m && time ./obj/d /home/op/Downloads/alpine-standard-3.19.1-x86_64.iso
> array: 2250886/2251136 99% -- ht: 761991/2097152 36%
> total memory: 62415872
>     0m01.81s real     0m01.20s user     0m00.62s system
> or for a 1.0G mp4 video:
> suzaku% m && time ./obj/d /home/op/Downloads/[...].mp4
> array: 9967559/9967744 99% -- ht: 4011730/8388608 47%
> total memory: 272780288
>     0m09.67s real     0m06.12s user     0m03.49s system
> The "total memory" is computed by considering the total memory needed by
> the array and by the table (so around 62M and 272M respectively.)  IMHO
> the times are more than decent now.  I have no idea how long it'll take
> with the current implementation, but given the trend it exhibits (~3
> minutes for 100M of /dev/random), "too much" is a good estimate ;-)
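
As a sanity check on the "total memory" figures quoted above, here is a
small sketch of the arithmetic. The per-entry sizes (24 bytes per array
element, 4 bytes per hash table slot) are not stated in the message; they
are an assumption, inferred only because they reproduce both reported
totals exactly from the reported capacities.

```python
# Assumed sizes, inferred from the reported numbers, not taken from the code.
ARRAY_ENTRY_SIZE = 24  # bytes per array element (assumption)
HT_SLOT_SIZE = 4       # bytes per hash table slot (assumption)


def total_memory(array_capacity, ht_capacity):
    """Bytes needed by the array plus the hash table, given their capacities."""
    return array_capacity * ARRAY_ENTRY_SIZE + ht_capacity * HT_SLOT_SIZE


# Capacities are the denominators of the "array:" and "ht:" lines.
print(total_memory(2251136, 2097152))  # alpine iso: 62415872
print(total_memory(9967744, 8388608))  # mp4 video: 272780288
```

If the actual structures use different sizes, the constants would need to
change; the match with both outputs only suggests the guess is plausible.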

Could you still run a test on the same set of files using the
old code please, for comparison?

And show how well the new code performs on 100MB of /dev/random?

If this change brings us from an order of minutes down to an order
of less than 10 seconds, that's very impressive.

However, comparing deltification of /dev/random to deltification
of structured files could be an apples vs. oranges comparison.
There is little chance of finding common blocks within /dev/random.
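
To illustrate the point about common blocks, here is a rough sketch that
counts how many blocks of a buffer repeat an earlier block. It uses
fixed-size blocks and a made-up block size purely for illustration; this is
a simplification and not necessarily how got splits files for
deltification. The "structured" input is synthetic, too.

```python
import os

BLOCK_SIZE = 64  # hypothetical block size, chosen only for this sketch


def repeated_block_ratio(data, block_size=BLOCK_SIZE):
    """Fraction of fixed-size blocks whose content already appeared earlier."""
    seen = set()
    repeats = 0
    total = 0
    for off in range(0, len(data) - block_size + 1, block_size):
        block = data[off:off + block_size]
        if block in seen:
            repeats += 1
        else:
            seen.add(block)
        total += 1
    return repeats / total if total else 0.0


# Random bytes: repeated 64-byte blocks are practically impossible.
random_data = os.urandom(1 << 20)

# Synthetic "structured" data: the same 50 short records over and over.
structured_data = b"".join(
    b"record %04d: status=ok\n" % (i % 50) for i in range(50000)
)

print("random:    ", repeated_block_ratio(random_data))
print("structured:", repeated_block_ratio(structured_data))
```

Random input gives the deltifier nothing to match while still filling its
tables with unique blocks, so timings on /dev/random and on real files may
not be directly comparable.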