"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: faster reuse of deltas
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Sat, 03 Dec 2022 17:28:54 +0100

Download raw body.

Thread
On 2022/12/03 14:51:37 +0100, Stefan Sperling <stsp@stsp.name> wrote:
> On Sat, Dec 03, 2022 at 12:29:54PM +0100, Stefan Sperling wrote:
> > At present, 'gotadmin pack' copies reused deltas to the delta cache file,
> > and then copies deltas from this cache into the generated pack file.
> > The extra copy operation is redundant. It was implemented because, at
> > some earlier point in development history, the code which copied data
> > to the pack file was hard-wired to compress data while writing it out.
> > So we needed to store reused deltas in decompressed form somewhere.
> > 
> > Nowadays, the cache is storing deltas in compressed form, and the extra
> > copy operation just slows us down for no good reason. Deltas are being
> > parsed and provided by got-read-pack either way, so this helper remains
> > in full control over the data which the main process ends up copying (but
> > not parsing).
> > 
> > With the patch below, got-read-pack tells the main process only about the
> > locations of reused deltas in the original pack file, and the main process
> > reads deltas directly from that file. This speeds things up and will make
> > it easier to implement reuse of non-deltified packed objects later.
> 
> Updated diff.
> The previous version of this diff was missing necessary changes for gotd(8).

re-read a couple of times, ok for me (and looking forward to the
reuse!)

> [...]
> blob - 5ede5726ac196d5b3ba820a830a44852e453b174
> blob + 7500a424c3c703e986507e47bad61b771851f850
> --- lib/repository.c
> +++ lib/repository.c
> @@ -1519,7 +1519,8 @@ got_repo_pin_pack(struct got_repository *repo, struct 
>  
>  	repo->pinned_pack = pinned_pack;
>  	repo->pinned_packidx = pinned_packidx;
> -	repo->pinned_pid = repo->packs[pinned_pack].privsep_child->pid;
> +	if (repo->packs[pinned_pack].privsep_child)
> +		repo->pinned_pid = repo->packs[pinned_pack].privsep_child->pid;
>  	return NULL;
>  }

not an issue, but I'm not sure about the rationale for this hunk in
this diff.

> blob - e49fd95ba0b7919efe9005b306782db008b22370
> blob + a86f3dde84274fdf37db09da580f7bba1e012e46
> --- libexec/got-read-pack/got-read-pack.c
> +++ libexec/got-read-pack/got-read-pack.c
> [...]
> @@ -1054,7 +1045,7 @@ delta_reuse_request(struct imsg *imsg, struct imsgbuf 
>  
>  static const struct got_error *
>  delta_reuse_request(struct imsg *imsg, struct imsgbuf *ibuf,
> -    FILE *delta_outfile, struct got_pack *pack, struct got_packidx *packidx)
> +	struct got_pack *pack, struct got_packidx *packidx)

tabs when should use four spaces