From: Omar Polo Subject: Re: faster reuse of deltas To: Stefan Sperling Cc: gameoftrees@openbsd.org Date: Sat, 03 Dec 2022 17:28:54 +0100 On 2022/12/03 14:51:37 +0100, Stefan Sperling wrote: > On Sat, Dec 03, 2022 at 12:29:54PM +0100, Stefan Sperling wrote: > > At present, 'gotadmin pack' copies reused deltas to the delta cache file, > > and then copies deltas from this cache into the generated pack file. > > The extra copy operation is redundant. It was implemented because, at > > some earlier point in development history, the code which copied data > > to the pack file was hard-wired to compress data while writing it out. > > So we needed to store reused deltas in decompressed form somewhere. > > > > Nowadays, the cache is storing deltas in compressed form, and the extra > > copy operation just slows us down for no good reason. Deltas are being > > parsed and provided by got-read-pack either way, so this helper remains > > in full control over the data which the main process ends up copying (but > > not parsing). > > > > With the patch below, got-read-pack tells the main process only about the > > locations of reused deltas in the original pack file, and the main process > > reads deltas directly from that file. This speeds things up and will make > > it easier to implement reuse of non-deltified packed objects later. > > Updated diff. > The previous version of this diff was missing necessary changes for gotd(8). re-read a couple of times, ok for me (and looking forward to the reuse!) > [...] > blob - 5ede5726ac196d5b3ba820a830a44852e453b174 > blob + 7500a424c3c703e986507e47bad61b771851f850 > --- lib/repository.c > +++ lib/repository.c > @@ -1519,7 +1519,8 @@ got_repo_pin_pack(struct got_repository *repo, struct > > repo->pinned_pack = pinned_pack; > repo->pinned_packidx = pinned_packidx; > - repo->pinned_pid = repo->packs[pinned_pack].privsep_child->pid; > + if (repo->packs[pinned_pack].privsep_child) > + repo->pinned_pid = repo->packs[pinned_pack].privsep_child->pid; > return NULL; > } not an issue, but I'm not sure about the rationale for this hunk in this diff. > blob - e49fd95ba0b7919efe9005b306782db008b22370 > blob + a86f3dde84274fdf37db09da580f7bba1e012e46 > --- libexec/got-read-pack/got-read-pack.c > +++ libexec/got-read-pack/got-read-pack.c > [...] > @@ -1054,7 +1045,7 @@ delta_reuse_request(struct imsg *imsg, struct imsgbuf > > static const struct got_error * > delta_reuse_request(struct imsg *imsg, struct imsgbuf *ibuf, > - FILE *delta_outfile, struct got_pack *pack, struct got_packidx *packidx) > + struct got_pack *pack, struct got_packidx *packidx) tabs when should use four spaces