"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: cache pack index file list when repository is opened
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Mon, 14 Mar 2022 00:08:23 +0100

Download raw body.

Thread
Stefan Sperling <stsp@stsp.name> wrote:
> Profiling shows that many calls to readdir(3) of the objects/pack/
> directory can cause significant overhead during 'gotadmin pack -a'
> if the repository has many pack files that need to be searched.
> 
> This patch caches the list of known pack index files when the
> repository is opened. This speeds up searches by skipping readdir(3)
> if the pack index cache cannot prevent us from hitting the disk.
> 
> Pack files are not really expected to appear or disappear out
> of the blue while the repository is in use, even though this
> is possible. But even readdir(3) followed open(2) of an entry
> could suffer from this. If this becomes a problem it was already
> possible to hit with the old code, and could only be fixed by
> catching relevant errors and retrying.
> 
> ok?

makes sense and the diff is almost straightforward.

There's just a minor thing that caught my attention while reading the
parts affected by the diff, but it was already there.  If the repo
allocation in got_repo_open fails the `goto done' ends up calling
got_repo_close with repo == NULL and segfaulting there.

ok?

diff 43a61642701486d74f4bb5846d07295818e1150c /home/op/w/got (staged changes)
blob - d4175545865a20ba9a3cff43736298aaae5e8c37
blob + e8a00ae9b1dab853c73f41060bc6ab6883e8e6f2
--- lib/repository.c
+++ lib/repository.c
@@ -664,10 +664,8 @@ got_repo_open(struct got_repository **repop, const cha
 		return got_error_from_errno("getrlimit");
 
 	repo = calloc(1, sizeof(*repo));
-	if (repo == NULL) {
-		err = got_error_from_errno("calloc");
-		goto done;
-	}
+	if (repo == NULL)
+		return got_error_from_errno("calloc");
 
 	RB_INIT(&repo->packidx_bloom_filters);