From: Omar Polo Subject: Re: cache pack index file list when repository is opened To: Stefan Sperling Cc: gameoftrees@openbsd.org Date: Mon, 14 Mar 2022 00:08:23 +0100 Stefan Sperling wrote: > Profiling shows that many calls to readdir(3) of the objects/pack/ > directory can cause significant overhead during 'gotadmin pack -a' > if the repository has many pack files that need to be searched. > > This patch caches the list of known pack index files when the > repository is opened. This speeds up searches by skipping readdir(3) > if the pack index cache cannot prevent us from hitting the disk. > > Pack files are not really expected to appear or disappear out > of the blue while the repository is in use, even though this > is possible. But even readdir(3) followed open(2) of an entry > could suffer from this. If this becomes a problem it was already > possible to hit with the old code, and could only be fixed by > catching relevant errors and retrying. > > ok? makes sense and the diff is almost straightforward. There's just a minor thing that caught my attention while reading the parts affected by the diff, but it was already there. If the repo allocation in got_repo_open fails the `goto done' ends up calling got_repo_close with repo == NULL and segfaulting there. ok? diff 43a61642701486d74f4bb5846d07295818e1150c /home/op/w/got (staged changes) blob - d4175545865a20ba9a3cff43736298aaae5e8c37 blob + e8a00ae9b1dab853c73f41060bc6ab6883e8e6f2 --- lib/repository.c +++ lib/repository.c @@ -664,10 +664,8 @@ got_repo_open(struct got_repository **repop, const cha return got_error_from_errno("getrlimit"); repo = calloc(1, sizeof(*repo)); - if (repo == NULL) { - err = got_error_from_errno("calloc"); - goto done; - } + if (repo == NULL) + return got_error_from_errno("calloc"); RB_INIT(&repo->packidx_bloom_filters);