"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From: Mark Jamsek <mark@jamsek.com>
Subject: Re: optimise reading the file index
To: Stefan Sperling <stsp@stsp.name>
Cc: "Todd C. Miller" <millert@openbsd.org>, Christian Weisgerber <naddy@mips.inka.de>, gameoftrees@openbsd.org
Date: Fri, 28 Jul 2023 15:14:43 +1000

Stefan Sperling <stsp@stsp.name> wrote:
> On Thu, Jul 27, 2023 at 09:07:13AM -0600, Todd C. Miller wrote:
> > On Thu, 27 Jul 2023 15:19:47 +0200, Christian Weisgerber wrote:
> > 
> > > The current code uses fread(3), i.e., buffered stdio, which translates
> > > to a long series of 16 kB read(2) system calls.
> > 
> > stdio uses the "optimal" buffer size reported for the device,
> > st_blksize from struct stat.  The buffer size can be modified via
> > a call to setvbuf().  The defaults were created when memory was
> > more dear so you may be able to improve things by just bumping the
> > buffer size.
> > 
> > > For a got status of /usr/src, 786 read(2) calls look rather negligible
> > > compared to 92,300+ stat(2) calls.
> > 
> > Yikes, that is a lot.
> 
> It seems to be about one stat call per file which can't get much better?
> 
> $ find /usr/src | wc -l
>    99998
> 
> In any case, this fileindex optimization isn't about making 'got status'
> run faster. We noticed a visible delay when starting up tog, where the
> newly added base-commit marker takes a short time to appear on some systems.
> I determined that the time was spent reading the file index, which prompted
> Mark to write this patch.

Yes, that's likely my fault: the 'got status' remark in my OP became an
unintended red herring. It was only meant to emphasise that operations
which depend on got_fileindex_read(), such as the new base commit marker
in tog and the 'got branch -l' change, are already far too quick on this
machine for me to see much difference; even 'got status' is fast on it.
Which is why I wanted wider testing beyond what I could measure here.

As op taught me on irc, 'got info <path>' is a good test for this case.
While we're seeing some improvement there, the patch doesn't seem to
reduce the time it takes to draw the base commit marker in tog. However,
it does make the new 'got branch -l' about 30% faster, and that takes
the base commit of each file in the index into account the same way
tog's new base commit marker does, so there could be some other reason
in tog for the delay.
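
For anyone who wants to try the setvbuf() route Todd mentioned above,
here is a rough sketch of what bumping the stdio buffer could look like.
It is only an illustration: read_with_bufsize() is a made-up helper, not
anything in got, and the buffer size is whatever the caller passes in,
so it says nothing about what size would actually help.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of got): read a whole file through
 * stdio with a caller-chosen buffer size instead of the default that
 * stdio derives from st_blksize.
 * Returns the number of bytes read, or -1 on error.
 */
static long
read_with_bufsize(const char *path, size_t bufsize)
{
	FILE *f;
	char *iobuf;
	char chunk[4096];
	size_t n;
	long total = 0;

	if (bufsize == 0)
		return -1;

	f = fopen(path, "rb");
	if (f == NULL)
		return -1;

	/* setvbuf() has to be called before any other I/O on the stream. */
	iobuf = malloc(bufsize);
	if (iobuf == NULL || setvbuf(f, iobuf, _IOFBF, bufsize) != 0) {
		fclose(f);
		free(iobuf);
		return -1;
	}

	/* Small fread() requests are now served from the larger buffer. */
	while ((n = fread(chunk, 1, sizeof(chunk), f)) > 0)
		total += (long)n;

	if (ferror(f))
		total = -1;

	/* The buffer must remain valid until the stream is closed. */
	fclose(f);
	free(iobuf);
	return total;
}
```

Calling it with, say, read_with_bufsize(path, 1 << 20) and comparing the
read(2) count under ktrace against the default would show whether a
larger buffer changes anything on a given system.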


-- 
Mark Jamsek <https://bsdbox.org>
GPG: F2FF 13DE 6A06 C471 CA80  E6E2 2930 DC66 86EE CF68