From: Stefan Sperling <stsp@stsp.name>
Subject: Re: faster history traversal for 'got blame'
To: Martin Pieuchot <mpi@openbsd.org>
Cc: gameoftrees@openbsd.org
Date: Tue, 7 Jan 2020 12:01:38 +0100

On Tue, Jan 07, 2020 at 11:28:28AM +0100, Martin Pieuchot wrote:
> 'tog blame' is definitely faster with this.  But still unbearably slow
> (read: unusable) compared to 'tig blame'.
> 
> So I confirm the improvement.  Thanks :o)
> 
> Before applying the diff:
> 
> 	$ time got blame kern/kern_synch.c  > /dev/null
> 	    0m47.82s real     0m40.38s user     0m12.72s system
> 
> After:
> 
> 	$ time got blame kern/kern_synch.c  > /dev/null
> 	    0m34.26s real     0m31.13s user     0m03.02s system
> 
> Comparing with tig-2.4.1:
> 
> 	$ time git blame kern/kern_synch.c > /dev/null
> 	    0m09.42s real     0m08.59s user     0m00.60s system
> 
> 
> Not sure if reducing malloc/free and caching could help you reduce user
> time by a factor of 2 or 3, but that would be awesome :o)

got-read-pack is reasonably fast with this. I don't expect that more 
micro-optimizations will help much, though they should still be worth doing.

'got blame' is still slow on files with many revisions.
And file size also matters. Note how 'blame tog/tog.c' in the got repo
is slower than 'blame kern/kern_tc.c', even though the history of got
is much shorter than that of the OpenBSD src repo.

So I think the next problem we need to solve is that our diff code is slow.
Blame runs the entire file through diff for each commit that touched it.
Browsing sys/dev/pcidevs revision diffs with tog should give you an idea
about how slow diff can be. 'git diff' is super fast compared to our diff.
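
To illustrate why the per-commit diff dominates, here is a minimal Python
sketch of a blame loop. It is not got's code: the function name blame, the
(commit_id, lines) input layout, and the oldest-to-newest walk are all
assumptions made for the example, and Python's difflib stands in for the
real diff implementation. The point is only that every commit touching the
file costs one diff over the whole file.

```python
from difflib import SequenceMatcher


def blame(revisions):
    """Toy blame, not got's implementation.

    revisions: list of (commit_id, lines), oldest first, containing only
    the commits that touched the file (hypothetical input layout).
    Returns (owners, diff_count), where owners[i] is the commit that last
    changed line i of the newest revision.
    """
    first_id, prev_lines = revisions[0]
    owners = [first_id] * len(prev_lines)
    diff_count = 0

    for commit_id, lines in revisions[1:]:
        # One full-file diff per commit: this is the expensive step.
        matcher = SequenceMatcher(None, prev_lines, lines, autojunk=False)
        diff_count += 1

        new_owners = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep their previous owner.
                new_owners.extend(owners[i1:i2])
            else:
                # Inserted or replaced lines belong to this commit;
                # for a pure deletion j2 - j1 is 0 and nothing is added.
                new_owners.extend([commit_id] * (j2 - j1))

        owners = new_owners
        prev_lines = lines

    return owners, diff_count


if __name__ == "__main__":
    history = [
        ("c1", ["a", "b", "c"]),
        ("c2", ["a", "B", "c"]),
        ("c3", ["a", "B", "c", "d"]),
    ]
    print(blame(history))
```

The number of diffs grows with the number of commits that touched the file,
and the cost of each diff grows with the file size, which matches both
observations above.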

See the attached profile graph of the main process.
22% of the run time of 'blame kern/kern_synch.c' is spent in got_diffreg.
Another 18% is spent opening pack index files. This repository contains
14 pack files. Perhaps it would run faster with a single large pack.
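
As a rough illustration of why the number of packs could matter, here is a
toy Python model of object lookup across pack indexes. It is not got's
code. It only assumes what the Git pack format provides: each pack has an
index of sorted object IDs, and an object may live in any pack, so a lookup
consults indexes one after another until it finds a match. The function
name find_object and the list-of-sorted-lists layout are made up for the
example, and the model counts index consultations rather than the cost of
opening index files, which is what the profile actually measures.

```python
from bisect import bisect_left


def find_object(pack_indexes, object_id):
    """Toy lookup: return (pack_number, indexes_consulted).

    pack_indexes: list of sorted lists of object IDs, one per pack
    (hypothetical in-memory stand-in for on-disk pack index files).
    pack_number is None if no index contains the object.
    """
    consulted = 0
    for pack_number, index in enumerate(pack_indexes):
        consulted += 1
        # Binary search within one sorted index.
        pos = bisect_left(index, object_id)
        if pos < len(index) and index[pos] == object_id:
            return pack_number, consulted
    return None, consulted


if __name__ == "__main__":
    # 1400 fake object IDs, zero-padded so string order is numeric order.
    ids = ["%040x" % i for i in range(1400)]
    many_packs = [ids[i::14] for i in range(14)]  # spread over 14 packs
    single_pack = [ids]                           # everything in one pack

    total_many = sum(find_object(many_packs, oid)[1] for oid in ids)
    total_single = sum(find_object(single_pack, oid)[1] for oid in ids)
    print(total_many, total_single)
```

Whether a single pack helps in practice depends on how index files are
opened and cached, which this model does not capture.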