From: Stefan Sperling
Subject: Re: faster history traversal for 'got blame'
To: Martin Pieuchot
Cc: gameoftrees@openbsd.org
Date: Tue, 7 Jan 2020 12:01:38 +0100

On Tue, Jan 07, 2020 at 11:28:28AM +0100, Martin Pieuchot wrote:
> 'tog blame' is definitively faster with this. But still unbearably slow
> (understand unusable) compared to 'tig blame'.
>
> So I confirm the improvement. Thanks :o)
>
> Before applying the diff:
>
> $ time got blame kern/kern_synch.c > /dev/null
>     0m47.82s real     0m40.38s user     0m12.72s system
>
> After:
>
> $ time got blame kern/kern_synch.c > /dev/null
>     0m34.26s real     0m31.13s user     0m03.02s system
>
> Comparing with tig-2.4.1:
>
> $ time git blame kern/kern_synch.c > /dev/null
>     0m09.42s real     0m08.59s user     0m00.60s system
>
>
> Not sure if reducing malloc/free and caching could help you reduce user
> time by a factor of 2 or 3, but that would be awesome :o)

got-read-pack is reasonably fast with this. I don't expect that more
micro-optimizations will help much, though they should still be worth
doing.

'got blame' is still slow on files with many revisions. And file size
also matters. Note how 'blame tog/tog.c' in the got repo is slower than
'blame kern/kern_tc.c', even though the history of got is much shorter
than that of the OpenBSD src repo.

So I think the next problem we need to solve is that our diff code is
slow. Blame runs the entire file through diff for each commit that
touched it. Browsing sys/dev/pcidevs revision diffs with tog should give
you an idea about how slow diff can be. 'git diff' is super fast
compared to our diff.

See the attached profile graph of the main process. 22% of
'blame kern/kern_synch.c' is in got_diffreg. Another 18% is spent
opening pack index files. This repository contains 14 pack files.
Perhaps it would run faster with a single large pack.
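
To illustrate why blame cost depends on both the number of revisions and
the file size, here is a rough sketch of a naive blame loop. It is not
got's code: the function name blame() and the oldest-to-newest traversal
are assumptions made for illustration, and Python's difflib stands in for
got_diffreg. The point is only that every commit that touched the file
triggers one diff over the whole file.

```python
import difflib


def blame(revisions):
    """Naive blame sketch (hypothetical helper, not got's algorithm).

    revisions: list of (commit_id, lines), ordered oldest to newest.
    Returns one commit id per line of the newest revision.
    """
    first_id, prev_lines = revisions[0]
    owners = [first_id] * len(prev_lines)

    for commit_id, lines in revisions[1:]:
        # One full-file diff per commit that touched the file. This is the
        # step whose cost grows with file size and repeats per revision.
        matcher = difflib.SequenceMatcher(a=prev_lines, b=lines, autojunk=False)

        # Lines are attributed to this commit unless the diff shows they
        # were carried over unchanged from the previous revision.
        new_owners = [commit_id] * len(lines)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                new_owners[j1:j2] = owners[i1:i2]

        owners, prev_lines = new_owners, lines

    return owners


if __name__ == "__main__":
    history = [
        ("c1", ["a", "b", "c"]),
        ("c2", ["a", "x", "c"]),
        ("c3", ["a", "x", "c", "d"]),
    ]
    for owner, line in zip(blame(history), history[-1][1]):
        print(owner, line)
```

With N revisions of a file, this runs N-1 whole-file diffs, so a slow diff
hurts on large files even when the history is short, and on long histories
even when the file is small.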