From: Mark Jamsek <mark@jamsek.com>
Subject: Re: tog: keymaps to navigate to prev/next file/hunk in the diff
To: gameoftrees@openbsd.org
Date: Wed, 27 Jul 2022 02:16:54 +1000

On 22-07-26 02:24pm, Stefan Sperling wrote:
> On Tue, Jul 26, 2022 at 01:51:02AM +1000, Mark Jamsek wrote:
> > I think that would be a lot better! Tbh, my first instinct was that this
> > should be done in the diff code because we already know each line type
> > as the diff is produced, but I wasn't sure how the proposed key maps
> > would be received so wanted to get a feel first.
> > 
> > I'm happy to hack on the diff code. We could use a bitarray or even
> > build an index of each line as we save offsets so we'd have the type of
> > each line in the diff. Then we could do away with the regex for colours
> > as we'd have each meta/hunk/' '/+/- line indexed. What do you think? Or
> > would you prefer we just have another array for hunk offsets? I don't
> > mind either way.
> 
> Sounds good. The more meta-data we can get from the diff library
> "for free", the better. Callers should ideally not have to munge
> the data again to figure out what is where.

I agree! I'll start on this tomorrow.

I had a little time tonight so thought I'd take some measurements first
to get an idea of the impact any changes have on performance, and was
surprised by the results.

I instrumented create_diff() in three builds: main [595228385f], my
local ezdiffnav branch with the changes in the above diff, and commit
[2bea860227], which has the get_filestream_info() function stsp
referenced.  I invoked `tog diff` using the patience algorithm on a
commit of 675,402 lines (23MB), and also diffed the same commit via the
log view with `tog log`.
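
For reference, instrumentation along these lines can be sketched with
clock_gettime(2). This is only an illustration under assumptions, not
the code used for the numbers in this message: the stopwatch struct and
the function names are hypothetical and not part of tog.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not in tog): CPU and wall clock start points. */
struct stopwatch {
	struct timespec cpu0;
	struct timespec wall0;
};

/* Milliseconds elapsed from a to b. */
double
elapsed_ms(const struct timespec *a, const struct timespec *b)
{
	return (b->tv_sec - a->tv_sec) * 1000.0 +
	    (b->tv_nsec - a->tv_nsec) / 1e6;
}

/* Record the current process CPU time and monotonic wall time. */
void
stopwatch_start(struct stopwatch *sw)
{
	assert(sw != NULL);
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &sw->cpu0);
	clock_gettime(CLOCK_MONOTONIC, &sw->wall0);
}

/* Report CPU and wall time in ms since stopwatch_start(). */
void
stopwatch_stop(const struct stopwatch *sw, double *cpu_ms, double *wall_ms)
{
	struct timespec cpu1, wall1;

	assert(sw != NULL && cpu_ms != NULL && wall_ms != NULL);
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &cpu1);
	clock_gettime(CLOCK_MONOTONIC, &wall1);
	*cpu_ms = elapsed_ms(&sw->cpu0, &cpu1);
	*wall_ms = elapsed_ms(&sw->wall0, &wall1);
}
```

Calling stopwatch_start() just before create_diff() and stopwatch_stop()
just after it would give one cpu/wall pair per run, which can then be
averaged over several runs.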

On both main and my local branch, `tog diff` and the diff loaded via
`tog log` take comparable time (+/- 0.05), but 2bea860227 takes roughly
2x (tog diff) and 3x (tog log) as long as the others!

I still think this change should be done in the lib when building the
diff: we can map each line to its type as we save each line offset, so
we collect more data and the second pass isn't required. Still, the
result really surprised me.  I haven't looked at the code yet, and I'm
assuming other changes have been made to the diff code since 2bea860227
(the difference can't be attributed solely to get_filestream_info()),
but I didn't expect main and ezdiffnav to be roughly the same. I might
profile tog tomorrow for a better picture.
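
To make the idea of mapping each line to its type concrete, here is a
rough sketch of recording a type next to each line offset as the diff is
written, so a caller can jump between hunks without a second pass. It is
not the actual diff lib code; the enum, the structs and the function
names are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical line classification, known as each line is produced. */
enum line_type { LINE_META, LINE_HUNK, LINE_CONTEXT, LINE_PLUS, LINE_MINUS };

struct line_info {
	off_t offset;		/* byte offset of the line in the output */
	enum line_type type;
};

struct line_index {
	struct line_info *lines;
	size_t nlines;
	size_t cap;
};

/* Append one line record; returns 0 on success, -1 on allocation failure. */
int
line_index_add(struct line_index *idx, off_t offset, enum line_type type)
{
	assert(idx != NULL);
	if (idx->nlines == idx->cap) {
		size_t ncap = idx->cap ? idx->cap * 2 : 64;
		struct line_info *p;

		if (ncap > SIZE_MAX / sizeof(*p))
			return -1;
		p = realloc(idx->lines, ncap * sizeof(*p));
		if (p == NULL)
			return -1;
		idx->lines = p;
		idx->cap = ncap;
	}
	idx->lines[idx->nlines].offset = offset;
	idx->lines[idx->nlines].type = type;
	idx->nlines++;
	return 0;
}

/* Index of the next line of the given type after cur, or -1 if none. */
long
line_index_next(const struct line_index *idx, size_t cur, enum line_type type)
{
	size_t i;

	for (i = cur + 1; i < idx->nlines; i++)
		if (idx->lines[i].type == type)
			return (long)i;
	return -1;
}

/* Index of the previous line of the given type before cur, or -1 if none. */
long
line_index_prev(const struct line_index *idx, size_t cur, enum line_type type)
{
	size_t i;

	if (cur > idx->nlines)
		cur = idx->nlines;
	for (i = cur; i > 0; i--)
		if (idx->lines[i - 1].type == type)
			return (long)(i - 1);
	return -1;
}

void
line_index_free(struct line_index *idx)
{
	free(idx->lines);
	idx->lines = NULL;
	idx->nlines = idx->cap = 0;
}
```

With something like this, a next/prev hunk key map would reduce to a
call to line_index_next() or line_index_prev() with LINE_HUNK from the
currently selected line, and the same records could drive colouring
instead of a regex.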

tog log (avg)
              cpu (ms)   wall (ms)
main           5269.42     6381.70
ezdiffnav      5588.65     6799.38
2bea860227    19337.42    20214.06

tog diff (avg)
              cpu (ms)   wall (ms)
main           5229.16     6332.12
ezdiffnav      5323.60     6506.54
2bea860227    11445.84    12231.52

-- 
Mark Jamsek <fnc.bsdbox.org>
GPG: F2FF 13DE 6A06 C471 CA80  E6E2 2930 DC66 86EE CF68