From: Stefan Sperling Subject: Re: refactor diffstat to compute diff once in got log -d and tog diff view To: Mark Jamsek Cc: Game of Trees Date: Mon, 16 Jan 2023 18:34:43 +0100 On Tue, Jan 17, 2023 at 01:13:20AM +1100, Mark Jamsek wrote: > The below diff is borne out of previous discussions with stsp about > eliminating the cost of computing the diff twice when displaying the > diffstat in 'got log -d' and tog diff views. > > It takes the same approach as the 'got diff -d' case where we use a temp > file for the diff so that we can compute the diffstat with the same > result from building the diff, and still draw the diffstat first once > the diff is complete. > > This has the additional benefit in tog (and got log -d vs got log -P -p) > of not running the get_changed_paths() -> got_diff_tree_collect_changed_paths() > code path to get the changeset before computing the diff (which > traversed much the same code twice). It also results in net less code :) > > From a performance perspective: at the extreme end, if you load tog with > b5500b9ca0102f1c in OpenBSD src.git with and without this patch you will > notice the difference. For the average diff it's not nearly as > noticeable but is an improvement nonetheless. That's great, thanks! I don't have time to review the diff in detail right now, but please proceed. > I've also taken the liberty of making 'got log -d' behave like 'got diff > -d' such that the actual diff is displayed too. stsp made the point when > first reviewing got diff -d code that the diffstat per se isn't all that > meaningful--it's most useful in the context of the diff. Regress is > passing, but assuming this is ok, I'll follow up with a change to the > test_log_diffstat regress to include testing for diff output. Otherwise > I can revert this part of the diff and stick to just displaying the > diffstat with 'got log -d', in which case the current regress provides > sufficient coverage. I would argue that a log message already provides good enough context for a diffstat. A commit log message which implies a large change but comes with a tiny-looking diffstat would stand out, and vice versa. The 'got diff -d' case was different since it lacked context entirely. And users are already able to quickly combine both options as "diff -dp" if desired. I see at least one possible use case for -d without -p style output: Consider post-commit email notifications, where a log message with a diffstat attached would provide a suitable amount of information, and won't blow up peoples' inboxes when someone imports a new version of a large vendor-imported subtree such as gcc or LLVM. Granted, this use case is a bit contrived right now (gotd has no hook scripts, and people running git-daemon would likely be using 'git log' to generate an email body). But it shows that 'log -d' by itself can be useful.