"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: draft: keeping authorship of diffs in commits
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Tue, 19 Jul 2022 15:31:44 +0200

Download raw body.

Thread
Omar Polo <op@omarpolo.com> wrote:
> Stefan Sperling <stsp@stsp.name> wrote:
> > On Sun, Feb 06, 2022 at 02:16:39PM +0100, Stefan Sperling wrote:
> > > On Sun, Feb 06, 2022 at 11:20:55AM +0100, Omar Polo wrote:
> > > > There's also another one issue I think: nomenclature.  got(1) mention
> > > > various time the "commit author", the env variable is called GOT_AUTHOR
> > > > and the got.conf setting is `author'.  Adding a "-A author" to got ci
> > > > which does a subtly different thing can easily confuse people.
> > > 
> > > Yes, in hindsight this is bad, because it makes it harder to make
> > > a clear distinction between committer and author. This happened
> > > because I did not consider this distinction important enough, as
> > > this is no problem with the way CVS is being used today. However,
> > > expectations may change with a new repository format that is able
> > > to store additional meta-data.
> > > 
> > > The good news is that we can change anything! At this stage, breaking
> > > things in incompatible ways is absolutely fine. Anybody who is using
> > > Got nowadays must be ready to adjust their setup to changes we make
> > > at any moment. Release numbers start with a zero for a reason and I
> > > do not expect this to change anytime soon.
> > > 
> > > Should we just rename all these things to "committer" or is there
> > > a better path forward?
> > 
> > Going back to this: I tought about this some more and I don't think
> > it is a big problem. Usually, the person who creates a new commit
> > is the author, so the names we have for configuration items are fine.
> > 
> > We can add specific options to override the "author" in commit objects,
> > and simply document that the default value for this comes from GOT_AUTHOR
> > or the config file, and document that this default value will always
> > be used for the "committer" field. This should be clear enough.
> 
> I thought a bit about it too and i've come to agree to this.  GOT_AUTHOR
> is more clear than GOT_COMMITTER IMHO, and the case of "author !=
> committer" can be considered an exception to the usual rule IMHO.
> 
> Below there's a rebased (and slightly improved) diff for got commit -A
> 
> > Your patch is a good start. It satisfies the use case of patch contributions.
> > Of course, since this requires action on part of the committer, there will
> > always be commit objects created from patches which lack attribution of the
> > author in the commit object's author field, which means this "meta-data" will
> > not be very reliable. Not much we can do about that.
> > I don't really like the idea of complicating 'got patch' to the point where
> > it tries to figure out author information from patch files. We would need to
> > encourage people to save patches as an mbox file instead of saving only the
> > body of an email, which is just as unlikely to work as consistently adding
> > an option like -A to 'got commit' command lines.
> 
> One of my initial goal for 'got patch' was also for it to be able to
> handle the common case of applying patches from email preserving the
> author.  However, it kept falling down as priority for myself, to the
> point of being completely dropped from my mental todo-list.  The more I
> use got, the more I dislike how git implicitly creates commit for
> various actions, and I'd like for 'git patch' to avoid doing so.  I
> prefer doing a 'got patch', maybe tweak something, and then commit it.
> (admittedly, I don't receive tons of patches to commit every day :-)
> 
> (maybe we could envision something where got patch put us in a situation
> similar to 'edit' in a histedit script with the changes applied to the
> worktree and a default commit message pre-filled on the next commit?
> doesn't seem easy to do in an obvious way for the user though.)
> 
> > But because this would interop with output of git-format-patch, perhaps some
> > people would use such a feature? If we added this to 'got patch' we should
> > probably also extend 'got diff' to produce compatible output, at least as
> > an option?
> 
> Maybe some sort of output format that's more compatible with git could
> be useful.  not sure if we really need/want it, but it happened to me in
> the past that I confused some upstream maintainers when sending patches
> generated with `got diff'.
> 
> (in at least one case part of the issue was also the default behaviour
> of GNU' patch i guess)
> 
> > Additionally, it would be nice if 'got rebase' and 'got histedit' would
> > preserve the original commit's author field as "author", while the
> > "committer" field would be reset to the value of GOT_AUTHOR or the
> > config file. Once we do this, I think the distinction will become more
> > or less self-evident for people who are familiar with these commands.
> 
> Don't remember if git does the same, but it seems intuitive and (for
> what it's worth) i agree with it at least for `got histedit', not so
> sure about rebase still.
> 
>  ---
> 
> Here's the old diff rebased and tweaked a bit.

rebased again after recent change to valid_author; now it's a bit
smaller.

(i'm still trying to come up with a description that i like for the
manpage)

diff /home/op/w/got
commit - 314c3f148b7679296136b460ea9a8f0d4c74d437
path + /home/op/w/got
blob - e40b2fea7364f818d1cc038e58b39c5e8b17968b
file + got/got.1
--- got/got.1
+++ got/got.1
@@ -1476,7 +1476,7 @@ will refuse to run if a specified
 is a directory.
 .El
 .Tg ci
-.It Cm commit Oo Fl F Ar path Oc Oo Fl m Ar message Oc Oo Fl N Oc Oo Fl S Oc Op Ar path ...
+.It Cm commit Oo Fl A Ar author Oc Oo Fl F Ar path Oc Oo Fl m Ar message Oc Oo Fl N Oc Oo Fl S Oc Op Ar path ...
 .Dl Pq alias: Cm ci
 Create a new commit in the repository from changes in a work tree
 and use this commit as the new base commit for the work tree.
@@ -1541,6 +1541,13 @@ The options for
 .Cm got commit
 are as follows:
 .Bl -tag -width Ds
+.It Fl A Ar author
+Specify the
+.Ar author
+of the commit with the same format of
+.Ev GOT_AUTHOR .
+The user doing the commit is then acknowledged as the
+.Dq committer .
 .It Fl F Ar path
 Use the prepared log message stored in the file found at
 .Ar path
blob - fc3f426e58ce1bbe4245627188786b1b9227e1c0
file + got/got.c
--- got/got.c
+++ got/got.c
@@ -8319,8 +8319,8 @@ done:
 __dead static void
 usage_commit(void)
 {
-	fprintf(stderr, "usage: %s commit [-F path] [-m msg] [-N] [-S] "
-	    "[path ...]\n", getprogname());
+	fprintf(stderr, "usage: %s commit [-A author] [-F path] [-m msg] "
+	    "[-N] [-S] [path ...]\n", getprogname());
 	exit(1);
 }
 
@@ -8480,7 +8480,8 @@ cmd_commit(int argc, char *argv[])
 	const char *logmsg = NULL;
 	char *prepared_logmsg = NULL;
 	struct collect_commit_logmsg_arg cl_arg;
-	char *gitconfig_path = NULL, *editor = NULL, *author = NULL;
+	const char *author = NULL;
+	char *gitconfig_path = NULL, *editor = NULL, *committer = NULL;
 	int ch, rebase_in_progress, histedit_in_progress, preserve_logmsg = 0;
 	int allow_bad_symlinks = 0, non_interactive = 0, merge_in_progress = 0;
 	struct got_pathlist_head paths;
@@ -8489,8 +8490,13 @@ cmd_commit(int argc, char *argv[])
 	TAILQ_INIT(&paths);
 	cl_arg.logmsg_path = NULL;
 
-	while ((ch = getopt(argc, argv, "F:m:NS")) != -1) {
+	while ((ch = getopt(argc, argv, "A:F:m:NS")) != -1) {
 		switch (ch) {
+		case 'A':
+			author = optarg;
+			if ((error = valid_author(author)) != NULL)
+				return error;
+			break;
 		case 'F':
 			if (logmsg != NULL)
 				option_conflict('F', 'm');
@@ -8570,10 +8576,13 @@ cmd_commit(int argc, char *argv[])
 		goto done;
 	}
 
-	error = get_author(&author, repo, worktree);
+	error = get_author(&committer, repo, worktree);
 	if (error)
 		return error;
 
+	if (author == NULL)
+		author = committer;
+
 	/*
 	 * unveil(2) traverses exec(2); if an editor is used we have
 	 * to apply unveil after the log message has been written.
@@ -8604,7 +8613,7 @@ cmd_commit(int argc, char *argv[])
 		cl_arg.branch_name += 11;
 	}
 	cl_arg.repo_path = got_repo_get_path(repo);
-	error = got_worktree_commit(&id, worktree, &paths, author, NULL,
+	error = got_worktree_commit(&id, worktree, &paths, author, committer,
 	    allow_bad_symlinks, collect_commit_logmsg, &cl_arg,
 	    print_status, NULL, repo);
 	if (error) {
@@ -8643,7 +8652,7 @@ done:
 	free(id_str);
 	free(gitconfig_path);
 	free(editor);
-	free(author);
+	free(committer);
 	free(prepared_logmsg);
 	return error;
 }
blob - 74b1300abdc8bad6089c22251a9efd9642490d49
file + regress/cmdline/commit.sh
--- regress/cmdline/commit.sh
+++ regress/cmdline/commit.sh
@@ -752,6 +752,46 @@ test_commit_tree_entry_sorting() {
 	test_done "$testroot" "$ret"
 }
 
+test_commit_cmdline_author() {
+	local testroot=`test_init commit_cmdline_author`
+
+	got checkout $testroot/repo $testroot/wt > /dev/null
+	ret=$?
+	if [ $ret -ne 0 ]; then
+		test_done "$testroot" $ret
+		return 1
+	fi
+
+	author="Foo <foo@example.com>"
+	committer="Flan Hacker <flan_hacker@openbsd.org>"
+
+	echo "modified alpha" > $testroot/wt/alpha
+	(cd $testroot/wt && got commit -A "$author" -m 'from foo') \
+		> /dev/null
+	ret=$?
+	if [ $ret -ne 0 ]; then
+		test_done "$testroot" $ret
+		return 1
+	fi
+
+	(cd $testroot/repo && got log -l1 | egrep '^(from|via):') \
+		> $testroot/stdout
+	ret=$?
+	if [ $ret -ne 0 ]; then
+		test_done "$testroot" $ret
+		return 1
+	fi
+
+	echo "from: $author" > $testroot/stdout.expected
+	echo "via: $committer" >> $testroot/stdout.expected
+	cmp -s $testroot/stdout.expected $testroot/stdout
+	ret=$?
+	if [ $ret -ne 0 ]; then
+		diff -u $testroot/stdout.expected $testroot/stdout
+	fi
+	test_done "$testroot" $ret
+}
+
 test_commit_gotconfig_author() {
 	local testroot=`test_init commit_gotconfig_author`
 
@@ -1575,6 +1615,7 @@ run_test test_commit_selected_paths
 run_test test_commit_outside_refs_heads
 run_test test_commit_no_email
 run_test test_commit_tree_entry_sorting
+run_test test_commit_cmdline_author
 run_test test_commit_gotconfig_author
 run_test test_commit_gotconfig_worktree_author
 run_test test_commit_gitconfig_author