"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: draft: keeping authorship of diffs in commits
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 06 Feb 2022 11:20:55 +0100

Download raw body.

Thread
Stefan Sperling <stsp@stsp.name> writes:

> On Thu, Feb 03, 2022 at 01:45:49PM +0100, Omar Polo wrote:
>> Hello,
>> 
>> Although it's not really used on OpenBSD (because one lists OKs and such
>> in the commit message), when working on personal projects I'd like to
>> keep the authorship of diffs I receive.
>> 
>> With git the usual workflow is to receive a git-formatted patch
>> (e.g. via git send-email) and apply it locally with git-am.  git-am is
>> able to extract the author of the patch from the email and use it in the
>> commit.  That's the "from" and "via" fields showed by tog.
>> 
>> Got only allows to set the author of the commit (the "from" field in
>> tog) but not the committer of the diff.  In the past I've set GOT_AUTHOR
>> before committing to the username and email of the author of the diff.
>> While viable, that doesn't set the committer ("via" field in tog).  OK,
>> it's a bit of an aesthetic reason, but I'd like to being able to keep
>> the authorship and still be listed as the committer of a diff.
>
> The "traditional" way of doing this is to attribute authors in
> the log message via "patch by" mentions and such. I have been
> doing it this way and simply ignored Git's dedicated meta-data
> field in commit objects. But, of course, it makes sense to allow
> people to use this feature.
>
> One problem is that this field is not free form. It requires an
> email address. I think Git might fail to parse the commit object
> otherwise. I would prefer a free-form field but we cannot use it
> as such due to compatibility concerns. I see you have addressed
> this problem in your patch, which is good.
>
> Another problem I see is that in some places people might expect to
> see a committer's name instead of the author's name. For example,
> if an OpenBSD developer opens tog on the OpenBSD repo they might
> not expect non-committer handles to be listed next to commits.
> But we can adjust 'tog log' to display the committer field instead
> of the author field.

I admit that I haven't thought in this detail the issue and agree on all
the issues.

There's also another one issue I think: nomenclature.  got(1) mention
various time the "commit author", the env variable is called GOT_AUTHOR
and the got.conf setting is `author'.  Adding a "-A author" to got ci
which does a subtly different thing can easily confuse people.

>> Diff below is a draft for this.
>> 
>> It lacks several things (manpage bits, testing, probably more) and it
>> also produces the wrong error message if -A has a wrong format, but it's
>> just to gauge the interest and see if it's worth going forward.  It adds
>> an -A flag for got commit which is then used to set the author of the
>> commit.  The committer is still chosen the same way as always (got.conf,
>> GOT_AUTHOR, ...)

(regarding the error message I noticed that it is a separate issue)

> Yes, I guess adding such an option makes sense.
> I don't have a better idea.
>
> Could you send a more complete diff?

Here's a more complete one which adds the manpage bits and a test case
too.  However, given your concerns I'm not so sure anymore!  I wanted
something like this to avoid having to switch back to `git am'
temporally when committing diffs and I thought it would be useful to
others too (AFAIK freebsd uses this approach at least for ports commits)
but the UI issues are real.

diff 4cd2c1745cda66c359c86bea412d77af0b643a7e /home/op/w/got
blob - eaafde8411ebca559ac74b4970f3a9ee755e26b9
file + got/got.1
--- got/got.1
+++ got/got.1
@@ -1337,7 +1337,7 @@ will refuse to run if a specified
 is a directory.
 .El
 .Tg ci
-.It Cm commit Oo Fl F Ar path Oc Oo Fl m Ar message Oc Oo Fl N Oc Oo Fl S Oc Op Ar path ...
+.It Cm commit Oo Fl A Ar author Oc Oo Fl F Ar path Oc Oo Fl m Ar message Oc Oo Fl N Oc Oo Fl S Oc Op Ar path ...
 .Dl Pq alias: Cm ci
 Create a new commit in the repository from changes in a work tree
 and use this commit as the new base commit for the work tree.
@@ -1402,6 +1402,16 @@ The options for
 .Cm got commit
 are as follows:
 .Bl -tag -width Ds
+.It Fl A Ar author
+Specify the
+.Ar author
+of the commit to the specified one.
+If set,
+.Nm
+will use the provided
+.Ar author
+for the commit and set the current user as the
+.Dq committer .
 .It Fl F Ar path
 Use the prepared log message stored in the file found at
 .Ar path
blob - 16b939dba60d2210e6c75b2ffbb51a4b1f8d608a
file + got/got.c
--- got/got.c
+++ got/got.c
@@ -597,6 +597,28 @@ import_progress(void *arg, const char *path)
 }
 
 static const struct got_error *
+validate_author(const char *author)
+{
+	/*
+	 * Really dumb email address check; we're only doing this to
+	 * avoid git's object parser breaking on commits we create.
+	 */
+	while (*author && *author != '<')
+		author++;
+	if (*author != '<')
+		return got_error(GOT_ERR_COMMIT_NO_EMAIL);
+	while (*author && *author != '@')
+		author++;
+	if (*author != '@')
+		return got_error(GOT_ERR_COMMIT_NO_EMAIL);
+	while (*author && *author != '>')
+		author++;
+	if (*author != '>')
+		return got_error(GOT_ERR_COMMIT_NO_EMAIL);
+	return NULL;
+}
+
+static const struct got_error *
 get_author(char **author, struct got_repository *repo,
     struct got_worktree *worktree)
 {
@@ -653,27 +675,7 @@ get_author(char **author, struct got_repository *repo,
 	if (*author == NULL)
 		return got_error_from_errno("strdup");
 
-	/*
-	 * Really dumb email address check; we're only doing this to
-	 * avoid git's object parser breaking on commits we create.
-	 */
-	while (*got_author && *got_author != '<')
-		got_author++;
-	if (*got_author != '<') {
-		err = got_error(GOT_ERR_COMMIT_NO_EMAIL);
-		goto done;
-	}
-	while (*got_author && *got_author != '@')
-		got_author++;
-	if (*got_author != '@') {
-		err = got_error(GOT_ERR_COMMIT_NO_EMAIL);
-		goto done;
-	}
-	while (*got_author && *got_author != '>')
-		got_author++;
-	if (*got_author != '>')
-		err = got_error(GOT_ERR_COMMIT_NO_EMAIL);
-done:
+	err = validate_author(got_author);
 	if (err) {
 		free(*author);
 		*author = NULL;
@@ -7368,8 +7370,8 @@ done:
 __dead static void
 usage_commit(void)
 {
-	fprintf(stderr, "usage: %s commit [-F path] [-m msg] [-N] [-S] "
-	    "[path ...]\n", getprogname());
+	fprintf(stderr, "usage: %s commit [-A author] [-F path] [-m msg] [-N] "
+	    "[-S] [path ...]\n", getprogname());
 	exit(1);
 }
 
@@ -7528,8 +7530,9 @@ cmd_commit(int argc, char *argv[])
 	struct got_object_id *id = NULL;
 	const char *logmsg = NULL;
 	char *prepared_logmsg = NULL;
+	const char *author = NULL;
 	struct collect_commit_logmsg_arg cl_arg;
-	char *gitconfig_path = NULL, *editor = NULL, *author = NULL;
+	char *gitconfig_path = NULL, *editor = NULL, *committer = NULL;
 	int ch, rebase_in_progress, histedit_in_progress, preserve_logmsg = 0;
 	int allow_bad_symlinks = 0, non_interactive = 0, merge_in_progress = 0;
 	struct got_pathlist_head paths;
@@ -7537,8 +7540,13 @@ cmd_commit(int argc, char *argv[])
 	TAILQ_INIT(&paths);
 	cl_arg.logmsg_path = NULL;
 
-	while ((ch = getopt(argc, argv, "F:m:NS")) != -1) {
+	while ((ch = getopt(argc, argv, "A:F:m:NS")) != -1) {
 		switch (ch) {
+		case 'A':
+			author = optarg;
+			if ((error = validate_author(author)) != NULL)
+				return error;
+			break;
 		case 'F':
 			if (logmsg != NULL)
 				option_conflict('F', 'm');
@@ -7613,9 +7621,11 @@ cmd_commit(int argc, char *argv[])
 		goto done;
 	}
 
-	error = get_author(&author, repo, worktree);
+	error = get_author(&committer, repo, worktree);
 	if (error)
 		return error;
+	if (author == NULL)
+		author = committer;
 
 	/*
 	 * unveil(2) traverses exec(2); if an editor is used we have
@@ -7647,7 +7657,7 @@ cmd_commit(int argc, char *argv[])
 		cl_arg.branch_name += 11;
 	}
 	cl_arg.repo_path = got_repo_get_path(repo);
-	error = got_worktree_commit(&id, worktree, &paths, author, NULL,
+	error = got_worktree_commit(&id, worktree, &paths, author, committer,
 	    allow_bad_symlinks, collect_commit_logmsg, &cl_arg,
 	    print_status, NULL, repo);
 	if (error) {
@@ -7680,7 +7690,7 @@ done:
 	free(id_str);
 	free(gitconfig_path);
 	free(editor);
-	free(author);
+	free(committer);
 	free(prepared_logmsg);
 	return error;
 }
blob - 34d74d26fefb2a4461a2c489cade5dcd7ebfcd66
file + regress/cmdline/commit.sh
--- regress/cmdline/commit.sh
+++ regress/cmdline/commit.sh
@@ -700,6 +700,46 @@ test_commit_tree_entry_sorting() {
 	test_done "$testroot" "$ret"
 }
 
+test_commit_cmdline_author() {
+	local testroot=`test_init commit_cmdline_author`
+
+	got checkout $testroot/repo $testroot/wt > /dev/null
+	ret=$?
+	if [ "$ret" != "0" ]; then
+		test_done "$testroot" "$ret"
+		return 1
+	fi
+
+	author="Foo <foo@example.com>"
+	committer="Flan Luck <flan_luck@openbsd.org>"
+
+	echo "modified alpha" > $testroot/wt/alpha
+	(cd $testroot/wt && \
+		 GOT_AUTHOR="$committer" got commit -A "$author" -m 'from foo' \
+			   >/dev/null)
+	ret=$?
+	if [ $ret != 0 ]; then
+		test_done "$testroot" "$ret"
+		return 1
+	fi
+
+	(cd $testroot/repo && got log -l1 | egrep '^(from|via):' > $testroot/stdout)
+	ret=$?
+	if [ $ret != 0 ]; then
+		test_done "$testroot" "$ret"
+		return 1
+	fi
+
+	echo "from: $author" > $testroot/stdout.expected
+	echo "via: $committer" >> $testroot/stdout.expected
+	cmp -s $testroot/stdout.expected $testroot/stdout
+	ret=$?
+	if [ $ret != 0 ]; then
+		diff -u $testroot/stdout.expected $testroot/stdout
+	fi
+	test_done "$testroot" "$ret"
+}
+
 test_commit_gotconfig_author() {
 	local testroot=`test_init commit_gotconfig_author`
 
@@ -1523,6 +1563,7 @@ run_test test_commit_selected_paths
 run_test test_commit_outside_refs_heads
 run_test test_commit_no_email
 run_test test_commit_tree_entry_sorting
+run_test test_commit_cmdline_author
 run_test test_commit_gotconfig_author
 run_test test_commit_gotconfig_worktree_author
 run_test test_commit_gitconfig_author