Download raw body.
draft: keeping authorship of diffs in commits
Omar Polo <op@omarpolo.com> wrote: > Stefan Sperling <stsp@stsp.name> wrote: > > On Sun, Feb 06, 2022 at 02:16:39PM +0100, Stefan Sperling wrote: > > > On Sun, Feb 06, 2022 at 11:20:55AM +0100, Omar Polo wrote: > > > > There's also another one issue I think: nomenclature. got(1) mention > > > > various time the "commit author", the env variable is called GOT_AUTHOR > > > > and the got.conf setting is `author'. Adding a "-A author" to got ci > > > > which does a subtly different thing can easily confuse people. > > > > > > Yes, in hindsight this is bad, because it makes it harder to make > > > a clear distinction between committer and author. This happened > > > because I did not consider this distinction important enough, as > > > this is no problem with the way CVS is being used today. However, > > > expectations may change with a new repository format that is able > > > to store additional meta-data. > > > > > > The good news is that we can change anything! At this stage, breaking > > > things in incompatible ways is absolutely fine. Anybody who is using > > > Got nowadays must be ready to adjust their setup to changes we make > > > at any moment. Release numbers start with a zero for a reason and I > > > do not expect this to change anytime soon. > > > > > > Should we just rename all these things to "committer" or is there > > > a better path forward? > > > > Going back to this: I tought about this some more and I don't think > > it is a big problem. Usually, the person who creates a new commit > > is the author, so the names we have for configuration items are fine. > > > > We can add specific options to override the "author" in commit objects, > > and simply document that the default value for this comes from GOT_AUTHOR > > or the config file, and document that this default value will always > > be used for the "committer" field. This should be clear enough. > > I thought a bit about it too and i've come to agree to this. GOT_AUTHOR > is more clear than GOT_COMMITTER IMHO, and the case of "author != > committer" can be considered an exception to the usual rule IMHO. > > Below there's a rebased (and slightly improved) diff for got commit -A > > > Your patch is a good start. It satisfies the use case of patch contributions. > > Of course, since this requires action on part of the committer, there will > > always be commit objects created from patches which lack attribution of the > > author in the commit object's author field, which means this "meta-data" will > > not be very reliable. Not much we can do about that. > > I don't really like the idea of complicating 'got patch' to the point where > > it tries to figure out author information from patch files. We would need to > > encourage people to save patches as an mbox file instead of saving only the > > body of an email, which is just as unlikely to work as consistently adding > > an option like -A to 'got commit' command lines. > > One of my initial goal for 'got patch' was also for it to be able to > handle the common case of applying patches from email preserving the > author. However, it kept falling down as priority for myself, to the > point of being completely dropped from my mental todo-list. The more I > use got, the more I dislike how git implicitly creates commit for > various actions, and I'd like for 'git patch' to avoid doing so. I > prefer doing a 'got patch', maybe tweak something, and then commit it. > (admittedly, I don't receive tons of patches to commit every day :-) > > (maybe we could envision something where got patch put us in a situation > similar to 'edit' in a histedit script with the changes applied to the > worktree and a default commit message pre-filled on the next commit? > doesn't seem easy to do in an obvious way for the user though.) > > > But because this would interop with output of git-format-patch, perhaps some > > people would use such a feature? If we added this to 'got patch' we should > > probably also extend 'got diff' to produce compatible output, at least as > > an option? > > Maybe some sort of output format that's more compatible with git could > be useful. not sure if we really need/want it, but it happened to me in > the past that I confused some upstream maintainers when sending patches > generated with `got diff'. > > (in at least one case part of the issue was also the default behaviour > of GNU' patch i guess) > > > Additionally, it would be nice if 'got rebase' and 'got histedit' would > > preserve the original commit's author field as "author", while the > > "committer" field would be reset to the value of GOT_AUTHOR or the > > config file. Once we do this, I think the distinction will become more > > or less self-evident for people who are familiar with these commands. > > Don't remember if git does the same, but it seems intuitive and (for > what it's worth) i agree with it at least for `got histedit', not so > sure about rebase still. > > --- > > Here's the old diff rebased and tweaked a bit. rebased again after recent change to valid_author; now it's a bit smaller. (i'm still trying to come up with a description that i like for the manpage) diff /home/op/w/got commit - 314c3f148b7679296136b460ea9a8f0d4c74d437 path + /home/op/w/got blob - e40b2fea7364f818d1cc038e58b39c5e8b17968b file + got/got.1 --- got/got.1 +++ got/got.1 @@ -1476,7 +1476,7 @@ will refuse to run if a specified is a directory. .El .Tg ci -.It Cm commit Oo Fl F Ar path Oc Oo Fl m Ar message Oc Oo Fl N Oc Oo Fl S Oc Op Ar path ... +.It Cm commit Oo Fl A Ar author Oc Oo Fl F Ar path Oc Oo Fl m Ar message Oc Oo Fl N Oc Oo Fl S Oc Op Ar path ... .Dl Pq alias: Cm ci Create a new commit in the repository from changes in a work tree and use this commit as the new base commit for the work tree. @@ -1541,6 +1541,13 @@ The options for .Cm got commit are as follows: .Bl -tag -width Ds +.It Fl A Ar author +Specify the +.Ar author +of the commit with the same format of +.Ev GOT_AUTHOR . +The user doing the commit is then acknowledged as the +.Dq committer . .It Fl F Ar path Use the prepared log message stored in the file found at .Ar path blob - fc3f426e58ce1bbe4245627188786b1b9227e1c0 file + got/got.c --- got/got.c +++ got/got.c @@ -8319,8 +8319,8 @@ done: __dead static void usage_commit(void) { - fprintf(stderr, "usage: %s commit [-F path] [-m msg] [-N] [-S] " - "[path ...]\n", getprogname()); + fprintf(stderr, "usage: %s commit [-A author] [-F path] [-m msg] " + "[-N] [-S] [path ...]\n", getprogname()); exit(1); } @@ -8480,7 +8480,8 @@ cmd_commit(int argc, char *argv[]) const char *logmsg = NULL; char *prepared_logmsg = NULL; struct collect_commit_logmsg_arg cl_arg; - char *gitconfig_path = NULL, *editor = NULL, *author = NULL; + const char *author = NULL; + char *gitconfig_path = NULL, *editor = NULL, *committer = NULL; int ch, rebase_in_progress, histedit_in_progress, preserve_logmsg = 0; int allow_bad_symlinks = 0, non_interactive = 0, merge_in_progress = 0; struct got_pathlist_head paths; @@ -8489,8 +8490,13 @@ cmd_commit(int argc, char *argv[]) TAILQ_INIT(&paths); cl_arg.logmsg_path = NULL; - while ((ch = getopt(argc, argv, "F:m:NS")) != -1) { + while ((ch = getopt(argc, argv, "A:F:m:NS")) != -1) { switch (ch) { + case 'A': + author = optarg; + if ((error = valid_author(author)) != NULL) + return error; + break; case 'F': if (logmsg != NULL) option_conflict('F', 'm'); @@ -8570,10 +8576,13 @@ cmd_commit(int argc, char *argv[]) goto done; } - error = get_author(&author, repo, worktree); + error = get_author(&committer, repo, worktree); if (error) return error; + if (author == NULL) + author = committer; + /* * unveil(2) traverses exec(2); if an editor is used we have * to apply unveil after the log message has been written. @@ -8604,7 +8613,7 @@ cmd_commit(int argc, char *argv[]) cl_arg.branch_name += 11; } cl_arg.repo_path = got_repo_get_path(repo); - error = got_worktree_commit(&id, worktree, &paths, author, NULL, + error = got_worktree_commit(&id, worktree, &paths, author, committer, allow_bad_symlinks, collect_commit_logmsg, &cl_arg, print_status, NULL, repo); if (error) { @@ -8643,7 +8652,7 @@ done: free(id_str); free(gitconfig_path); free(editor); - free(author); + free(committer); free(prepared_logmsg); return error; } blob - 74b1300abdc8bad6089c22251a9efd9642490d49 file + regress/cmdline/commit.sh --- regress/cmdline/commit.sh +++ regress/cmdline/commit.sh @@ -752,6 +752,46 @@ test_commit_tree_entry_sorting() { test_done "$testroot" "$ret" } +test_commit_cmdline_author() { + local testroot=`test_init commit_cmdline_author` + + got checkout $testroot/repo $testroot/wt > /dev/null + ret=$? + if [ $ret -ne 0 ]; then + test_done "$testroot" $ret + return 1 + fi + + author="Foo <foo@example.com>" + committer="Flan Hacker <flan_hacker@openbsd.org>" + + echo "modified alpha" > $testroot/wt/alpha + (cd $testroot/wt && got commit -A "$author" -m 'from foo') \ + > /dev/null + ret=$? + if [ $ret -ne 0 ]; then + test_done "$testroot" $ret + return 1 + fi + + (cd $testroot/repo && got log -l1 | egrep '^(from|via):') \ + > $testroot/stdout + ret=$? + if [ $ret -ne 0 ]; then + test_done "$testroot" $ret + return 1 + fi + + echo "from: $author" > $testroot/stdout.expected + echo "via: $committer" >> $testroot/stdout.expected + cmp -s $testroot/stdout.expected $testroot/stdout + ret=$? + if [ $ret -ne 0 ]; then + diff -u $testroot/stdout.expected $testroot/stdout + fi + test_done "$testroot" $ret +} + test_commit_gotconfig_author() { local testroot=`test_init commit_gotconfig_author` @@ -1575,6 +1615,7 @@ run_test test_commit_selected_paths run_test test_commit_outside_refs_heads run_test test_commit_no_email run_test test_commit_tree_entry_sorting +run_test test_commit_cmdline_author run_test test_commit_gotconfig_author run_test test_commit_gotconfig_worktree_author run_test test_commit_gitconfig_author
draft: keeping authorship of diffs in commits