From: Sebastien Marie
Subject: patch: generic commit header management in 'got log'
To: gameoftrees@openbsd.org
Date: Sat, 28 Sep 2019 10:43:41 +0200

Hi,

Some time ago, I discussed with stsp@ some odd log output on tryton
repositories.

To quote myself:

> I just cloned and checkout a mercurial mirror (on github), and I have
> odd log output.
>
> $ git clone --bare https://github.com/tryton/sao
> # sao is a relatively small mirror for testing (8.3Mo)
> $ got checkout sao.git
> $ cd sao
> $ tog
> commit 95a150b9656dac2f13035784880138f52f6cf7a5 [1/27] develop
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/07 ced HG:rename-source hg
> 19/09/03 sergi HG:rename-source hg
> 19/09/02 ced HG:rename-source hg
> 19/09/02 ced HG:rename-source hg
> 19/09/02 ced HG:rename-source hg
> 19/09/02 ced HG:rename-source hg
> 19/09/02 ced HG:rename-source hg
> 19/09/02 ced HG:rename-source hg
> 19/08/30 ced HG:rename-source hg
> 19/08/25 ced HG:rename-source hg
> 19/08/25 ced HG:rename-source hg
> 19/08/22 ced HG:rename-source hg
> 19/08/18 ced HG:rename-source hg
> 19/08/18 ced HG:rename-source hg
> 19/08/18 ced HG:rename-source hg
> 19/08/12 ced HG:rename-source hg
> $ got log -l1
> -----------------------------------------------
> commit 95a150b9656dac2f13035784880138f52f6cf7a5 (develop)
> from: Cédric Krier
> date: Sat Sep 7 21:56:25 2019 UTC
> HG:rename-source hg
>
> Add drag and drop and manage sequence
>
> issue8240
> review265181002
>
>
> I dunno how the mirror is done, but it wonder if the problem is on got
> or on the mirror mercurial->git tool.
>
> But "git log" is fine.

The problem is that got assumes the logmsg is one piece, whereas for git
it is composed of "headers" + "message" (a bit like a mail).

Here for tryton, the GitHub repository is a mirror of a repository in
Mercurial format. The hg->git tool adds a custom header.

In git, the related code is in pretty.c, function pp_header() (called
from the pretty_print_commit() function):
https://github.com/git/git/blob/master/pretty.c#L1656

The separator between headers and logmsg is an empty line. Note that a
header can span multiple lines: continuation lines start with " " (one
blank char), as for the gpgsig header (the signature is multiline).

By default, git log uses the 'medium' format: only the header lines for
Author and Date are shown. git also has a 'full' format where Committer
lines are shown too. And finally a 'raw' format where all headers are
shown.

Now, for got(1), I think that by default "got log" should show only
specific headers (date, author, committer...) and discard any other
headers. Only whitelisted headers would be printed. It would be a
generic way to deal with the gpgsig header too.

With "got cat" we have the 'raw' format output, and all headers are
shown.

The following diff implements that. Every line before the first empty
line is a header, and only whitelisted headers are shown by default.
All following lines are shown (they are the commit message).

This way, the explicit support for removing gpgsig from the log could
be dropped.

Comments or OK?
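
To make the rule above concrete outside of got, here is a minimal
standalone sketch. It is not the diff that follows and it does not use
got's API: filter_headers(), header_is_shown() and the shown_headers
list are hypothetical names chosen for illustration, and the whitelist
content is only an assumption. It keeps whitelisted header lines, drops
the others together with their continuation lines (those starting with
one blank), and copies everything from the first empty line onwards.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical whitelist, for illustration only. */
static const char *const shown_headers[] = { "author ", "committer ", NULL };

/* Return 1 if the line starts with one of the whitelisted prefixes. */
static int
header_is_shown(const char *line, size_t len)
{
	size_t i, n;

	for (i = 0; shown_headers[i] != NULL; i++) {
		n = strlen(shown_headers[i]);
		if (len >= n && strncmp(line, shown_headers[i], n) == 0)
			return 1;
	}
	return 0;
}

/*
 * Return a malloc'd copy of raw without the non-whitelisted headers.
 * The empty separator line and the message are copied unchanged.
 * Returns NULL if the allocation fails; the caller frees the result.
 */
static char *
filter_headers(const char *raw)
{
	const char *p = raw, *nl;
	char *out;
	size_t len, outlen = 0;
	int in_headers = 1, keep = 0;

	/* The result is never longer than the input. */
	if ((out = malloc(strlen(raw) + 1)) == NULL)
		return NULL;

	while (*p != '\0') {
		/* len covers the line including its newline, if any. */
		nl = strchr(p, '\n');
		len = nl != NULL ? (size_t)(nl - p) + 1 : strlen(p);

		if (!in_headers) {
			keep = 1;		/* commit message */
		} else if (p[0] == '\n') {
			in_headers = 0;		/* first empty line */
			keep = 1;
		} else if (p[0] != ' ') {
			keep = header_is_shown(p, len);
		}
		/* else: continuation line, same fate as its header */

		if (keep) {
			memcpy(out + outlen, p, len);
			outlen += len;
		}
		p += len;
	}
	out[outlen] = '\0';
	return out;
}
```

With such a filter a multiline gpgsig header needs no special case: its
first line is not whitelisted, and the continuation lines simply follow
the decision taken for it.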

-- 
Sebastien Marie

diff 500467ff1bf0dbd15c0941dd741e80c35c708818 /home/semarie/repos/openbsd/got
blob - b6cd712c154536cdfbd3c4dec1d9f014e4630378
file + lib/object_parse.c
--- lib/object_parse.c
+++ lib/object_parse.c
@@ -419,16 +419,13 @@ got_object_commit_get_committer_gmtoff(struct got_comm
 	return commit->committer_gmtoff;
 }
 
-#define GOT_GPG_BEGIN_STR "gpgsig -----BEGIN PGP SIGNATURE-----"
-#define GOT_GPG_END_STR " -----END PGP SIGNATURE-----"
-
 const struct got_error *
 got_object_commit_get_logmsg(char **logmsg, struct got_commit_object *commit)
 {
 	const struct got_error *err = NULL;
-	int gpgsig = 0;
 	char *msg0, *msg, *line, *s;
 	size_t len;
+	int headers = 1;
 
 	*logmsg = NULL;
 
@@ -436,32 +433,36 @@ got_object_commit_get_logmsg(char **logmsg, struct got
 	if (msg0 == NULL)
 		return got_error_from_errno("strdup");
 
-	/* Copy log message line by line to strip out GPG sigs... */
+	/* Copy log message line by line to strip out unusual headers... */
 	msg = msg0;
 	do {
-		line = strsep(&msg, "\n");
+		if ((line = strsep(&msg, "\n")) == NULL)
+			break;
 
-		if (line) {
-			/* Skip over GPG signatures. */
-			if (gpgsig) {
-				if (strcmp(line, GOT_GPG_END_STR) == 0) {
-					gpgsig = 0;
-					/* Skip empty line after sig. */
-					line = strsep(&msg, "\n");
-				}
+		if (headers == 1) {
+			if (line[0] != '\0' &&
+			    strncmp(line, "tree ", 5) != 0 &&
+			    strncmp(line, "author ", 7) != 0 &&
+			    strncmp(line, "parent ", 7) != 0 &&
+			    strncmp(line, "committer ", 10) != 0 &&
+			    strncmp(line, "numparents ", 11) != 0 &&
+			    strncmp(line, "messagelen ", 11) != 0) {
+				continue;
-			} else if (strcmp(line, GOT_GPG_BEGIN_STR) == 0) {
-				gpgsig = 1;
-				continue;
 			}
-			if (asprintf(&s, "%s%s\n",
-			    *logmsg ? *logmsg : "", line) == -1) {
-				err = got_error_from_errno("asprintf");
-				goto done;
-			}
-			free(*logmsg);
-			*logmsg = s;
+
+			if (line[0] == '\0')
+				headers = 0;
 		}
+
+		if (asprintf(&s, "%s%s\n",
+		    *logmsg ? *logmsg : "", line) == -1) {
+			err = got_error_from_errno("asprintf");
+			goto done;
+		}
+		free(*logmsg);
+		*logmsg = s;
+
 	} while (line);
 
 	/* Trim redundant trailing whitespace. */