"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: improve diff output for consumption by got patch
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Wed, 22 Jun 2022 17:55:20 +0200

Download raw body.

Thread
Stefan Sperling <stsp@stsp.name> wrote:
> I would like to add two additional lines to our diff headers at the
> beginning of patch, to show the type and IDs of the objects being
> compared. The first line of our diff header does not provide this
> information consistently. It sometimes shows reference names instead
> of object IDs, for example. And it is sometimes unclear what type
> of object an object Id shown on the initial 'diff ...' line refers to.
> 
> The idea is that got patch can simply look for a line such as:
> 
>   commit - abcde1234567...
> 
> to find the merge base commit ID. Of course, the blob ID from a
> blob - header line is enough to run the actual 3-way merge of a file.
> But we use the commit ID in conflict markers if it is available,
> which should now always be the case for diffs created by Got.
> (Granted, this commit ID is not guaranteed to actually exist in
> the local repository. But, in general, people should be sending
> out diffs against already published commits.)
> 
> When diffing two tree objects directly, the diff header now shows this:
> 
>   diff abcdef1234567... 1234567abcdef...
>   tree - abcdef1234567...
>   tree + 1234567abcdef...
> 
> When diffing a work tree, we now see these lines in the diff header:
> 
>   diff /path/to/work/tree
>   commit - abcdef1234567...
>   path + /path/to/work/tree
> 
> And a header for staged changes looks like this:
> 
>   diff -s /path/to/work/tree
>   commit - abcdef1234567...
>   path + /path/to/work/tree (staged changes)

I like this changes.  Now the first line, the "diff" one, mirrors what
the user actually typed (ok, with the addition of the path to the work
tree) and the metadata makes it more clear what type of object those IDs
refers to.

It also makes the first part of the diff slightly more chatty but
possibly more useful?  Can't say, but my first reaction is that I like
it.  (I'm biased since it makes my work in 'got patch' a bit easier ;-)

> Did anyone realize that got log -p actually displayed tree or blob IDs
> rather than commit IDs in its diff header?

No, never noticed!  I thought it always used commit IDs.

> This has been fixed here,
> such that log -p now shows commit IDs in a manner which is consistent
> with other diff output modes.

This is definitely a great improvement IMHO

> 'got patch' no longer needs to parse object IDs from initial 'diff ... ...'
> line after this change. But, for now, I am keeping support for it in case
> someone still has an old-style diff they want to apply. The parser should
> probably use the first diff line as an indicator that a got-style patch
> might follow, but that is left for later.
> 
> ok?

would be nice to have another point of view, but for me it's OK.

one very very small nitpick:

> +static const struct got_error *
> +show_object_id(off_t **line_offsets, size_t *nlines, const char *obj_typestr,
> +    int c, const char *id_str, FILE *outfile)

I'd used char for `c' to make it more explicit that we're expecting a
char ('+' or '-' to be precise) instead of an arbitrary integer.

> +{
> +	const struct got_error *err;
> +	int n;
> +	off_t outoff = 0;
> +
> +	n = fprintf(outfile, "%s %c %s\n", obj_typestr, c, id_str);
> +	if (line_offsets != NULL && *line_offsets != NULL) {
> +		if (*nlines == 0) {
> +			err = add_line_offset(line_offsets, nlines, 0);
> +			if (err)
> +				return err;
> +		} else
> +			outoff = (*line_offsets)[*nlines - 1];
> +
> +		outoff += n;
> +		err = add_line_offset(line_offsets, nlines, outoff);
> +		if (err)
> +			return err;
> +	}
> +
> +	return NULL;
> +}


Thanks,

Omar Polo