"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
fix 'got blame' segfault
To:
gameoftrees@openbsd.org
Date:
Thu, 17 Dec 2020 21:35:46 +0100

Download raw body.

Thread
On IRC jrick reported that this command triggers a segfault:

  $ got blame distrib/miniroot/install.sub

  assertion "blame->f2" failed: file "/home/jrick/src/got/tog/../lib/blame.c", line 480, function "close_file2_and_reuse_file1"

In fact, both file pointers in the 'blame' struct are NULL.

The problem is that the commit graph traverses one commit too far.
This can be more easily reproduced with 'got log':

 got log -c 05f568 -P distrib/miniroot/install.sub

This lists two commits:
  05f568ecc6aadefa1aff9064a29e798874a71409
  7c0d87f00e480cdf004324dad6f3e6f4418f8f42 

commit 05f568ecc6aade... added the file distrib/miniroot/install.sub.
The above 'got log' command returns not just this commit as expected,
it additionally returns commit 7c0d87f00e480cd...

The problem is a bug in the logic of got_object_tree_path_changed(),
which causes the commit graph to add 7c0d87f00e480...  to the list of
commits which must still be traversed, even though the history of the
logged path has already ended.

Note that "distrib/miniroot" exists in 7c0d87f00e480cdf004324... but not
in the parent of this commit.

At the point where are traversing "distrib/miniroot" the tree entry te1
exists and corresponds to this tree listing the contents of 'miniroot'
in 7c0d87f00e480cdf004324...:

$ got cat 180aa33df8d1
32da652eda37ea59d2f1a37b2a289a95aff7aa2a 0100644 Makefile
40d8ab1f48c46b8ff676812ed68cce2de57f5aa4 0100644 list
8fce89bffd7b91b1d4e679f05f707c7981744560 0100644 list2sh.awk
a0af0e56d446e56a5253d7ae25721ba405ac1898 0100644 makeconf.awk
27bcebb1e71174a0bbcc6d24495ce0894c02e54c 0100644 mtree.conf
1ea9ff121981ba5cf3042227aea3c455924e779d 0100644 runlist.sh
$

And now try to open te2, which fails because "distrib/miniroot" does not
exist in the parent commit of 7c0d87f00e480cdf004324...:

		te2 = find_entry_by_name(tree2, seg, seglen);
		if (te2 == NULL) {
			*changed = 1;
			goto done;
		}

Because te2 cannot be opened, the code assumes success.
We now return with err == NULL and *changed = 1; the commit graph interprets
this as 'install.sub' having changed. What's obviously missing here is an
existence check for 'install.sub' in te1! (Alternatively, we could allow
the caller to differentiate between the cases where the file was 'added'
and where the file was 'changed', so they can stop asking for more commits
if the file was added. But that would require changing all callers...)

The patch below ensures that we traverse the full path in tree1 even if an
intermediate tree2 cannot be opened. Now we eventually hit this attempt to
open the file 'install.sub' in tree object 180aa33df8d1 (tree1):

		te1 = find_entry_by_name(tree1, seg, seglen);
		if (te1 == NULL) {
			err = got_error(GOT_ERR_NO_OBJ);
			goto done;
		}

The result is now ERR_NO_OBJ and *changed = 0. The commit graph will not
add this commit to its iteration list. The command:

 $ got log -c 05f568 -P distrib/miniroot/install.sub

will only list commit 05f568ecc6aade... which added the file.

ok?

diff e8f0f2f60d1eb5651a01b13607ca04014f092753 e8e36d3a9d44ec3db6e43abebe3db47e062f49a9
blob - 7c0aac05143e35448209231ec4347e3ad6e2252b
blob + 7bd55922b5b83ac06fbadac4c69364802b1a6179
--- lib/object.c
+++ lib/object.c
@@ -1771,22 +1771,21 @@ got_object_tree_path_changed(int *changed,
 			goto done;
 		}
 
-		te2 = find_entry_by_name(tree2, seg, seglen);
-		if (te2 == NULL) {
-			*changed = 1;
-			goto done;
-		}
+		if (tree2)
+			te2 = find_entry_by_name(tree2, seg, seglen);
 
-		mode1 = normalize_mode_for_comparison(te1->mode);
-		mode2 = normalize_mode_for_comparison(te2->mode);
-		if (mode1 != mode2) {
-			*changed = 1;
-			goto done;
-		}
+		if (te2) {
+			mode1 = normalize_mode_for_comparison(te1->mode);
+			mode2 = normalize_mode_for_comparison(te2->mode);
+			if (mode1 != mode2) {
+				*changed = 1;
+				goto done;
+			}
 
-		if (got_object_id_cmp(&te1->id, &te2->id) == 0) {
-			*changed = 0;
-			goto done;
+			if (got_object_id_cmp(&te1->id, &te2->id) == 0) {
+				*changed = 0;
+				goto done;
+			}
 		}
 
 		if (*s == '\0') { /* final path element */
@@ -1807,14 +1806,20 @@ got_object_tree_path_changed(int *changed,
 				got_object_tree_close(tree1);
 			tree1 = next_tree1;
 
-			err = got_object_open_as_tree(&next_tree2, repo,
-			    &te2->id);
-			te2 = NULL;
-			if (err)
-				goto done;
-			if (tree2 != tree02)
-				got_object_tree_close(tree2);
-			tree2 = next_tree2;
+			if (te2) {
+				err = got_object_open_as_tree(&next_tree2, repo,
+				    &te2->id);
+				te2 = NULL;
+				if (err)
+					goto done;
+				if (tree2 != tree02)
+					got_object_tree_close(tree2);
+				tree2 = next_tree2;
+			} else if (tree2) {
+				if (tree2 != tree02)
+					got_object_tree_close(tree2);
+				tree2 = NULL;
+			}
 		}
 	}
 done: