From: Stefan Sperling Subject: fix 'got blame' segfault To: gameoftrees@openbsd.org Date: Thu, 17 Dec 2020 21:35:46 +0100 On IRC jrick reported that this command triggers a segfault: $ got blame distrib/miniroot/install.sub assertion "blame->f2" failed: file "/home/jrick/src/got/tog/../lib/blame.c", line 480, function "close_file2_and_reuse_file1" In fact, both file pointers in the 'blame' struct are NULL. The problem is that the commit graph traverses one commit too far. This can be more easily reproduced with 'got log': got log -c 05f568 -P distrib/miniroot/install.sub This lists two commits: 05f568ecc6aadefa1aff9064a29e798874a71409 7c0d87f00e480cdf004324dad6f3e6f4418f8f42 commit 05f568ecc6aade... added the file distrib/miniroot/install.sub. The above 'got log' command returns not just this commit as expected, it additionally returns commit 7c0d87f00e480cd... The problem is a bug in the logic of got_object_tree_path_changed(), which causes the commit graph to add 7c0d87f00e480... to the list of commits which must still be traversed, even though the history of the logged path has already ended. Note that "distrib/miniroot" exists in 7c0d87f00e480cdf004324... but not in the parent of this commit. At the point where are traversing "distrib/miniroot" the tree entry te1 exists and corresponds to this tree listing the contents of 'miniroot' in 7c0d87f00e480cdf004324...: $ got cat 180aa33df8d1 32da652eda37ea59d2f1a37b2a289a95aff7aa2a 0100644 Makefile 40d8ab1f48c46b8ff676812ed68cce2de57f5aa4 0100644 list 8fce89bffd7b91b1d4e679f05f707c7981744560 0100644 list2sh.awk a0af0e56d446e56a5253d7ae25721ba405ac1898 0100644 makeconf.awk 27bcebb1e71174a0bbcc6d24495ce0894c02e54c 0100644 mtree.conf 1ea9ff121981ba5cf3042227aea3c455924e779d 0100644 runlist.sh $ And now try to open te2, which fails because "distrib/miniroot" does not exist in the parent commit of 7c0d87f00e480cdf004324...: te2 = find_entry_by_name(tree2, seg, seglen); if (te2 == NULL) { *changed = 1; goto done; } Because te2 cannot be opened, the code assumes success. We now return with err == NULL and *changed = 1; the commit graph interprets this as 'install.sub' having changed. What's obviously missing here is an existence check for 'install.sub' in te1! (Alternatively, we could allow the caller to differentiate between the cases where the file was 'added' and where the file was 'changed', so they can stop asking for more commits if the file was added. But that would require changing all callers...) The patch below ensures that we traverse the full path in tree1 even if an intermediate tree2 cannot be opened. Now we eventually hit this attempt to open the file 'install.sub' in tree object 180aa33df8d1 (tree1): te1 = find_entry_by_name(tree1, seg, seglen); if (te1 == NULL) { err = got_error(GOT_ERR_NO_OBJ); goto done; } The result is now ERR_NO_OBJ and *changed = 0. The commit graph will not add this commit to its iteration list. The command: $ got log -c 05f568 -P distrib/miniroot/install.sub will only list commit 05f568ecc6aade... which added the file. ok? diff e8f0f2f60d1eb5651a01b13607ca04014f092753 e8e36d3a9d44ec3db6e43abebe3db47e062f49a9 blob - 7c0aac05143e35448209231ec4347e3ad6e2252b blob + 7bd55922b5b83ac06fbadac4c69364802b1a6179 --- lib/object.c +++ lib/object.c @@ -1771,22 +1771,21 @@ got_object_tree_path_changed(int *changed, goto done; } - te2 = find_entry_by_name(tree2, seg, seglen); - if (te2 == NULL) { - *changed = 1; - goto done; - } + if (tree2) + te2 = find_entry_by_name(tree2, seg, seglen); - mode1 = normalize_mode_for_comparison(te1->mode); - mode2 = normalize_mode_for_comparison(te2->mode); - if (mode1 != mode2) { - *changed = 1; - goto done; - } + if (te2) { + mode1 = normalize_mode_for_comparison(te1->mode); + mode2 = normalize_mode_for_comparison(te2->mode); + if (mode1 != mode2) { + *changed = 1; + goto done; + } - if (got_object_id_cmp(&te1->id, &te2->id) == 0) { - *changed = 0; - goto done; + if (got_object_id_cmp(&te1->id, &te2->id) == 0) { + *changed = 0; + goto done; + } } if (*s == '\0') { /* final path element */ @@ -1807,14 +1806,20 @@ got_object_tree_path_changed(int *changed, got_object_tree_close(tree1); tree1 = next_tree1; - err = got_object_open_as_tree(&next_tree2, repo, - &te2->id); - te2 = NULL; - if (err) - goto done; - if (tree2 != tree02) - got_object_tree_close(tree2); - tree2 = next_tree2; + if (te2) { + err = got_object_open_as_tree(&next_tree2, repo, + &te2->id); + te2 = NULL; + if (err) + goto done; + if (tree2 != tree02) + got_object_tree_close(tree2); + tree2 = next_tree2; + } else if (tree2) { + if (tree2 != tree02) + got_object_tree_close(tree2); + tree2 = NULL; + } } } done: