"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
cleanup: fix redundant pack file removal when repository contains symlinks
To:
gameoftrees@openbsd.org
Date:
Sat, 2 May 2026 16:38:02 +0200

Download raw body.

Thread
Add symlinks to the set of traversed object IDs during cleanup. Otherwise,
packfiles which contain blobs that are symlinks will always be marked as
containing unreferenced objects and 'gotadmin cleanup' will not remove them.

This can result in an explosion of redundant packfiles if 'gotadmin cleanup'
is run at regular intervals, and if the cleanup -a option is not used.

I am unsure how to write a test case for this fix without making additional
changes to the cleanup command. The tests run cleanup -a to bypass the mtime
staleness safety check. This will never hit the problematic case fixed here.
Reproducing the problem would require a test which waits for the (currently)
hard-coded interval of 600 seconds before running cleanup.
If we added a way to tweak the mtime threshold from tests then we could
probably add a test case which covers this issue.

Problem found because gothub.org backups were taking much more disk space
than expected.

ok?
 
M  lib/repository_admin.c  |  2+  3-

1 file changed, 2 insertions(+), 3 deletions(-)

commit - 9c2b559269ea5d0d6522a6c13343d3a4301ad1b6
commit + 94a6a19d195ba3956674745d754ce5747cdbe804
blob - 431fcc73acd1fbef092a6e8ade063a761cf08f63
blob + 929475bc8766bb1bf7fc3d60b5953ab3d5f78efc
--- lib/repository_admin.c
+++ lib/repository_admin.c
@@ -865,8 +865,7 @@ load_tree_entries(struct got_object_id_queue *ids,
 				break;
 		}
 
-		if (got_object_tree_entry_is_symlink(e) ||
-		    got_object_tree_entry_is_submodule(e) ||
+		if (got_object_tree_entry_is_submodule(e) ||
 		    got_object_idset_contains(traversed_ids, id))
 			continue;
 
@@ -882,7 +881,7 @@ load_tree_entries(struct got_object_id_queue *ids,
 			if (err)
 				break;
 			STAILQ_INSERT_TAIL(ids, qid, entry);
-		} else if (S_ISREG(mode)) {
+		} else if (S_ISREG(mode) || S_ISLNK(mode)) {
 			/* This blob is referenced. */
 			err = got_object_idset_add(traversed_ids, id, NULL);
 			if (err)