From: "Omar Polo" Subject: Re: cleanup: fix redundant pack file removal when repository contains symlinks To: Stefan Sperling Cc: gameoftrees@openbsd.org Date: Sun, 03 May 2026 16:49:56 +0200 Stefan Sperling wrote: > Add symlinks to the set of traversed object IDs during cleanup. Otherwise, > packfiles which contain blobs that are symlinks will always be marked as > containing unreferenced objects and 'gotadmin cleanup' will not remove them. > > This can result in an explosion of redundant packfiles if 'gotadmin cleanup' > is run at regular intervals, and if the cleanup -a option is not used. > > I am unsure how to write a test case for this fix without making additional > changes to the cleanup command. The tests run cleanup -a to bypass the mtime > staleness safety check. This will never hit the problematic case fixed here. > Reproducing the problem would require a test which waits for the (currently) > hard-coded interval of 600 seconds before running cleanup. > If we added a way to tweak the mtime threshold from tests then we could > probably add a test case which covers this issue. indeed. but i'm a bit uneasy with making the hard-coded interval tweakable? maybe we can just don't document how to do so though. > Problem found because gothub.org backups were taking much more disk space > than expected. > > ok? ok op@, thanks! > M lib/repository_admin.c | 2+ 3- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > commit - 9c2b559269ea5d0d6522a6c13343d3a4301ad1b6 > commit + 94a6a19d195ba3956674745d754ce5747cdbe804 > blob - 431fcc73acd1fbef092a6e8ade063a761cf08f63 > blob + 929475bc8766bb1bf7fc3d60b5953ab3d5f78efc > --- lib/repository_admin.c > +++ lib/repository_admin.c > @@ -865,8 +865,7 @@ load_tree_entries(struct got_object_id_queue *ids, > break; > } > > - if (got_object_tree_entry_is_symlink(e) || > - got_object_tree_entry_is_submodule(e) || > + if (got_object_tree_entry_is_submodule(e) || > got_object_idset_contains(traversed_ids, id)) > continue; > > @@ -882,7 +881,7 @@ load_tree_entries(struct got_object_id_queue *ids, > if (err) > break; > STAILQ_INSERT_TAIL(ids, qid, entry); > - } else if (S_ISREG(mode)) { > + } else if (S_ISREG(mode) || S_ISLNK(mode)) { > /* This blob is referenced. */ > err = got_object_idset_add(traversed_ids, id, NULL); > if (err)