"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
"Omar Polo" <op@omarpolo.com>
Subject:
Re: cleanup: fix redundant pack file removal when repository contains symlinks
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 03 May 2026 16:49:56 +0200

Download raw body.

Thread
Stefan Sperling <stsp@stsp.name> wrote:
> Add symlinks to the set of traversed object IDs during cleanup. Otherwise,
> packfiles which contain blobs that are symlinks will always be marked as
> containing unreferenced objects and 'gotadmin cleanup' will not remove them.
> 
> This can result in an explosion of redundant packfiles if 'gotadmin cleanup'
> is run at regular intervals, and if the cleanup -a option is not used.
> 
> I am unsure how to write a test case for this fix without making additional
> changes to the cleanup command. The tests run cleanup -a to bypass the mtime
> staleness safety check. This will never hit the problematic case fixed here.
> Reproducing the problem would require a test which waits for the (currently)
> hard-coded interval of 600 seconds before running cleanup.
> If we added a way to tweak the mtime threshold from tests then we could
> probably add a test case which covers this issue.

indeed.  but i'm a bit uneasy with making the hard-coded interval
tweakable?  maybe we can just don't document how to do so though.

> Problem found because gothub.org backups were taking much more disk space
> than expected.
> 
> ok?

ok op@, thanks!

> M  lib/repository_admin.c  |  2+  3-
> 
> 1 file changed, 2 insertions(+), 3 deletions(-)
> 
> commit - 9c2b559269ea5d0d6522a6c13343d3a4301ad1b6
> commit + 94a6a19d195ba3956674745d754ce5747cdbe804
> blob - 431fcc73acd1fbef092a6e8ade063a761cf08f63
> blob + 929475bc8766bb1bf7fc3d60b5953ab3d5f78efc
> --- lib/repository_admin.c
> +++ lib/repository_admin.c
> @@ -865,8 +865,7 @@ load_tree_entries(struct got_object_id_queue *ids,
>  				break;
>  		}
>  
> -		if (got_object_tree_entry_is_symlink(e) ||
> -		    got_object_tree_entry_is_submodule(e) ||
> +		if (got_object_tree_entry_is_submodule(e) ||
>  		    got_object_idset_contains(traversed_ids, id))
>  			continue;
>  
> @@ -882,7 +881,7 @@ load_tree_entries(struct got_object_id_queue *ids,
>  			if (err)
>  				break;
>  			STAILQ_INSERT_TAIL(ids, qid, entry);
> -		} else if (S_ISREG(mode)) {
> +		} else if (S_ISREG(mode) || S_ISLNK(mode)) {
>  			/* This blob is referenced. */
>  			err = got_object_idset_add(traversed_ids, id, NULL);
>  			if (err)