From: Stefan Sperling Subject: Re: cleanup: always remove redundant packs To: Omar Polo Cc: gameoftrees@openbsd.org Date: Sun, 22 Oct 2023 21:14:27 +0200 On Sun, Oct 22, 2023 at 08:40:13PM +0200, Omar Polo wrote: > stsp reported that `gotadmin pack -a && gotadmin cleanup' wasn't working > in one case since rsync was constantly messing with mtime. > > looking closely, I don't think we have to treat "recentish" pack files > in a special way. If they're redundant, they can go, regardless of when > they were created. > > I've taken a quick look at git' builtin/repack.c and it doesn't seem to > care about the {c,m}time of pack files. From another quick look, gotd > should be fine too. > > I probably got over-cautious and copied the loose object time check, > with for packs is not needed. If a pack is redundant, all of its > contents are already available in the repository so there's nothing to > loose by removing it. Loose object instead may be installed and only > later being pointed by a reference, so we still need the creation check. > > ok? traversed_ids is shared between repo_purge_unreferenced_loose_objects() and repo_purge_redundant_packfiles(). And loose objects are traversed first. Which probably means that a pack file composed of objects that all exist as loose objects will be considered redundant and deleted. I suspect without the timestamp check creating a pack file and running cleanup concurrently can now result in a race where the newly created pack file is deleted. I would suggest to use separate travered_ids sets for loose objects and for pack files to avoid this issue.