"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: teach gotadmin cleanup to remove redundant packfiles
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 18 Jun 2023 12:17:47 +0200

Download raw body.

Thread
On 2023/06/17 10:16:35 +0200, Stefan Sperling <stsp@stsp.name> wrote:
> On Fri, Jun 16, 2023 at 11:25:27PM +0200, Omar Polo wrote:
> > A packfile becomes "redundant" when another packfile
> > contains all its objects; this usually happens as more objects are
> > added to a repository and the user issues a `gotadmin pack'
> 
> This assumption is only true for 'gotadmin pack -a'. Without -a, only
> loose objects which do not yet appear in any pack file will be packed.

yeah, I meant with -a.  fixed below.

> > The idea is simple, as per stsp' explanation on irc a while ago: find
> > the biggest packfile by number of objects, throw everything in a
> > idset, then iterate the other pack files and see if they're redundant.
> 
> > +If redundant copies of packed objects exist in loose form, such
> > +redundant copies will be purged.
> > +If all the objects of a pack file are present in another pack file,
> > +such redundant pack file will be purged.
> 
> With my suggested tweak to the algorithm I describe below, I would
> reword this to cover the more general case:
> 
>   If all the objects of a pack file are present in other pack files,
>   the redundant pack file will be purged.

use the sentence as-is

> > +.Pp
> > +For compatibility with Git, if a matching file
> > +.Pa .keep
> > +exists for a given pack file, such pack file won't be removed.
> 
> Grammar: s/such pack/this pack/

fixed

> > +.Pp
> >  Objects will usually become unreferenced as a result of deleting
> >  branches or tags with
> >  .Cm got branch -d
> > @@ -261,6 +268,10 @@ In order to determine the set of objects which are ref
> >  .Cm got ref -d
> >  may also leave unreferenced objects behind.
> >  .Pp
> > +Pack files will usually become redundant as a result of creating
> > +bigger packs with
> > +.Cm gotadmin pack .
> 
> As noted above, this example needs -a to be effective.

done

> > +	TAILQ_FOREACH(pe, &repo->packidx_paths, entry) {
> 
> By default the above list is in readdir(3) order, with entries for the
> most recently accessed packs moved to the front of the list.
> I think we should instead be using a custom list of pack index files here,
> sorted by the amount of object IDs they contain in descending order.
> Rationale below.
> 
> > +		if (cancel_cb) {
> > +			err = (*cancel_cb)(cancel_arg);
> > +			if (err)
> > +				break;
> > +		}
> > +
> > +		packidx_path = pe->path;
> > +		err = got_repo_get_packidx(&packidx, packidx_path,
> > +		    repo);
> > +		if (err)
> > +			goto done;
> > +
> > +		if (packidx == best) {
> > +			remove = 0;
> > +		} else {
> > +			nobjects = be32toh(packidx->hdr.fanout_table[0xff]);
> > +			for (i = 0; i < nobjects; ++i) {
> > +				pid = &packidx->hdr.sorted_ids[i];
> > +
> > +				memset(&id, 0, sizeof(id));
> > +				memcpy(&id.sha1, pid->sha1, sizeof(id.sha1));
> > +
> > +				if (!got_object_idset_contains(idset, &id))
> > +					break;
> > +			}
> > +
> > +			remove = (i == nobjects);
> 
> If we sorted the list of pack index files by the amount of objects in
> descending order then we could keep adding more IDs to 'idset' here,
> provided 'remove' is false. This way we would detect objects in any
> smaller packs which are made redundant by any bigger packs, instead
> of only catching objects which are made redundant by the one largest
> pack file.
> 
> We should then be able to eliminate some redundancy without having to
> re-pack everything, but only re-packing a particular branch such as 'main'.
> This might be useful for people who have fetched some branches but generally
> do not fetch very often. They would usually end up with a few large pack
> files instead of many small ones.

agreed, it's way better.

> To simulate this you could create two or 3 branches with unique commits and
> pack each such branch individually, using gotadmin pack -x to cut off the
> common ancestors found in the main branch, and create some redundant copies
> of each such pack. This way you could extend the regression test to cover a
> case where two or three packs comprise the full set and the rest are redundant.

done too, the last part of the added regress adds a branch to simulate
the case where two pack files comprise the history of `master' and a
second one the history of the new branch once split off master.

as discussed on irc, this still lacks the locking to avoid running
multiple cleanup and/or git gc in parallel.  i'll follow up a diff for
it.

diff /home/op/w/got
commit - ea4ee74a619549cdfbbf7f824599834f31b72838
path + /home/op/w/got
blob - 5278c5830ad6b4af4c8d43dbbe466b6a4851dd25
file + gotadmin/gotadmin.1
--- gotadmin/gotadmin.1
+++ gotadmin/gotadmin.1
@@ -232,8 +232,9 @@ Purge unreferenced loose objects from the repository a
 .Op Fl r Ar repository-path
 .Xc
 .Dl Pq alias: Cm cl
-Purge unreferenced loose objects from the repository and display
-the amount of disk space which has been freed as a result.
+Purge unreferenced loose objects and redundant pack files from the
+repository and display the amount of disk space which has been freed
+as a result.
 .Pp
 Unreferenced objects are present in the repository but cannot be
 reached via any reference in the entire
@@ -246,12 +247,18 @@ Packed objects stored in pack files under
 spread across 256 sub-directories named after the 256 possible
 hexadecimal values of the first byte of an object identifier.
 .Pp
-Packed objects stored in pack files under
-.Pa objects/pack/
-will not be purged.
-However, if redundant copies of packed objects exist in loose form,
-such redundant copies will be purged.
+Packed objects are stored in pack files under
+.Pa objects/pack/ .
 .Pp
+If redundant copies of packed objects exist in loose form, such
+redundant copies will be purged.
+If all the objects of a pack file are present in other pack files,
+the redundant pack file will be purged.
+.Pp
+For compatibility with Git, if a matching file
+.Pa .keep
+exists for a given pack file, this pack file won't be removed.
+.Pp
 Objects will usually become unreferenced as a result of deleting
 branches or tags with
 .Cm got branch -d
@@ -261,6 +268,11 @@ In order to determine the set of objects which are ref
 .Cm got ref -d
 may also leave unreferenced objects behind.
 .Pp
+Pack files will usually become redundant as a result of creating
+repacking the repository with
+.Nm
+.Cm pack Fl a .
+.Pp
 In order to determine the set of objects which are referenced, search
 all references for commit objects and tag objects, and traverse the
 corresponding tree object hierarchies.
blob - dee750d952183c77a15074c748c77e5a2510a5e5
file + gotadmin/gotadmin.c
--- gotadmin/gotadmin.c
+++ gotadmin/gotadmin.c
@@ -1136,16 +1136,19 @@ struct got_cleanup_progress_arg {
 	int last_nloose;
 	int last_ncommits;
 	int last_npurged;
+	int last_nredundant;
 	int verbosity;
 	int printed_something;
 	int dry_run;
 };
 
 static const struct got_error *
-cleanup_progress(void *arg, int nloose, int ncommits, int npurged)
+cleanup_progress(void *arg, int nloose, int ncommits, int npurged,
+    int nredundant)
 {
 	struct got_cleanup_progress_arg *a = arg;
 	int print_loose = 0, print_commits = 0, print_purged = 0;
+	int print_redundant = 0;
 
 	if (a->last_nloose != nloose) {
 		print_loose = 1;
@@ -1162,11 +1165,15 @@ cleanup_progress(void *arg, int nloose, int ncommits, 
 		print_purged = 1;
 		a->last_npurged = npurged;
 	}
+	if (a->last_nredundant != nredundant) {
+		print_redundant = 1;
+		a->last_nredundant = nredundant;
+	}
 
 	if (a->verbosity < 0)
 		return NULL;
 
-	if (print_loose || print_commits || print_purged)
+	if (print_loose || print_commits || print_purged || print_redundant)
 		printf("\r");
 	if (print_loose)
 		printf("%d loose object%s", nloose, nloose == 1 ? "" : "s");
@@ -1182,7 +1189,16 @@ cleanup_progress(void *arg, int nloose, int ncommits, 
 			    npurged == 1 ? "" : "s");
 		}
 	}
-	if (print_loose || print_commits || print_purged) {
+	if (print_redundant) {
+		if (a->dry_run) {
+			printf("%d pack file%s could be purged", nredundant,
+			    nredundant == 1 ? "" : "s");
+		} else {
+			printf("%d pack file%s purged", nredundant,
+			    nredundant == 1 ? "" : "s");
+		}
+	}
+	if (print_loose || print_commits || print_purged || print_redundant) {
 		a->printed_something = 1;
 		fflush(stdout);
 	}
@@ -1222,10 +1238,14 @@ cmd_cleanup(int argc, char *argv[])
 	int remove_lonely_packidx = 0, ignore_mtime = 0;
 	struct got_cleanup_progress_arg cpa;
 	struct got_lonely_packidx_progress_arg lpa;
-	off_t size_before, size_after;
-	char scaled_before[FMT_SCALED_STRSIZE];
-	char scaled_after[FMT_SCALED_STRSIZE];
-	char scaled_diff[FMT_SCALED_STRSIZE];
+	off_t loose_before, loose_after;
+	off_t pack_before, pack_after;
+	off_t total_size;
+	char loose_before_scaled[FMT_SCALED_STRSIZE];
+	char loose_after_scaled[FMT_SCALED_STRSIZE];
+	char pack_before_scaled[FMT_SCALED_STRSIZE];
+	char pack_after_scaled[FMT_SCALED_STRSIZE];
+	char total_size_scaled[FMT_SCALED_STRSIZE];
 	int *pack_fds = NULL;
 
 #ifndef PROFILE
@@ -1299,37 +1319,67 @@ cmd_cleanup(int argc, char *argv[])
 	memset(&cpa, 0, sizeof(cpa));
 	cpa.last_ncommits = -1;
 	cpa.last_npurged = -1;
+	cpa.last_nredundant = -1;
 	cpa.dry_run = dry_run;
 	cpa.verbosity = verbosity;
+
 	error = got_repo_purge_unreferenced_loose_objects(repo,
-	    &size_before, &size_after, &npacked, dry_run, ignore_mtime,
+	    &loose_before, &loose_after, &npacked, dry_run, ignore_mtime,
 	    cleanup_progress, &cpa, check_cancelled, NULL);
 	if (cpa.printed_something)
 		printf("\n");
 	if (error)
 		goto done;
+
+	cpa.printed_something = 0;
+	cpa.last_ncommits = -1;
+	cpa.last_npurged = -1;
+	cpa.last_nloose = -1;
+	cpa.last_nredundant = -1;
+	error = got_repo_purge_redundant_packfiles(repo, &pack_before,
+	    &pack_after, dry_run, cleanup_progress, &cpa,
+	    check_cancelled, NULL);
+	if (error)
+		goto done;
+	if (cpa.printed_something)
+		printf("\n");
+
+	total_size = (loose_before - loose_after) + (pack_before - pack_after);
+
 	if (cpa.printed_something) {
-		if (fmt_scaled(size_before, scaled_before) == -1) {
+		if (fmt_scaled(loose_before, loose_before_scaled) == -1) {
 			error = got_error_from_errno("fmt_scaled");
 			goto done;
 		}
-		if (fmt_scaled(size_after, scaled_after) == -1) {
+		if (fmt_scaled(loose_after, loose_after_scaled) == -1) {
 			error = got_error_from_errno("fmt_scaled");
 			goto done;
 		}
-		if (fmt_scaled(size_before - size_after, scaled_diff) == -1) {
+		if (fmt_scaled(pack_before, pack_before_scaled) == -1) {
 			error = got_error_from_errno("fmt_scaled");
 			goto done;
 		}
-		printf("loose total size before: %s\n", scaled_before);
-		printf("loose total size after: %s\n", scaled_after);
+		if (fmt_scaled(pack_after, pack_after_scaled) == -1) {
+			error = got_error_from_errno("fmt_scaled");
+			goto done;
+		}
+		if (fmt_scaled(total_size, total_size_scaled) == -1) {
+			error = got_error_from_errno("fmt_scaled");
+			goto done;
+		}
+		printf("loose total size before: %s\n", loose_before_scaled);
+		printf("loose total size after: %s\n", loose_after_scaled);
+		printf("pack files total size before: %s\n",
+		    pack_before_scaled);
+		printf("pack files total size after: %s\n", pack_after_scaled);
 		if (dry_run) {
 			printf("disk space which would be freed: %s\n",
-			    scaled_diff);
+			    total_size_scaled);
 		} else
-			printf("disk space freed: %s\n", scaled_diff);
+			printf("disk space freed: %s\n", total_size_scaled);
 		printf("loose objects also found in pack files: %d\n", npacked);
 	}
+
 done:
 	if (repo)
 		got_repo_close(repo);
blob - 5ddd50191b8acfb903877b148bb856de99014b07
file + include/got_repository_admin.h
--- include/got_repository_admin.h
+++ include/got_repository_admin.h
@@ -70,7 +70,7 @@ typedef const struct got_error *(*got_cleanup_progress
 
 /* A callback function which gets invoked with cleanup information to print. */
 typedef const struct got_error *(*got_cleanup_progress_cb)(void *arg,
-    int nloose, int ncommits, int npurged);
+    int nloose, int ncommits, int npurged, int nredundant);
 
 /*
  * Walk objects reachable via references to determine whether any loose
@@ -88,6 +88,12 @@ got_repo_purge_unreferenced_loose_objects(struct got_r
     int ignore_mtime, got_cleanup_progress_cb progress_cb, void *progress_arg,
     got_cancel_cb cancel_cb, void *cancel_arg);
 
+const struct got_error *
+got_repo_purge_redundant_packfiles(struct got_repository *repo,
+    off_t *before, off_t *size_after, int dry_run,
+    got_cleanup_progress_cb progress_cb, void *progress_arg,
+    got_cancel_cb cancel_cb, void *cancel_arg);
+
 /* A callback function which gets invoked with cleanup information to print. */
 typedef const struct got_error *(*got_lonely_packidx_progress_cb)(void *arg,
     const char *path);
blob - 48bed8a15e5f885c25ca796741f37fb70f1b87e3
file + lib/repository_admin.c
--- lib/repository_admin.c
+++ lib/repository_admin.c
@@ -628,7 +628,7 @@ report_cleanup_progress(got_cleanup_progress_cb progre
 	if (err || !elapsed)
 		return err;
 
-	return progress_cb(progress_arg, nloose, ncommits, npurged);
+	return progress_cb(progress_arg, nloose, ncommits, npurged, -1);
 }
 
 static const struct got_error *
@@ -1153,7 +1153,7 @@ got_repo_purge_unreferenced_loose_objects(struct got_r
 	if (nloose == 0) {
 		got_object_idset_free(loose_ids);
 		if (progress_cb) {
-			err = progress_cb(progress_arg, 0, 0, 0);
+			err = progress_cb(progress_arg, 0, 0, 0, -1);
 			if (err)
 				return err;
 		}
@@ -1217,7 +1217,8 @@ got_repo_purge_unreferenced_loose_objects(struct got_r
 
 	/* Produce a final progress report. */
 	if (progress_cb) {
-		err = progress_cb(progress_arg, nloose, ncommits, arg.npurged);
+		err = progress_cb(progress_arg, nloose, ncommits, arg.npurged,
+		    -1);
 		if (err)
 			goto done;
 	}
@@ -1228,6 +1229,200 @@ remove_packidx(int dir_fd, const char *relpath)
 }
 
 static const struct got_error *
+account_unlink_pack(struct got_repository *repo, const char *packidx_path,
+    int dry_run, int *remove, off_t *size_before, off_t *size_after)
+{
+	static const char *ext[] = {".idx", ".pack", ".rev", ".bitmap",
+	    ".promisor", ".mtimes"};
+	struct stat sb;
+	char *dot, path[PATH_MAX];
+	size_t i;
+
+	if (strlcpy(path, packidx_path, sizeof(path)) >=
+	    sizeof(path))
+		return got_error(GOT_ERR_NO_SPACE);
+
+	/*
+	 * For compatibility with Git, if a matching .keep file exist
+	 * don't delete the packfile.
+	 */
+	dot = strrchr(path, '.');
+	*dot = '\0';
+	if (strlcat(path, ".keep", sizeof(path)) >= sizeof(path))
+		return got_error(GOT_ERR_NO_SPACE);
+	if (faccessat(got_repo_get_fd(repo), path, F_OK, 0) == 0)
+		*remove = 0;
+
+	for (i = 0; i < nitems(ext); ++i) {
+		*dot = '\0';
+
+		if (strlcat(path, ext[i], sizeof(path)) >=
+		    sizeof(path))
+			return got_error(GOT_ERR_NO_SPACE);
+
+		if (fstatat(got_repo_get_fd(repo), path, &sb, 0) ==
+		    -1) {
+			if (errno == ENOENT &&
+			    strcmp(ext[i], ".pack") != 0 &&
+			    strcmp(ext[i], ".idx") != 0)
+				continue;
+			return got_error_from_errno2("fstatat", path);
+		}
+
+		*size_before += sb.st_size;
+		if (!*remove) {
+			*size_after += sb.st_size;
+			continue;
+		}
+
+		if (dry_run)
+			continue;
+
+		if (unlinkat(got_repo_get_fd(repo), path, 0) == -1) {
+			if (errno == ENOENT &&
+			    strcmp(ext[i], ".pack") != 0 &&
+			    strcmp(ext[i], ".idx") != 0)
+				continue;
+			return got_error_from_errno2("unlinkat",
+			    path);
+		}
+	}
+
+	return NULL;
+}
+
+static const struct got_error *
+idset_add_from_packidx(int *redundant, struct got_repository *repo,
+    const char *packidx_path, struct got_object_idset *idset)
+{
+	const struct got_error *err;
+	struct got_packidx *packidx;
+	struct got_packidx_object_id *pid;
+	struct got_object_id id;
+	size_t i, nobjects;
+
+	*redundant = 1;
+
+	err = got_repo_get_packidx(&packidx, packidx_path, repo);
+	if (err)
+		return err;
+
+	nobjects = be32toh(packidx->hdr.fanout_table[0xff]);
+	for (i = 0; i < nobjects; ++i) {
+		pid = &packidx->hdr.sorted_ids[i];
+
+		memset(&id, 0, sizeof(id));
+		memcpy(&id.sha1, pid->sha1, sizeof(id.sha1));
+
+		if (got_object_idset_contains(idset, &id))
+			continue;
+
+		*redundant = 0;
+		err = got_object_idset_add(idset, &id, NULL);
+		if (err)
+			return err;
+	}
+
+	return NULL;
+}
+
+struct pack_info {
+	const char	*path;
+	size_t		 nobjects;
+};
+
+static int
+pack_info_cmp(const void *a, const void *b)
+{
+	const struct pack_info	*pa, *pb;
+
+	pa = a;
+	pb = b;
+	if (pa->nobjects == pb->nobjects)
+		return strcmp(pa->path, pb->path);
+	if (pa->nobjects > pb->nobjects)
+		return -1;
+	return 1;
+}
+
+const struct got_error *
+got_repo_purge_redundant_packfiles(struct got_repository *repo,
+    off_t *size_before, off_t *size_after, int dry_run,
+    got_cleanup_progress_cb progress_cb, void *progress_arg,
+    got_cancel_cb cancel_cb, void *cancel_arg)
+{
+	const struct got_error *err;
+	struct pack_info *pinfo, *sorted = NULL;
+	struct got_packidx *packidx;
+	struct got_object_idset *idset = NULL;
+	struct got_pathlist_entry *pe;
+	size_t i, npacks;
+	int remove, redundant_packs = 0;
+
+	*size_before = 0;
+	*size_after = 0;
+
+	npacks = 0;
+	TAILQ_FOREACH(pe, &repo->packidx_paths, entry)
+		npacks++;
+
+	if (npacks == 0)
+		return NULL;
+
+	sorted = calloc(npacks, sizeof(*sorted));
+	if (sorted == NULL)
+		return got_error_from_errno("calloc");
+
+	i = 0;
+	TAILQ_FOREACH(pe, &repo->packidx_paths, entry) {
+		err = got_repo_get_packidx(&packidx, pe->path, repo);
+		if (err)
+			goto done;
+
+		pinfo = &sorted[i++];
+		pinfo->path = pe->path;
+		pinfo->nobjects = be32toh(packidx->hdr.fanout_table[0xff]);
+	}
+	qsort(sorted, npacks, sizeof(*sorted), pack_info_cmp);
+
+	idset = got_object_idset_alloc();
+	if (idset == NULL) {
+		err = got_error_from_errno("got_object_idset_alloc");
+		goto done;
+	}
+
+	for (i = 0; i < npacks; ++i) {
+		if (cancel_cb) {
+			err = (*cancel_cb)(cancel_arg);
+			if (err)
+				break;
+		}
+
+		err = idset_add_from_packidx(&remove, repo, sorted[i].path,
+		    idset);
+		if (err)
+			goto done;
+		err = account_unlink_pack(repo, sorted[i].path, dry_run,
+		    &remove, size_before, size_after);
+		if (err)
+			goto done;
+		if (!remove)
+			continue;
+		err = progress_cb(progress_arg, -1, -1, -1,
+		    ++redundant_packs);
+		if (err)
+			goto done;
+	}
+
+	err = progress_cb(progress_arg, -1, -1, -1, redundant_packs);
+ done:
+	free(sorted);
+	if (idset)
+		got_object_idset_free(idset);
+	return err;
+}
+
+static const struct got_error *
 remove_packidx(int dir_fd, const char *relpath)
 {
 	const struct got_error *err, *unlock_err;
blob - 2381958704d73cb1ebaeb0f7af8d2bf01b7d434f
file + regress/cmdline/cleanup.sh
--- regress/cmdline/cleanup.sh
+++ regress/cmdline/cleanup.sh
@@ -237,6 +237,123 @@ test_cleanup_precious_objects() {
 	test_done "$testroot" "$ret"
 }
 
+test_cleanup_redundant_pack_files() {
+	local testroot=`test_init cleanup_redundant_pack_files`
+
+	# no pack files should exist yet
+
+	n=$(gotadmin info -r "$testroot/repo" | awk '/^pack files/{print $3}')
+	if [ "$n" -ne 0 ]; then
+		echo "expected no pack file to exists, $n found" >&2
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	# create a redundant pack with an associated .keep file
+	hash=$(gotadmin pack -a -r "$testroot/repo" \
+		| awk '/^Indexed/{print $2}')
+	kpack="$testroot/repo/.git/objects/pack/pack-$hash"
+	touch "${kpack%.pack}.keep"
+
+	# create a few pack files, all with a different number of
+	# objects
+	for i in `jot 5`; do
+		echo "alpha $i" > $testroot/repo/alpha
+		git_commit "$testroot/repo" -m "edit #$i"
+		gotadmin pack -r "$testroot/repo" >/dev/null
+	done
+
+	# create two packs with all the objects
+	gotadmin pack -a -r "$testroot/repo" >/dev/null
+	gotadmin pack -a -r "$testroot/repo" >/dev/null
+
+	gotadmin cleanup -r "$testroot/repo" | grep 'pack files? purged' \
+		| tail -1 > $testroot/stdout
+
+	echo "5 pack files purged" > $testroot/stdout.expected
+	if cmp -s "$testroot/stdout.expected" "$testroot/stdout"; then
+		diff -u "$testroot/stdout.expected" "$testroot/stdout"
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	n=$(gotadmin info -r "$testroot/repo" | awk '/^pack files/{print $3}')
+	if [ "$n" -ne 2 ]; then
+		echo "expected 2 pack files left, $n found instead" >&2
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	if [ ! -f "$kpack" ]; then
+		echo "$kpack disappeared unexpectedly" >&2
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	if [ ! -f "${kpack%.pack}.keep" ]; then
+		echo "${kpack%.pack}.keep disappeared unexpectedly" >&2
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	# create one more non-redundant pack
+	for i in `jot 5`; do
+		echo "alpha again $i" > $testroot/repo/alpha
+		git_commit "$testroot/repo" -m "edit $i"
+	done
+	gotadmin pack -r "$testroot/repo" >/dev/null
+
+	gotadmin cleanup -r "$testroot/repo" | grep 'pack files? purged' \
+	    | tail -1 > $testroot/stdout
+
+	echo "0 pack files purged" > $testroot/stdout.expected
+	if cmp -s "$testroot/stdout.expected" "$testroot/stdout"; then
+		diff -u "$testroot/stdout.expected" "$testroot/stdout"
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	n=$(gotadmin info -r "$testroot/repo" | awk '/^pack files/{print $3}')
+	if [ "$n" -ne 3 ]; then
+		echo "expected 3 pack files left, $n found instead" >&2
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	# remove the .keep file
+	rm "${kpack%.pack}.keep"
+
+	# create some commits on a separate branch
+	(cd "$testroot/repo" && git checkout -q -b newbranch)
+
+	for i in `jot 5`; do
+		echo "alpha $i" > $testroot/repo/alpha
+		git_commit "$testroot/repo" -m "edit #$i"
+		gotadmin pack -r "$testroot/repo" >/dev/null
+	done
+
+	gotadmin pack -a -x master -r "$testroot/repo" >/dev/null
+
+	gotadmin cleanup -r "$testroot/repo" | grep 'pack files? purged' \
+		| tail -1 > $testroot/stdout
+
+	echo "6 pack files purged" > $testroot/stdout.expected
+	if cmp -s "$testroot/stdout.expected" "$testroot/stdout"; then
+		diff -u "$testroot/stdout.expected" "$testroot/stdout"
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	n=$(gotadmin info -r "$testroot/repo" | awk '/^pack files/{print $3}')
+	if [ "$n" -ne 3 ]; then
+		echo "expected 3 pack files left, $n found instead" >&2
+		test_done "$testroot" 1
+		return 1
+	fi
+
+	test_done "$testroot" 0
+}
+
 test_cleanup_precious_objects() {
 	local testroot=`test_init cleanup_precious_objects`
 
@@ -365,5 +482,6 @@ run_test test_cleanup_precious_objects
 test_parseargs "$@"
 run_test test_cleanup_unreferenced_loose_objects
 run_test test_cleanup_redundant_loose_objects
+run_test test_cleanup_redundant_pack_files
 run_test test_cleanup_precious_objects
 run_test test_cleanup_missing_pack_file