From: Stefan Sperling <stsp@stsp.name>
Subject: Re: got fetch downloads too much
To: Christian Weisgerber <naddy@mips.inka.de>
Cc: gameoftrees@openbsd.org
Date: Wed, 6 Oct 2021 20:30:33 +0200

On Wed, Oct 06, 2021 at 07:19:15PM +0200, Christian Weisgerber wrote:
> Stefan Sperling:
> 
> > Is it reproducible? You could try "rewinding" those refs to the
> > commits which you started fetching with, e.g. with 'got ref'.
> 
> $ got ref -c 7f4bb501768b9b3f856f91fbc8b9c80a2a4aaa04 refs/remotes/origin/main
> $ got ref -c f7e56b48260646f8b1d5d619a92ca1b2c2d26efc refs/remotes/origin/stable/12
> 
> > And then fetch again. Does it still fetch more than it should?
> 
> It does:
> 
> Connecting to "origin" anongit@git.freebsd.org
> server: Enumerating objects: 82599, done.
> server: Counting objects: 100% (19209/19209), done.
> server: Compressing objects: 100% (12254/12254), done.
> server: Total 82599 (delta 13980), reused 6955 (delta 6955), pack-reused 63390
>     75M fetched; indexing 100%; resolving deltas 100%
> Fetched 8bbc0e2ea7ac9610a329d754bff75fc96724f1cb.pack
> Updated refs/remotes/origin/main: da3278ded3b2647d26da26788bab8363e502a144
> Updated refs/remotes/origin/stable/12: 6fea4b82e7b86ac680d5615f8361863353737325
> Updated refs/remotes/origin/stable/13: b1cca74367374bbb9cdc881c671a9f9525dca313
> 
> got.conf:
> 
> remote "origin" {
> 	server anongit@git.freebsd.org
> 	protocol ssh
> 	repository "/src.git"
> 	fetch-all-branches yes
> }


My guess is that you have not yet rebased the corresponding branches
in refs/heads before fetching again?

In that case, the current code keeps fetching the objects again in order
to get "new" objects for e.g. refs/heads/main. Here is a quick hack to
prevent this. See the comment added by the patch for details.
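
As a rough, standalone illustration of the idea (not the code from the
patch): before asking the server for a ref, compare the ID it advertises
against the tips of the references we already have, and skip the ref on
a match. The names sketch_id, sketch_have_tip and sketch_filter_wants
are made up for this sketch and are not part of got's API; it also
assumes plain 20-byte SHA-1 IDs held in an array rather than a pathlist.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_SHA1_LEN 20	/* SHA-1 digest length in bytes */

/* Hypothetical stand-in for an object ID; not got's real type. */
struct sketch_id {
	unsigned char sha1[SKETCH_SHA1_LEN];
};

/*
 * Return 1 if their_id equals the tip of any reference we already
 * have, 0 otherwise. Only tips are compared; history is not searched.
 */
int
sketch_have_tip(const struct sketch_id *tips, size_t ntips,
    const struct sketch_id *their_id)
{
	size_t i;

	assert(their_id != NULL);
	assert(tips != NULL || ntips == 0);

	for (i = 0; i < ntips; i++) {
		if (memcmp(tips[i].sha1, their_id->sha1,
		    SKETCH_SHA1_LEN) == 0)
			return 1;
	}
	return 0;
}

/*
 * Copy into want[] only those advertised IDs which match none of our
 * tips. The caller must provide room for nadv entries in want[].
 * Returns the number of IDs still wanted.
 */
size_t
sketch_filter_wants(const struct sketch_id *tips, size_t ntips,
    const struct sketch_id *advertised, size_t nadv,
    struct sketch_id *want)
{
	size_t i, nwant = 0;

	for (i = 0; i < nadv; i++) {
		if (sketch_have_tip(tips, ntips, &advertised[i]))
			continue;	/* tip already cached locally */
		want[nwant++] = advertised[i];
	}
	return nwant;
}
```

Like the patch, this only catches the case where the advertised ID is
exactly one of our tips. An ID which sits deeper in our history would
still be requested.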

diff 0c079dbc7e6ed9857d4a13d908cc6858d57c81ec /home/stsp/src/got
blob - 96308310d28fd9186f083ce3c0028cceb9cd61b5
file + libexec/got-fetch-pack/got-fetch-pack.c
--- libexec/got-fetch-pack/got-fetch-pack.c
+++ libexec/got-fetch-pack/got-fetch-pack.c
@@ -65,6 +65,29 @@ static const struct got_capability got_capabilities[] 
 	{ GOT_CAPA_SIDE_BAND_64K, NULL },
 };
 
+static int
+match_remote_object_id(struct got_pathlist_head *have_refs,
+    struct got_object_id *their_id)
+{
+	struct got_pathlist_entry *pe;
+
+	/*
+	 * This only matches the tip commit of each of our references.
+	 * A better approach might be to ask the main process if their
+	 * object ID is contained anywhere in our repository. However,
+	 * searching the tips is good enough to avoid repeated downloads
+	 * in case we have cached the current tip in refs/remotes/ but
+	 * the corresponding branch in refs/heads/ has not been rebased.
+	 */
+	TAILQ_FOREACH(pe, have_refs, entry) {
+		struct got_object_id *id = pe->data;
+		if (got_object_id_cmp(id, their_id) == 0) 
+			return 1;
+	}
+
+	return 0;
+}
+
 static void
 match_remote_ref(struct got_pathlist_head *have_refs,
     struct got_object_id *my_id, char *refname)
@@ -457,6 +480,14 @@ fetch_pack(int fd, int packfd, uint8_t *pack_sha1,
 			err = got_error(GOT_ERR_BAD_OBJ_ID_STR);
 			goto done;
 		}
+		if (match_remote_object_id(have_refs, &want[nref])) {
+			if (chattygot) {
+				fprintf(stderr,
+				    "%s: not fetching %s, we already have %s\n",
+				    getprogname(), refname, id_str);
+			}
+			continue;
+		}
 		match_remote_ref(have_refs, &have[nref], refname);
 		err = send_fetch_ref(ibuf, &want[nref], refname);
 		if (err)