From: Stefan Sperling Subject: fix 'got fetch' downloading too many objects To: gameoftrees@openbsd.org Date: Mon, 27 Sep 2021 19:52:45 +0200 I've observed 'got fetch' downloading a much larger pack than expected. This problem has been known for some time, and I have now figured out what is going wrong. In the case observed the patch below reduces the number of objects fetched from 57034 to 1147. And the size of the downloaded pack changed from >100 MB to 2MB. got fetch makes a mistake when advertising references contained in the local repository to the server. Unless we are running in mirror mode, we only advertise refs/tags/ and any branches in refs/remotes/ which correspond to the branches we want to fetch. Other existing references are not advertised. So, for example, when a repository is switched out of mirror mode and we are now fetching 'refs/heads/main', got fetch tells the server that it does not have any objects yet. There is no refs/remotes/origin/main which would point at relevant objecs because we were in mirror mode before. The fact that there already is a copy of many objects reachable via refs/heads/main (created in mirror mode) is ignored. The fix is to simply tell the server about everything we have, no matter which references we want to download. The code is already set up to distinguish between references which we have and references we want to fetch. The server will now always send us a minimal pack file. ok? diff 088449d31db27c8682d5e9dc737d92d05df6605e /home/stsp/src/got blob - 5206b456ca799c99ae1e3454f99d30f377a63361 file + lib/fetch.c --- lib/fetch.c +++ lib/fetch.c @@ -124,8 +124,6 @@ got_fetch_pack(struct got_object_id **pack_hash, struc off_t packfile_size = 0; struct got_packfile_hdr pack_hdr; uint32_t nobj = 0; - char *ref_prefix = NULL; - size_t ref_prefixlen = 0; char *path; char *progress = NULL; @@ -153,13 +151,6 @@ got_fetch_pack(struct got_object_id **pack_hash, struc TAILQ_INIT(&have_refs); TAILQ_INIT(&my_refs); - if (!mirror_references) { - if (asprintf(&ref_prefix, "refs/remotes/%s/", - remote_name) == -1) - return got_error_from_errno("asprintf"); - ref_prefixlen = strlen(ref_prefix); - } - if (!list_refs_only) { err = got_ref_list(&my_refs, repo, NULL, got_ref_cmp_by_name, NULL); @@ -174,60 +165,17 @@ got_fetch_pack(struct got_object_id **pack_hash, struc if (got_ref_is_symbolic(re->ref)) continue; - refname = got_ref_get_name(re->ref); - - if (mirror_references) { - char *name; - err = got_ref_resolve(&id, repo, re->ref); - if (err) - goto done; - name = strdup(refname); - if (name == NULL) { - err = got_error_from_errno("strdup"); - goto done; - } - err = got_pathlist_append(&have_refs, name, id); - if (err) - goto done; - continue; + err = got_ref_resolve(&id, repo, re->ref); + if (err) + goto done; + refname = strdup(got_ref_get_name(re->ref)); + if (refname == NULL) { + err = got_error_from_errno("strdup"); + goto done; } - - if (strncmp("refs/tags/", refname, 10) == 0) { - char *tagname; - - err = got_ref_resolve(&id, repo, re->ref); - if (err) - goto done; - tagname = strdup(refname); - if (tagname == NULL) { - err = got_error_from_errno("strdup"); - goto done; - } - err = got_pathlist_append(&have_refs, tagname, id); - if (err) { - free(tagname); - goto done; - } - } - - if (strncmp(ref_prefix, refname, ref_prefixlen) == 0) { - char *branchname; - - err = got_ref_resolve(&id, repo, re->ref); - if (err) - goto done; - - if (asprintf(&branchname, "refs/heads/%s", - refname + ref_prefixlen) == -1) { - err = got_error_from_errno("asprintf"); - goto done; - } - err = got_pathlist_append(&have_refs, branchname, id); - if (err) { - free(branchname); - goto done; - } - } + err = got_pathlist_append(&have_refs, refname, id); + if (err) + goto done; } if (list_refs_only) { @@ -577,7 +525,6 @@ done: free(tmpidxpath); free(idxpath); free(packpath); - free(ref_prefix); free(progress); TAILQ_FOREACH(pe, &have_refs, entry) {