From: Mark Jamsek Subject: Re: always fetch remote HEAD except when -b is used To: Christian Weisgerber , gameoftrees@openbsd.org Date: Thu, 16 Feb 2023 19:28:40 +1100 On 23-02-14 04:48PM, Stefan Sperling wrote: > On Tue, Feb 14, 2023 at 04:17:09PM +0100, Christian Weisgerber wrote: > > Mark Jamsek: > > > > > As discussed with stsp on irc, this tweaks 'got fetch' again so that the > > > branch resolved via the remote's HEAD ref is not just a fallback--it is > > > always fetched except in the 'got fetch -b branch' case. > > > > > > This necessitated passing the bflag to got_fetch_pack from cmd_clone(), > > > too, so that we don't fetch HEAD when 'got clone -b branch' is used. > > > > I'm struggling to make sense of the semantics here. > > > > $ got clone -b stable/13 ssh://anongit@git.freebsd.org/src.git > > ... > > $ cd src.git > > $ got ref -l | fgrep -v /tags/ > > HEAD: refs/heads/stable/13 > > refs/heads/stable/13: acaed6b1e2859cb6726e35e8c937b192c16a95c8 > > refs/remotes/origin/stable/13: acaed6b1e2859cb6726e35e8c937b192c16a95c8 > > $ got fetch > > ... > > $ got ref -l | fgrep -v /tags/ > > HEAD: refs/heads/stable/13 > > refs/heads/main: 825fbd087e6150eaf601612a5e7468ddc808e004 > > refs/heads/stable/13: acaed6b1e2859cb6726e35e8c937b192c16a95c8 > > refs/remotes/origin/HEAD: refs/remotes/origin/main > > refs/remotes/origin/main: 825fbd087e6150eaf601612a5e7468ddc808e004 > > refs/remotes/origin/stable/13: acaed6b1e2859cb6726e35e8c937b192c16a95c8 > > > > Presumably, if I limit "clone" to a particular branch, that's all > > I'm interested in, and I don't expect the next "fetch" to add the > > remote HEAD branch. > > Fetching HEAD by default prevents people from potentially working > on a stale main branch, e.g. when a project switches to a different > main branch and updates HEAD accordingly. > This can happen for various reasons, such as master -> main renames > which were common over the last few years. And in our own -portable > repository the main branch is still called "linux" for historical > reasons, which could change in the future. > Our assumption was that everything else will generally be close to or > derived from HEAD, and that fetching the extra data should not be a burden. > > But you present a use case which I did not consider. > Presumably in this situation a stable-only clone will receive less changes > over time than HEAD does, so your repository now stores more commits than > actually needed (to fix it, remove the references you don't want and then > run 'git gc', potentially with an option like --prune=now). > > Regarding got fetch behaviour, I see several ways forward: The below diff implements stsp's suggestion: > We could fetch HEAD by default only if the symref refs/remotes/foo/HEAD > already exits, and if its target no longer matches the remote HEAD. > This would preserve the behaviour you want for -stable repositories > while also satisfying the above concern. I haven't adjusted regress yet because it gives a good idea of what this change does, and I'd also like to make sure we've got the desired behaviour right before adjusting regress. All tests pass except of course three of the fetch tests: got-test-fetch_branch foo is now the default HEAD branch in $testroot/repo got.conf still says to fetch "master" we expect master but because HEAD's target changed we fetch foo got-test-fetch_update_headref we change HEAD from master to foo: got ref -r $testroot/repo -s refs/heads/foo HEAD but we still expect to fetch HEAD -> master got-test-fetch_honor_wt_conf_bflag remote HEAD has changed from master to boo master is up-to-date so we expect to fetch no updates but we fetch the new HEAD boo (The $LINENO regress diff really helps in this case :) In addition to only fetching the remote HEAD if it already exists and its target has changed, we also fetch HEAD if -b is not used and either there are no branches set in got.conf and there's no work tree, or said branches are not found on the server. This, imo, gives us the key behaviour we were initially after that in part led to these fetch tweaks, while also allowing for naddy's use case in the OP; that is: - user's won't inadvertently miss it if HEAD changes - fetch will not fail due to not finding branches except if -b is used - 'got fetch' will only fetch if you cloned -b So I think this idea of stsp's really is the best so far! diffstat /home/mark/src/got M got/got.1 | 4+ 0- M got/got.c | 53+ 9- M include/got_fetch.h | 2+ 2- M lib/fetch.c | 3+ 2- M lib/got_lib_privsep.h | 3+ 1- M lib/privsep.c | 15+ 5- M libexec/got-fetch-pack/got-fetch-pack.c | 38+ 11- 7 files changed, 118 insertions(+), 30 deletions(-) diff /home/mark/src/got commit - ac51614e087279834159667c6f71fffe546cffa6 path + /home/mark/src/got blob - d9209e2ae1dd1c1aa01e368bc0016cf86830a9e7 file + got/got.1 --- got/got.1 +++ got/got.1 @@ -358,6 +358,10 @@ This default behaviour can be overridden with the .Xr got.conf 5 or via a work tree, or said branches are not found on the server, a branch resolved via the remote repository's HEAD reference will be fetched. +Likewise, if a HEAD reference for the +.Ar remote-repository +exists but its target no longer matches the remote HEAD, then +the new target branch will be fetched. This default behaviour can be overridden with the .Fl a and blob - 1ef76ea95b69bf7e45b91f9053a232d241269e7d file + got/got.c --- got/got.c +++ got/got.c @@ -1544,7 +1544,7 @@ cmd_clone(int argc, char *argv[]) struct got_fetch_progress_arg fpa; char *git_url = NULL; int verbosity = 0, fetch_all_branches = 0, mirror_references = 0; - int list_refs_only = 0; + int bflag = 0, list_refs_only = 0; int *pack_fds = NULL; TAILQ_INIT(&refs); @@ -1562,6 +1562,7 @@ cmd_clone(int argc, char *argv[]) optarg, NULL); if (error) return error; + bflag = 1; break; case 'l': list_refs_only = 1; @@ -1717,7 +1718,7 @@ cmd_clone(int argc, char *argv[]) error = got_fetch_pack(&pack_hash, &refs, &symrefs, GOT_FETCH_DEFAULT_REMOTE_NAME, mirror_references, fetch_all_branches, &wanted_branches, &wanted_refs, - list_refs_only, verbosity, fetchfd, repo, NULL, 0, + list_refs_only, verbosity, fetchfd, repo, NULL, NULL, bflag, fetch_progress, &fpa); if (error) goto done; @@ -2282,6 +2283,8 @@ cmd_fetch(int argc, char *argv[]) const struct got_gotconfig *repo_conf = NULL, *worktree_conf = NULL; struct got_pathlist_head refs, symrefs, wanted_branches, wanted_refs; struct got_pathlist_entry *pe; + struct got_reflist_head remote_refs; + struct got_reflist_entry *re; struct got_object_id *pack_hash = NULL; int i, ch, fetchfd = -1, fetchstatus; pid_t fetchpid = -1; @@ -2289,10 +2292,11 @@ cmd_fetch(int argc, char *argv[]) int verbosity = 0, fetch_all_branches = 0, list_refs_only = 0; int delete_refs = 0, replace_tags = 0, delete_remote = 0; int *pack_fds = NULL, have_bflag = 0; - const char *worktree_branch = NULL; + const char *remote_head = NULL, *worktree_branch = NULL; TAILQ_INIT(&refs); TAILQ_INIT(&symrefs); + TAILQ_INIT(&remote_refs); TAILQ_INIT(&wanted_branches); TAILQ_INIT(&wanted_refs); @@ -2531,12 +2535,51 @@ cmd_fetch(int argc, char *argv[]) if (error) goto done; - if (worktree && !have_bflag) { - const char *refname; + if (!have_bflag) { + /* + * If set, get this remote's HEAD ref target so if + * its target has changed on the server we can fetch it. + */ + error = got_ref_list(&remote_refs, repo, "refs/remotes", + got_ref_cmp_by_name, repo); + if (error) + goto done; - refname = got_worktree_get_head_ref_name(worktree); - if (strncmp(refname, "refs/heads/", 11) == 0) - worktree_branch = refname; + TAILQ_FOREACH(re, &remote_refs, entry) { + const char *remote_refname, *remote_target; + size_t remote_name_len; + + if (!got_ref_is_symbolic(re->ref)) + continue; + + remote_name_len = strlen(remote->name); + remote_refname = got_ref_get_name(re->ref); + + /* we only want refs/remotes/$remote->name/HEAD */ + if (strncmp(remote_refname + 13, remote->name, + remote_name_len) != 0) + continue; + + if (strcmp(remote_refname + remote_name_len + 14, + GOT_REF_HEAD) != 0) + continue; + + /* + * Take the name itself because we already + * only match with refs/heads/ in fetch_pack(). + */ + remote_target = got_ref_get_symref_target(re->ref); + remote_head = remote_target + remote_name_len + 14; + break; + } + + if (worktree) { + const char *refname; + + refname = got_worktree_get_head_ref_name(worktree); + if (strncmp(refname, "refs/heads/", 11) == 0) + worktree_branch = refname; + } } fpa.last_scaled_size[0] = '\0'; @@ -2551,7 +2594,7 @@ cmd_fetch(int argc, char *argv[]) error = got_fetch_pack(&pack_hash, &refs, &symrefs, remote->name, remote->mirror_references, fetch_all_branches, &wanted_branches, &wanted_refs, list_refs_only, verbosity, fetchfd, repo, - worktree_branch, have_bflag, fetch_progress, &fpa); + worktree_branch, remote_head, have_bflag, fetch_progress, &fpa); if (error) goto done; @@ -2730,6 +2773,7 @@ done: got_pathlist_free(&symrefs, GOT_PATHLIST_FREE_ALL); got_pathlist_free(&wanted_branches, GOT_PATHLIST_FREE_NONE); got_pathlist_free(&wanted_refs, GOT_PATHLIST_FREE_NONE); + got_ref_list_free(&remote_refs); free(id_str); free(cwd); free(repo_path); blob - 5768c9157c81ca12b00f65491c540dd712c37e6a file + include/got_fetch.h --- include/got_fetch.h +++ include/got_fetch.h @@ -46,5 +46,5 @@ const struct got_error *got_fetch_pack(struct got_obje const struct got_error *got_fetch_pack(struct got_object_id **, struct got_pathlist_head *, struct got_pathlist_head *, const char *, int, int, struct got_pathlist_head *, struct got_pathlist_head *, - int, int, int, struct got_repository *, const char *, int, - got_fetch_progress_cb, void *); + int, int, int, struct got_repository *, const char *, const char *, + int, got_fetch_progress_cb, void *); blob - e37808ed2a0bf37771cbf76e39cfa92b60ad43cc file + lib/fetch.c --- lib/fetch.c +++ lib/fetch.c @@ -105,7 +105,8 @@ got_fetch_pack(struct got_object_id **pack_hash, struc struct got_pathlist_head *wanted_branches, struct got_pathlist_head *wanted_refs, int list_refs_only, int verbosity, int fetchfd, struct got_repository *repo, const char *worktree_refname, - int no_head, got_fetch_progress_cb progress_cb, void *progress_arg) + const char *remote_head, int no_head, got_fetch_progress_cb progress_cb, + void *progress_arg) { size_t i; int imsg_fetchfds[2], imsg_idxfds[2]; @@ -261,7 +262,7 @@ got_fetch_pack(struct got_object_id **pack_hash, struc } err = got_privsep_send_fetch_req(&fetchibuf, nfetchfd, &have_refs, fetch_all_branches, wanted_branches, wanted_refs, - list_refs_only, worktree_refname, no_head, verbosity); + list_refs_only, worktree_refname, remote_head, no_head, verbosity); if (err != NULL) goto done; nfetchfd = -1; blob - b8170cb9afee37b2625c705535e43662ca7dd06c file + lib/got_lib_privsep.h --- lib/got_lib_privsep.h +++ lib/got_lib_privsep.h @@ -409,10 +409,12 @@ struct got_imsg_fetch_request { int list_refs_only; int verbosity; size_t worktree_branch_len; + size_t remote_head_len; size_t n_have_refs; size_t n_wanted_branches; size_t n_wanted_refs; /* Followed by worktree_branch_len bytes of reference name. */ + /* Followed by remote_head_len bytes of reference name. */ /* Followed by n_have_refs GOT_IMSG_FETCH_HAVE_REF messages. */ /* Followed by n_wanted_branches times GOT_IMSG_FETCH_WANTED_BRANCH. */ /* Followed by n_wanted_refs times GOT_IMSG_FETCH_WANTED_REF. */ @@ -701,7 +703,7 @@ const struct got_error *got_privsep_send_fetch_req(str int *, int *, struct imsgbuf *ibuf); const struct got_error *got_privsep_send_fetch_req(struct imsgbuf *, int, struct got_pathlist_head *, int, struct got_pathlist_head *, - struct got_pathlist_head *, int, const char *, int, int); + struct got_pathlist_head *, int, const char *, const char *, int, int); const struct got_error *got_privsep_send_fetch_outfd(struct imsgbuf *, int); const struct got_error *got_privsep_recv_fetch_progress(int *, struct got_object_id **, char **, struct got_pathlist_head *, char **, blob - ed227bda3d7a4ce423d882c19ae47793a302c675 file + lib/privsep.c --- lib/privsep.c +++ lib/privsep.c @@ -532,19 +532,23 @@ got_privsep_send_fetch_req(struct imsgbuf *ibuf, int f struct got_pathlist_head *have_refs, int fetch_all_branches, struct got_pathlist_head *wanted_branches, struct got_pathlist_head *wanted_refs, int list_refs_only, - const char *worktree_branch, int no_head, int verbosity) + const char *worktree_branch, const char *remote_head, + int no_head, int verbosity) { const struct got_error *err = NULL; struct ibuf *wbuf; - size_t len, worktree_branch_len; struct got_pathlist_entry *pe; struct got_imsg_fetch_request fetchreq; + size_t remote_head_len, worktree_branch_len, len = sizeof(fetchreq); if (worktree_branch) { worktree_branch_len = strlen(worktree_branch); - len = sizeof(fetchreq) + worktree_branch_len; - } else - len = sizeof(fetchreq); + len += worktree_branch_len; + } + if (remote_head) { + remote_head_len = strlen(remote_head); + len += remote_head_len; + } if (len >= MAX_IMSGSIZE - IMSG_HEADER_SIZE) { close(fd); @@ -562,6 +566,8 @@ got_privsep_send_fetch_req(struct imsgbuf *ibuf, int f fetchreq.verbosity = verbosity; if (worktree_branch != NULL) fetchreq.worktree_branch_len = worktree_branch_len; + if (remote_head != NULL) + fetchreq.remote_head_len = remote_head_len; TAILQ_FOREACH(pe, have_refs, entry) fetchreq.n_have_refs++; TAILQ_FOREACH(pe, wanted_branches, entry) @@ -574,6 +580,10 @@ got_privsep_send_fetch_req(struct imsgbuf *ibuf, int f if (imsg_add(wbuf, worktree_branch, worktree_branch_len) == -1) return got_error_from_errno("imsg_add FETCH_REQUEST"); } + if (remote_head) { + if (imsg_add(wbuf, remote_head, remote_head_len) == -1) + return got_error_from_errno("imsg_add FETCH_REQUEST"); + } wbuf->fd = fd; imsg_close(ibuf, wbuf); blob - 2a22df95ff37a15b31f41c7c01f263af824ddc15 file + libexec/got-fetch-pack/got-fetch-pack.c --- libexec/got-fetch-pack/got-fetch-pack.c +++ libexec/got-fetch-pack/got-fetch-pack.c @@ -333,7 +333,8 @@ fetch_pack(int fd, int packfd, uint8_t *pack_sha1, struct got_pathlist_head *have_refs, int fetch_all_branches, struct got_pathlist_head *wanted_branches, struct got_pathlist_head *wanted_refs, int list_refs_only, - const char *worktree_branch, int no_head, struct imsgbuf *ibuf) + const char *worktree_branch, const char *remote_head, + int no_head, struct imsgbuf *ibuf) { const struct got_error *err = NULL; char buf[GOT_PKT_MAX]; @@ -501,14 +502,30 @@ fetch_pack(int fd, int packfd, uint8_t *pack_sha1, if (list_refs_only) goto done; - if (!found_branch && !no_head && default_branch && default_id_str && + /* + * If -b was not used and either none of the requested branches + * (got.conf, worktree) were found or the client already has the + * remote HEAD symref but its target changed, fetch remote's HEAD. + */ + if (!no_head && default_branch && default_id_str && strncmp(default_branch, "refs/heads/", 11) == 0) { - err = fetch_ref(ibuf, have_refs, &have[nref], - &want[nref], default_branch, default_id_str); - if (err) - goto done; - nref++; - found_branch = 1; + int remote_head_changed = 0; + + if (remote_head) { + char *headname; + + headname = strrchr(default_branch, '/'); + if (headname && strcmp(remote_head, headname + 1) != 0) + remote_head_changed = 1; + } + + if (!found_branch || remote_head_changed) { + err = fetch_ref(ibuf, have_refs, &have[nref], + &want[nref], default_branch, default_id_str); + if (err) + goto done; + nref++; + } } /* Abort if we haven't found anything to fetch. */ @@ -834,7 +851,7 @@ main(int argc, char **argv) struct got_imsg_fetch_wanted_branch wbranch; struct got_imsg_fetch_wanted_ref wref; size_t datalen, i; - char *worktree_branch = NULL; + char *remote_head = NULL, *worktree_branch = NULL; #if 0 static int attached; while (!attached) @@ -875,7 +892,7 @@ main(int argc, char **argv) fetchfd = imsg.fd; if (datalen != sizeof(fetch_req) + - fetch_req.worktree_branch_len) { + fetch_req.worktree_branch_len + fetch_req.remote_head_len) { err = got_error(GOT_ERR_PRIVSEP_LEN); goto done; } @@ -889,6 +906,15 @@ main(int argc, char **argv) } } + if (fetch_req.remote_head_len != 0) { + remote_head = strndup(imsg.data + sizeof(fetch_req) + + fetch_req.worktree_branch_len, fetch_req.remote_head_len); + if (remote_head == NULL) { + err = got_error_from_errno("strndup"); + goto done; + } + } + imsg_free(&imsg); if (fetch_req.verbosity > 0) @@ -1045,9 +1071,10 @@ main(int argc, char **argv) err = fetch_pack(fetchfd, packfd, pack_sha1, &have_refs, fetch_req.fetch_all_branches, &wanted_branches, &wanted_refs, fetch_req.list_refs_only, - worktree_branch, fetch_req.no_head, &ibuf); + worktree_branch, remote_head, fetch_req.no_head, &ibuf); done: free(worktree_branch); + free(remote_head); got_pathlist_free(&have_refs, GOT_PATHLIST_FREE_ALL); got_pathlist_free(&wanted_branches, GOT_PATHLIST_FREE_PATH); if (fetchfd != -1 && close(fetchfd) == -1 && err == NULL) -- Mark Jamsek GPG: F2FF 13DE 6A06 C471 CA80 E6E2 2930 DC66 86EE CF68