From: Stefan Sperling Subject: make 'got send' detect closed connections To: gameoftrees@openbsd.org Date: Tue, 12 Sep 2023 21:17:38 +0200 I am seeing an issue while trying to send changes to gothub, where the gotsh process on the server runs into an error (cause unknown). This in turn triggers a problem where 'got send' does not realize that unnecessary work is being done: $ got send -b qwx gothub Connecting to "gothub" ssh://stsp@gothub.org/src.git Host key fingerprint is SHA256:HT7ohAIc4knlmUm3vFUyyK9/DW/sFccZuPG0oNwBWYA 16419 commits coloredgotsh: operation timed out 228961 commits colored; 48 objects found; 7581 trees scanned packing 1 reference; 48 objects; deltify: 100%; writing pack: 109K 100% got: privsep peer process closed pipe $ There are two errors in the above trace: 1) gotsh: operation timed out 2) got: privsep peer process closed pipe The first error happens while we are coloring commits. We do not notice the problem at that point, however. We keep building the pack file, upload it (not sure where uploaded data goes -- perhaps ssh discards it?) and then we finally see a problem. An ssh debug trace shows that ssh detects the connection being closed when gotsh reports error 1: [[[ got-send-pack: updating refs/heads/qwx c351eb807cdb2a9571bc3dde12654f23949e7c45 -> 903db0c03af248b610c1338fc1340e55bfa9074a got-send-pack: writepkt: 0000 16785 commits coloreddebug2: channel 0: rcvd ext data 27 gotsh: operation timed out debug2: channel 0: written 27 to efd 6 debug2: channel 0: rcvd eof debug2: channel 0: output open -> drain debug2: channel 0: obuf empty ]]] The patch below makes 'got send' detect the dead connection while generating the pack file. Is there a better way than regularly polling the file descriptor to detect a POLLHUP? I don't have a better idea. It is effective though: $ got send -b qwx gothub Connecting to "gothub" ssh://stsp@gothub.org/src.git Host key fingerprint is SHA256:HT7ohAIc4knlmUm3vFUyyK9/DW/sFccZuPG0oNwBWYA 15062 commits coloredgotsh: operation timed out 18982 commits colored got: server unexpectedly closed the connection $ Tests keep passing with this diff. I don't think it makes sense to add test coverage for this corner case. It is largely a cosmetic problem. ok? ----------------------------------------------- make 'got send' detect connections that are unexpectedly closed by the server diff 0ec0e499ac07b3fa388b786eb0cb489bd810df26 8f2e9307cd98624ff5f6f213470917a1afc0c923 commit - 0ec0e499ac07b3fa388b786eb0cb489bd810df26 commit + 8f2e9307cd98624ff5f6f213470917a1afc0c923 blob - 8c63c1485e235199cc2ba118a4a8617670a97c05 blob + 1a570fe9fa2c3408f2ffec0964979a5d0a37d4ef --- lib/send.c +++ lib/send.c @@ -70,6 +70,7 @@ #include "got_lib_pack_create.h" #include "got_lib_dial.h" #include "got_lib_worktree_cvg.h" +#include "got_lib_poll.h" #ifndef nitems #define nitems(_a) (sizeof((_a)) / sizeof((_a)[0])) @@ -108,6 +109,7 @@ got_send_connect(pid_t *sendpid, int *sendfd, const ch struct pack_progress_arg { got_send_progress_cb progress_cb; void *progress_arg; + int sendfd; int ncolored; int nfound; @@ -133,6 +135,19 @@ pack_progress(void *arg, int ncolored, int nfound, int if (err) return err; + /* + * Detect the server closing our connection while we are + * busy creating a pack file. + */ + err = got_poll_fd(a->sendfd, 0, 0); + if (err && err->code != GOT_ERR_TIMEOUT) { + if (err->code == GOT_ERR_EOF) { + err = got_error_msg(GOT_ERR_EOF, + "server unexpectedly closed the connection"); + } + return err; + } + a->ncolored= ncolored; a->nfound = nfound; a->ntrees = ntrees; @@ -591,6 +606,7 @@ got_send_pack(const char *remote_name, struct got_path memset(&ppa, 0, sizeof(ppa)); ppa.progress_cb = progress_cb; ppa.progress_arg = progress_arg; + ppa.sendfd = sendfd; err = got_pack_create(packsha1, packfd, delta_cache, their_ids, ntheirs, our_ids, nours, repo, 0, 1, 0, pack_progress, &ppa, &rl, cancel_cb, cancel_arg);