"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
make 'got send' detect closed connections
To:
gameoftrees@openbsd.org
Date:
Tue, 12 Sep 2023 21:17:38 +0200

Download raw body.

Thread
I am seeing an issue while trying to send changes to gothub, where
the gotsh process on the server runs into an error (cause unknown).

This in turn triggers a problem where 'got send' does not realize
that unnecessary work is being done:

$ got send -b qwx gothub
Connecting to "gothub" ssh://stsp@gothub.org/src.git
Host key fingerprint is SHA256:HT7ohAIc4knlmUm3vFUyyK9/DW/sFccZuPG0oNwBWYA
16419 commits coloredgotsh: operation timed out
228961 commits colored; 48 objects found; 7581 trees scanned
packing 1 reference; 48 objects; deltify: 100%; writing pack:  109K 100%
got: privsep peer process closed pipe
$ 

There are two errors in the above trace:

 1) gotsh: operation timed out
 2) got: privsep peer process closed pipe

The first error happens while we are coloring commits. We do not notice the
problem at that point, however. We keep building the pack file, upload it
(not sure where uploaded data goes -- perhaps ssh discards it?) and then
we finally see a problem.

An ssh debug trace shows that ssh detects the connection being closed
when gotsh reports error 1:

[[[
got-send-pack: updating refs/heads/qwx c351eb807cdb2a9571bc3dde12654f23949e7c45 -> 903db0c03af248b610c1338fc1340e55bfa9074a
got-send-pack: writepkt: 0000
16785 commits coloreddebug2: channel 0: rcvd ext data 27
gotsh: operation timed out
debug2: channel 0: written 27 to efd 6
debug2: channel 0: rcvd eof
debug2: channel 0: output open -> drain
debug2: channel 0: obuf empty
]]]

The patch below makes 'got send' detect the dead connection while
generating the pack file.
Is there a better way than regularly polling the file descriptor to
detect a POLLHUP? I don't have a better idea. It is effective though:

$ got send -b qwx gothub
Connecting to "gothub" ssh://stsp@gothub.org/src.git
Host key fingerprint is SHA256:HT7ohAIc4knlmUm3vFUyyK9/DW/sFccZuPG0oNwBWYA
15062 commits coloredgotsh: operation timed out
18982 commits colored
got: server unexpectedly closed the connection
$

Tests keep passing with this diff. I don't think it makes sense to add
test coverage for this corner case. It is largely a cosmetic problem.

ok?


-----------------------------------------------
 make 'got send' detect connections that are unexpectedly closed by the server
 
diff 0ec0e499ac07b3fa388b786eb0cb489bd810df26 8f2e9307cd98624ff5f6f213470917a1afc0c923
commit - 0ec0e499ac07b3fa388b786eb0cb489bd810df26
commit + 8f2e9307cd98624ff5f6f213470917a1afc0c923
blob - 8c63c1485e235199cc2ba118a4a8617670a97c05
blob + 1a570fe9fa2c3408f2ffec0964979a5d0a37d4ef
--- lib/send.c
+++ lib/send.c
@@ -70,6 +70,7 @@
 #include "got_lib_pack_create.h"
 #include "got_lib_dial.h"
 #include "got_lib_worktree_cvg.h"
+#include "got_lib_poll.h"
 
 #ifndef nitems
 #define nitems(_a)	(sizeof((_a)) / sizeof((_a)[0]))
@@ -108,6 +109,7 @@ got_send_connect(pid_t *sendpid, int *sendfd, const ch
 struct pack_progress_arg {
     got_send_progress_cb progress_cb;
     void *progress_arg;
+    int sendfd;
 
     int ncolored;
     int nfound;
@@ -133,6 +135,19 @@ pack_progress(void *arg, int ncolored, int nfound, int
 	if (err)
 		return err;
 
+	/*
+	 * Detect the server closing our connection while we are
+	 * busy creating a pack file.
+	 */
+	err = got_poll_fd(a->sendfd, 0, 0);
+	if (err && err->code != GOT_ERR_TIMEOUT) {
+		if (err->code == GOT_ERR_EOF) {
+			err = got_error_msg(GOT_ERR_EOF,
+			    "server unexpectedly closed the connection");
+		}
+		return err;
+	}
+	 
 	a->ncolored= ncolored;
 	a->nfound = nfound;
 	a->ntrees = ntrees;
@@ -591,6 +606,7 @@ got_send_pack(const char *remote_name, struct got_path
 		memset(&ppa, 0, sizeof(ppa));
 		ppa.progress_cb = progress_cb;
 		ppa.progress_arg = progress_arg;
+		ppa.sendfd = sendfd;
 		err = got_pack_create(packsha1, packfd, delta_cache,
 		    their_ids, ntheirs, our_ids, nours, repo, 0, 1, 0,
 		    pack_progress, &ppa, &rl, cancel_cb, cancel_arg);