"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: gotd: wait asynchronously for child termination
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Thu, 22 Jun 2023 17:13:38 +0200

Download raw body.

Thread
On Thu, Jun 22, 2023 at 04:53:14PM +0200, Omar Polo wrote:
> diff /home/op/w/got
> commit - 9b9bad55e7a411af3e01d9e4a86c127876184726
> path + /home/op/w/got
> blob - 39b1d2a54632934de2bfaf127e20f0b0f725f4c8
> file + gotd/gotd.c
> --- gotd/gotd.c
> +++ gotd/gotd.c
> @@ -80,7 +80,11 @@ struct gotd_child_proc {
>  	char				 repo_path[PATH_MAX];
>  	int				 pipe[2];
>  	struct gotd_imsgev		 iev;
> +	struct event			 tmo;
> +
> +	TAILQ_ENTRY(gotd_child_proc)	 procs;

By convention used so far, the above variable should be called 'entry'.

> -static void
>  proc_done(struct gotd_child_proc *proc)
>  {
> -	event_del(&proc->iev.ev);
> -	msgbuf_clear(&proc->iev.ibuf.w);
> -	close(proc->iev.ibuf.fd);
> -	kill_proc(proc, 0);
> -	wait_for_child(proc->pid);
> +	struct gotd_client *clt;

Could you rename 'clt' to 'client' as used elsewhere?

> +
> +	TAILQ_REMOVE(&procs, proc, procs);
> +
> +	clt = find_client_by_proc_fd(proc->iev.ibuf.fd);
> +	if (clt != NULL) {
> +		if (proc == clt->repo)
> +			clt->repo = NULL;
> +		if (proc == clt->auth)
> +			clt->auth = NULL;
> +		if (proc == clt->session)
> +			clt->session = NULL;
> +		disconnect_on_error(clt, got_error(GOT_ERR_PRIVSEP_DIED));

I believe this error might arrive at the client user's terminal, and
they would see "unprivileged process died unexpectedly", even in the
code path where the child process terminates normally.

I would suggest to either use GOT_ERR_EOF which will be logged but
not be sent to the user, or simply call disconnect() directly.

> @@ -745,6 +735,20 @@ kill_proc(struct gotd_child_proc *proc, int fatal)
>  static void
>  kill_proc(struct gotd_child_proc *proc, int fatal)
>  {
> +	struct timeval tv = { 5, 0 };
> +
> +	log_debug("kill -%d %d", fatal ? SIGKILL : SIGTERM, proc->pid);
> +
> +	if (proc->iev.ibuf.fd != -1) {
> +		event_del(&proc->iev.ev);
> +		msgbuf_clear(&proc->iev.ibuf.w);
> +		close(proc->iev.ibuf.fd);
> +		proc->iev.ibuf.fd = -1;
> +	}
> +
> +	if (!evtimer_pending(&proc->tmo, NULL))
> +		evtimer_add(&proc->tmo, &tv);
> +

The above check needs && !fatal, otherwise we will schedule the
timeout again after it has already fired to trigger kill -9.

>  static void
> +kill_proc_timeout(int fd, short ev, void *d)
> +{
> +	struct gotd_child_proc *proc = d;
> +
> +	log_warnx("wait timeout for PID %d terminated", proc->pid);
> +	kill_proc(proc, 1);

I would rephrase the above as something like:
  "timeout waiting for PID %d to terminate; retrying with force"

> +static struct gotd_child_proc *
> +find_proc_by_pid(pid_t pid)
> +{
> +	struct gotd_child_proc *proc;

Must initialize to NULL above, otherwise we return garbage below if no
PID matches.

> +
> +	TAILQ_FOREACH(proc, &procs, procs)
> +		if (proc->pid == pid)
> +			break;
> +
> +	return proc;
> +}

> @@ -1508,6 +1560,13 @@ start_listener(char *argv0, const char *confpath, int 
>  	if (proc == NULL)
>  		fatal("calloc");
>  
> +	TAILQ_INSERT_HEAD(&procs, proc, procs);
> +
> +	/*
> +	 * XXX start_listener is called before event_init() so can't
> +	 * initialize proc->tmo here.
> +	 */
> +
>  	proc->type = PROC_LISTEN;

I would remove the XXX, I suspect that not calling event_init() before
starting the listener was deliberate. At least I vaguely recall having
problems with forking and event_init().

Perhaps mention that the timeout will be initialized in main() instead?