"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Tracey Emery <tracey@traceyemery.net>
Subject:
Re: struct for priv_fds
To:
gameoftrees@openbsd.org
Date:
Mon, 30 May 2022 09:20:29 -0600

Download raw body.

Thread
On Mon, May 30, 2022 at 04:35:10PM +0200, Stefan Sperling wrote:
> On Mon, May 30, 2022 at 07:55:39AM -0600, Tracey Emery wrote:
> > Then, what about something like below as a starting point? An array can
> > be loaded up and this would give us an association for each fd.
> 
> The overall direction is unclear to me. More below:
> 
> > diff d6a28ffe187127e3247254d7e242bb52d66eb26b /home/tracey/src/got
> > blob - b6e44e8b40b8476a471c53fd10cffd4a7ff3b32d
> > file + include/got_repository.h
> > --- include/got_repository.h
> > +++ include/got_repository.h
> > @@ -177,3 +177,12 @@ const struct got_error *got_repo_get_loose_object_info
> >  /* Obtain the number and size of packed objects in the repository. */
> >  const struct got_error *got_repo_get_packfile_info(int *npackfiles,
> >      int *nobjects, off_t *total_packsize, struct got_repository *);
> > +
> > +/* Privileged file descriptors */
> > +enum got_priv_fds {
> > +	DIFF_FD,
> > +	BLAME_FD,
> > +	PACK_BASE_FD,
> > +	PACK_ACCUM_FD,
> > +	PRIV_FDS__MAX,
> > +};
> > 
> 
> There was the diff below which I wrote to start moving use of
> got_opentemp() out of files in the lib/ directory.
> 
> The idea here is to prepare two file descriptors for each running
> pack file helper at got_repo_open() time. The number of tempfiles we
> keep open is constant, because helpers which die pass their open
> tempfiles on to new helpers that take over when we switch to reading
> a different pack.
> 
> As a future extension to this, we could require callers of got_repo_open()
> to pass an array of open tempfiles, instead of having got_repo_open()
> call got_opentempfd() internally. The patch below is just the first
> step in that direction, since the files are still being opened inside
> the function itself.
> 
> This patch only addresses the base/accum tempfile situation. But the
> same idea could be applied to blame/diff/etc. Over time, all frontends
> would be tweaked to provide the necessary tempfiles upfront, depending
> on the operation they want to run. And gotwebd would do the same.
> 
> The current APIs clearly don't fit with what gotwebd needs, because they
> manage resources which must be managed by gotwebd instead. I think a good
> way to fix this is to work on pushing each got_opentemp() call further out
> to the API boundary. By doing so we learn how many open files the callers
> will need to provide in each case.
> 
> I don't see why you need the above enum in got_repository.h.
> In the end, gotwebd would work just like the other frontends, and call the
> lib/ APIs with required arguments, some of which will be open file handles,
> or arrays of open file handles.
> Maybe gotwebd needs to keep track of a couple of such arrays and use an
> enum for that purpose. But could it not keep this enum private instead of
> declaring it in got_repository.h, which is pulled in everywhere?
> 

Yes, this does not belong in got_repository.h. Hence, my follow-up email
to think about it some more. :)

It's clear how I need to proceed now. I think we can go ahead and get
your diff in as well. Then, getting got_repo_open adjustments can begin.

Thanks.

> -----------------------------------------------
> commit bd9b38c590a5a0c4e6b47264b667fc5e3e6c4b93 (pack-opentemp)
> from: Stefan Sperling <stsp@stsp.name>
> date: Sat Mar 12 12:43:24 2022 UTC
>  
>  open temporary files needed for delta application in got_repo_open()
>  
>  This prepares for callers of got_repo_open() that cannot afford to
>  open files in /tmp, such as gotwebd. In a follow-up change, we could
>  ask such callers to pass in the required amount of open temporary files.
>  
>  One consequence is that got_repo_open() now requires the "cpath" pledge
>  promise. Add the "cpath" promise to affected callers and remove it once
>  the repository has been opened.
>  
> diff f95923a581dde77a840de5e9f56060e84118b413 9eff67c819a8cc0b0c4db3579999877cde09b335
> blob - 17d637cabb59cb76c220d33c371c8c2a25da94cf
> blob + fbda8cac06e23e0d902b849b50b650436a9dcc11
> --- got/got.c
> +++ got/got.c
> @@ -5810,15 +5810,9 @@ cmd_ref(int argc, char *argv[])
>  		got_path_strip_trailing_slashes(refname);
>  
>  #ifndef PROFILE
> -	if (do_list) {
> -		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> -		    NULL) == -1)
> -			err(1, "pledge");
> -	} else {
> -		if (pledge("stdio rpath wpath cpath fattr flock proc exec "
> -		    "sendfd unveil", NULL) == -1)
> -			err(1, "pledge");
> -	}
> +	if (pledge("stdio rpath wpath cpath fattr flock proc exec "
> +	    "sendfd unveil", NULL) == -1)
> +		err(1, "pledge");
>  #endif
>  	cwd = getcwd(NULL, 0);
>  	if (cwd == NULL) {
> @@ -5852,6 +5846,15 @@ cmd_ref(int argc, char *argv[])
>  	if (error != NULL)
>  		goto done;
>  
> +#ifndef PROFILE
> +	if (do_list) {
> +		/* Remove "cpath" promise. */
> +		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +		    NULL) == -1)
> +			err(1, "pledge");
> +	}
> +#endif
> +
>  	error = apply_unveil(got_repo_get_path(repo), do_list,
>  	    worktree ? got_worktree_get_root_path(worktree) : NULL);
>  	if (error)
> @@ -6178,48 +6181,51 @@ cmd_branch(int argc, char *argv[])
>  		usage_branch();
>  
>  #ifndef PROFILE
> +	if (pledge("stdio rpath wpath cpath fattr flock proc exec "
> +	    "sendfd unveil", NULL) == -1)
> +		err(1, "pledge");
> +#endif
> +	cwd = getcwd(NULL, 0);
> +	if (cwd == NULL) {
> +		error = got_error_from_errno("getcwd");
> +		goto done;
> +	}
> +
> +	if (repo_path == NULL) {
> +		error = got_worktree_open(&worktree, cwd);
> +		if (error && error->code != GOT_ERR_NOT_WORKTREE)
> +			goto done;
> +		else
> +			error = NULL;
> +		if (worktree) {
> +			repo_path =
> +			    strdup(got_worktree_get_repo_path(worktree));
> +			if (repo_path == NULL)
> +				error = got_error_from_errno("strdup");
> +			if (error)
> +				goto done;
> +		} else {
> +			repo_path = strdup(cwd);
> +			if (repo_path == NULL) {
> +				error = got_error_from_errno("strdup");
> +				goto done;
> +			}
> +		}
> +	}
> +
> +	error = got_repo_open(&repo, repo_path, NULL);
> +	if (error != NULL)
> +		goto done;
> +
> +#ifndef PROFILE
>  	if (do_list || do_show) {
> +		/* Remove "cpath" promise. */
>  		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
>  		    NULL) == -1)
>  			err(1, "pledge");
> -	} else {
> -		if (pledge("stdio rpath wpath cpath fattr flock proc exec "
> -		    "sendfd unveil", NULL) == -1)
> -			err(1, "pledge");
>  	}
>  #endif
> -	cwd = getcwd(NULL, 0);
> -	if (cwd == NULL) {
> -		error = got_error_from_errno("getcwd");
> -		goto done;
> -	}
>  
> -	if (repo_path == NULL) {
> -		error = got_worktree_open(&worktree, cwd);
> -		if (error && error->code != GOT_ERR_NOT_WORKTREE)
> -			goto done;
> -		else
> -			error = NULL;
> -		if (worktree) {
> -			repo_path =
> -			    strdup(got_worktree_get_repo_path(worktree));
> -			if (repo_path == NULL)
> -				error = got_error_from_errno("strdup");
> -			if (error)
> -				goto done;
> -		} else {
> -			repo_path = strdup(cwd);
> -			if (repo_path == NULL) {
> -				error = got_error_from_errno("strdup");
> -				goto done;
> -			}
> -		}
> -	}
> -
> -	error = got_repo_open(&repo, repo_path, NULL);
> -	if (error != NULL)
> -		goto done;
> -
>  	error = apply_unveil(got_repo_get_path(repo), do_list,
>  	    worktree ? got_worktree_get_root_path(worktree) : NULL);
>  	if (error)
> @@ -6726,15 +6732,9 @@ cmd_tag(int argc, char *argv[])
>  	tag_name = argv[0];
>  
>  #ifndef PROFILE
> -	if (do_list) {
> -		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> -		    NULL) == -1)
> -			err(1, "pledge");
> -	} else {
> -		if (pledge("stdio rpath wpath cpath fattr flock proc exec "
> -		    "sendfd unveil", NULL) == -1)
> -			err(1, "pledge");
> -	}
> +	if (pledge("stdio rpath wpath cpath fattr flock proc exec "
> +	    "sendfd unveil", NULL) == -1)
> +		err(1, "pledge");
>  #endif
>  	cwd = getcwd(NULL, 0);
>  	if (cwd == NULL) {
> @@ -6768,6 +6768,12 @@ cmd_tag(int argc, char *argv[])
>  		error = got_repo_open(&repo, repo_path, NULL);
>  		if (error != NULL)
>  			goto done;
> +#ifndef PROFILE
> +		/* Remove "cpath" promise. */
> +		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +		    NULL) == -1)
> +			err(1, "pledge");
> +#endif
>  		error = apply_unveil(got_repo_get_path(repo), 1, NULL);
>  		if (error)
>  			goto done;
> @@ -11960,7 +11966,7 @@ cmd_info(int argc, char *argv[])
>  	argv += optind;
>  
>  #ifndef PROFILE
> -	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +	if (pledge("stdio rpath wpath cpath flock proc exec sendfd unveil",
>  	    NULL) == -1)
>  		err(1, "pledge");
>  #endif
> @@ -11977,6 +11983,12 @@ cmd_info(int argc, char *argv[])
>  		goto done;
>  	}
>  
> +#ifndef PROFILE
> +	/* Remove "cpath" promise. */
> +	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +	    NULL) == -1)
> +		err(1, "pledge");
> +#endif
>  	error = apply_unveil(NULL, 0, got_worktree_get_root_path(worktree));
>  	if (error)
>  		goto done;
> blob - bd7343dcca754f398aaba30df74ecd27b90d2ee7
> blob + 0f9c2137d554258de7f66a0bcdb8c4a945cea69a
> --- gotadmin/gotadmin.c
> +++ gotadmin/gotadmin.c
> @@ -295,7 +295,7 @@ cmd_info(int argc, char *argv[])
>  	argv += optind;
>  
>  #ifndef PROFILE
> -	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +	if (pledge("stdio rpath wpath cpath flock proc exec sendfd unveil",
>  	    NULL) == -1)
>  		err(1, "pledge");
>  #endif
> @@ -307,7 +307,12 @@ cmd_info(int argc, char *argv[])
>  	error = got_repo_open(&repo, repo_path, NULL);
>  	if (error)
>  		goto done;
> -
> +#ifndef PROFILE
> +	/* Remove "cpath" promise. */
> +	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +	    NULL) == -1)
> +		err(1, "pledge");
> +#endif
>  	error = apply_unveil(got_repo_get_path_git_dir(repo), 1);
>  	if (error)
>  		goto done;
> @@ -899,14 +904,19 @@ cmd_listpack(int argc, char *argv[])
>  		return got_error_from_errno2("realpath", argv[0]);
>  
>  #ifndef PROFILE
> -	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +	if (pledge("stdio rpath wpath cpath flock proc exec sendfd unveil",
>  	    NULL) == -1)
>  		err(1, "pledge");
>  #endif
>  	error = got_repo_open(&repo, packfile_path, NULL);
>  	if (error)
>  		goto done;
> -
> +#ifndef PROFILE
> +	/* Remove "cpath" promise. */
> +	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
> +	    NULL) == -1)
> +		err(1, "pledge");
> +#endif
>  	error = apply_unveil(got_repo_get_path_git_dir(repo), 1);
>  	if (error)
>  		goto done;
> blob - e8fb373e287ee80486d50ed07964d9d39924308d
> blob + 61f0c6f6be1ff738e75c196b532adeec0d286f6d
> --- lib/got_lib_pack.h
> +++ lib/got_lib_pack.h
> @@ -21,6 +21,8 @@ struct got_pack {
>  	uint8_t *map;
>  	size_t filesize;
>  	struct got_privsep_child *privsep_child;
> +	int basefd;
> +	int accumfd;
>  	int child_has_tempfiles;
>  	int child_has_delta_outfd;
>  	struct got_delta_cache *delta_cache;
> blob - 8218b2c4c146e832c63f84d4a9f6a43f8b3283f8
> blob + aed88977b89e2cbc403071fd2d8fd81478ad6570
> --- lib/object.c
> +++ lib/object.c
> @@ -172,7 +172,7 @@ static const struct got_error *
>  pack_child_send_tempfiles(struct imsgbuf *ibuf, struct got_pack *pack)
>  {
>  	const struct got_error *err;
> -	int basefd, accumfd;
> +	int basefd = -1, accumfd = -1;
>  
>  	/* 
>  	 * For performance reasons, the child will keep reusing the
> @@ -183,23 +183,29 @@ pack_child_send_tempfiles(struct imsgbuf *ibuf, struct
>  	if (pack->child_has_tempfiles)
>  		return NULL;
>  
> -	basefd = got_opentempfd();
> +	basefd = dup(pack->basefd);
>  	if (basefd == -1)
> -		return got_error_from_errno("got_opentempfd");
> +		return got_error_from_errno("dup");
>  
> +	accumfd = dup(pack->accumfd);
> +	if (accumfd == -1) {
> +		err = got_error_from_errno("dup");
> +		goto done;
> +	}
> +
>  	err = got_privsep_send_tmpfd(ibuf, basefd);
>  	if (err)
> -		return err;
> +		goto done;
>  
> -	accumfd = got_opentempfd();
> -	if (accumfd == -1)
> -		return got_error_from_errno("got_opentempfd");
> -
>  	err = got_privsep_send_tmpfd(ibuf, accumfd);
> -	if (err)
> -		return err;
> -
> -	pack->child_has_tempfiles = 1;
> +done:
> +	if (err) {
> +		if (basefd != -1)
> +			close(basefd);
> +		if (accumfd != -1)
> +			close(accumfd);
> +	} else
> +		pack->child_has_tempfiles = 1;
>  	return NULL;
>  }
>  
> blob - 4aa9467c86b6e0a5f571584951e24abda6d8f5f5
> blob + efe03f932898f14a489edd1540f83f72b472413f
> --- lib/pack.c
> +++ lib/pack.c
> @@ -751,6 +751,11 @@ got_pack_close(struct got_pack *pack)
>  		pack->delta_cache = NULL;
>  	}
>  
> +	/*
> +	 * Leave accumfd and basefd alone. They are managed by the
> +	 * repository layer and can be reused.
> +	 */
> +
>  	return err;
>  }
>  
> blob - d4175545865a20ba9a3cff43736298aaae5e8c37
> blob + 487889ce6084b085c5bd704c6a005d435c1992de
> --- lib/repository.c
> +++ lib/repository.c
> @@ -50,6 +50,7 @@
>  #include "got_path.h"
>  #include "got_cancel.h"
>  #include "got_object.h"
> +#include "got_opentemp.h"
>  
>  #include "got_lib_delta.h"
>  #include "got_lib_inflate.h"
> @@ -701,6 +702,19 @@ got_repo_open(struct got_repository **repop, const cha
>  	repo->pack_cache_size = GOT_PACK_CACHE_SIZE;
>  	if (repo->pack_cache_size > rl.rlim_cur / 8)
>  		repo->pack_cache_size = rl.rlim_cur / 8;
> +	for (i = 0; i < nitems(repo->packs); i++) {
> +		if (i < repo->pack_cache_size) {
> +			repo->packs[i].basefd = got_opentempfd();
> +			if (repo->packs[i].basefd == -1)
> +				return got_error_from_errno("got_opentempfd");
> +			repo->packs[i].accumfd = got_opentempfd();
> +			if (repo->packs[i].accumfd == -1)
> +				return got_error_from_errno("got_opentempfd");
> +		} else {
> +			repo->packs[i].basefd = -1;
> +			repo->packs[i].accumfd = -1;
> +		}
> +	}
>  
>  	repo_path = realpath(path, NULL);
>  	if (repo_path == NULL) {
> @@ -784,6 +798,16 @@ got_repo_close(struct got_repository *repo)
>  		if (repo->packs[i].path_packfile == NULL)
>  			break;
>  		got_pack_close(&repo->packs[i]);
> +		if (repo->packs[i].basefd != -1) {
> +			if (close(repo->packs[i].basefd) == -1 && err == NULL)
> +				err = got_error_from_errno("close");
> +			repo->packs[i].basefd = -1;
> +		}
> +		if (repo->packs[i].accumfd != -1) {
> +			if (close(repo->packs[i].accumfd) == -1 && err == NULL)
> +				err = got_error_from_errno("close");
> +			repo->packs[i].accumfd = -1;
> +		}
>  	}
>  
>  	free(repo->path);
> @@ -1360,6 +1384,10 @@ got_repo_cache_pack(struct got_pack **packp, struct go
>  		err = got_pack_close(&repo->packs[i - 1]);
>  		if (err)
>  			return err;
> +		if (ftruncate(repo->packs[i - 1].basefd, 0L) == -1)
> +			return got_error_from_errno("ftruncate");
> +		if (ftruncate(repo->packs[i - 1].accumfd, 0L) == -1)
> +			return got_error_from_errno("ftruncate");
>  		memmove(&repo->packs[1], &repo->packs[0],
>  		    sizeof(repo->packs) - sizeof(repo->packs[0]));
>  		i = 0;
> 

-- 

Tracey Emery