"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: struct for priv_fds
To:
gameoftrees@openbsd.org
Date:
Mon, 30 May 2022 16:35:10 +0200

Download raw body.

Thread
On Mon, May 30, 2022 at 07:55:39AM -0600, Tracey Emery wrote:
> Then, what about something like below as a starting point? An array can
> be loaded up and this would give us an association for each fd.

The overall direction is unclear to me. More below:

> diff d6a28ffe187127e3247254d7e242bb52d66eb26b /home/tracey/src/got
> blob - b6e44e8b40b8476a471c53fd10cffd4a7ff3b32d
> file + include/got_repository.h
> --- include/got_repository.h
> +++ include/got_repository.h
> @@ -177,3 +177,12 @@ const struct got_error *got_repo_get_loose_object_info
>  /* Obtain the number and size of packed objects in the repository. */
>  const struct got_error *got_repo_get_packfile_info(int *npackfiles,
>      int *nobjects, off_t *total_packsize, struct got_repository *);
> +
> +/* Privileged file descriptors */
> +enum got_priv_fds {
> +	DIFF_FD,
> +	BLAME_FD,
> +	PACK_BASE_FD,
> +	PACK_ACCUM_FD,
> +	PRIV_FDS__MAX,
> +};
> 

There was the diff below which I wrote to start moving use of
got_opentemp() out of files in the lib/ directory.

The idea here is to prepare two file descriptors for each running
pack file helper at got_repo_open() time. The number of tempfiles we
keep open is constant, because helpers which die pass their open
tempfiles on to new helpers that take over when we switch to reading
a different pack.

As a future extension to this, we could require callers of got_repo_open()
to pass an array of open tempfiles, instead of having got_repo_open()
call got_opentempfd() internally. The patch below is just the first
step in that direction, since the files are still being opened inside
the function itself.

This patch only addresses the base/accum tempfile situation. But the
same idea could be applied to blame/diff/etc. Over time, all frontends
would be tweaked to provide the necessary tempfiles upfront, depending
on the operation they want to run. And gotwebd would do the same.

The current APIs clearly don't fit with what gotwebd needs, because they
manage resources which must be managed by gotwebd instead. I think a good
way to fix this is to work on pushing each got_opentemp() call further out
to the API boundary. By doing so we learn how many open files the callers
will need to provide in each case.

I don't see why you need the above enum in got_repository.h.
In the end, gotwebd would work just like the other frontends, and call the
lib/ APIs with required arguments, some of which will be open file handles,
or arrays of open file handles.
Maybe gotwebd needs to keep track of a couple of such arrays and use an
enum for that purpose. But could it not keep this enum private instead of
declaring it in got_repository.h, which is pulled in everywhere?

-----------------------------------------------
commit bd9b38c590a5a0c4e6b47264b667fc5e3e6c4b93 (pack-opentemp)
from: Stefan Sperling <stsp@stsp.name>
date: Sat Mar 12 12:43:24 2022 UTC
 
 open temporary files needed for delta application in got_repo_open()
 
 This prepares for callers of got_repo_open() that cannot afford to
 open files in /tmp, such as gotwebd. In a follow-up change, we could
 ask such callers to pass in the required amount of open temporary files.
 
 One consequence is that got_repo_open() now requires the "cpath" pledge
 promise. Add the "cpath" promise to affected callers and remove it once
 the repository has been opened.
 
diff f95923a581dde77a840de5e9f56060e84118b413 9eff67c819a8cc0b0c4db3579999877cde09b335
blob - 17d637cabb59cb76c220d33c371c8c2a25da94cf
blob + fbda8cac06e23e0d902b849b50b650436a9dcc11
--- got/got.c
+++ got/got.c
@@ -5810,15 +5810,9 @@ cmd_ref(int argc, char *argv[])
 		got_path_strip_trailing_slashes(refname);
 
 #ifndef PROFILE
-	if (do_list) {
-		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
-		    NULL) == -1)
-			err(1, "pledge");
-	} else {
-		if (pledge("stdio rpath wpath cpath fattr flock proc exec "
-		    "sendfd unveil", NULL) == -1)
-			err(1, "pledge");
-	}
+	if (pledge("stdio rpath wpath cpath fattr flock proc exec "
+	    "sendfd unveil", NULL) == -1)
+		err(1, "pledge");
 #endif
 	cwd = getcwd(NULL, 0);
 	if (cwd == NULL) {
@@ -5852,6 +5846,15 @@ cmd_ref(int argc, char *argv[])
 	if (error != NULL)
 		goto done;
 
+#ifndef PROFILE
+	if (do_list) {
+		/* Remove "cpath" promise. */
+		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+		    NULL) == -1)
+			err(1, "pledge");
+	}
+#endif
+
 	error = apply_unveil(got_repo_get_path(repo), do_list,
 	    worktree ? got_worktree_get_root_path(worktree) : NULL);
 	if (error)
@@ -6178,48 +6181,51 @@ cmd_branch(int argc, char *argv[])
 		usage_branch();
 
 #ifndef PROFILE
+	if (pledge("stdio rpath wpath cpath fattr flock proc exec "
+	    "sendfd unveil", NULL) == -1)
+		err(1, "pledge");
+#endif
+	cwd = getcwd(NULL, 0);
+	if (cwd == NULL) {
+		error = got_error_from_errno("getcwd");
+		goto done;
+	}
+
+	if (repo_path == NULL) {
+		error = got_worktree_open(&worktree, cwd);
+		if (error && error->code != GOT_ERR_NOT_WORKTREE)
+			goto done;
+		else
+			error = NULL;
+		if (worktree) {
+			repo_path =
+			    strdup(got_worktree_get_repo_path(worktree));
+			if (repo_path == NULL)
+				error = got_error_from_errno("strdup");
+			if (error)
+				goto done;
+		} else {
+			repo_path = strdup(cwd);
+			if (repo_path == NULL) {
+				error = got_error_from_errno("strdup");
+				goto done;
+			}
+		}
+	}
+
+	error = got_repo_open(&repo, repo_path, NULL);
+	if (error != NULL)
+		goto done;
+
+#ifndef PROFILE
 	if (do_list || do_show) {
+		/* Remove "cpath" promise. */
 		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
 		    NULL) == -1)
 			err(1, "pledge");
-	} else {
-		if (pledge("stdio rpath wpath cpath fattr flock proc exec "
-		    "sendfd unveil", NULL) == -1)
-			err(1, "pledge");
 	}
 #endif
-	cwd = getcwd(NULL, 0);
-	if (cwd == NULL) {
-		error = got_error_from_errno("getcwd");
-		goto done;
-	}
 
-	if (repo_path == NULL) {
-		error = got_worktree_open(&worktree, cwd);
-		if (error && error->code != GOT_ERR_NOT_WORKTREE)
-			goto done;
-		else
-			error = NULL;
-		if (worktree) {
-			repo_path =
-			    strdup(got_worktree_get_repo_path(worktree));
-			if (repo_path == NULL)
-				error = got_error_from_errno("strdup");
-			if (error)
-				goto done;
-		} else {
-			repo_path = strdup(cwd);
-			if (repo_path == NULL) {
-				error = got_error_from_errno("strdup");
-				goto done;
-			}
-		}
-	}
-
-	error = got_repo_open(&repo, repo_path, NULL);
-	if (error != NULL)
-		goto done;
-
 	error = apply_unveil(got_repo_get_path(repo), do_list,
 	    worktree ? got_worktree_get_root_path(worktree) : NULL);
 	if (error)
@@ -6726,15 +6732,9 @@ cmd_tag(int argc, char *argv[])
 	tag_name = argv[0];
 
 #ifndef PROFILE
-	if (do_list) {
-		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
-		    NULL) == -1)
-			err(1, "pledge");
-	} else {
-		if (pledge("stdio rpath wpath cpath fattr flock proc exec "
-		    "sendfd unveil", NULL) == -1)
-			err(1, "pledge");
-	}
+	if (pledge("stdio rpath wpath cpath fattr flock proc exec "
+	    "sendfd unveil", NULL) == -1)
+		err(1, "pledge");
 #endif
 	cwd = getcwd(NULL, 0);
 	if (cwd == NULL) {
@@ -6768,6 +6768,12 @@ cmd_tag(int argc, char *argv[])
 		error = got_repo_open(&repo, repo_path, NULL);
 		if (error != NULL)
 			goto done;
+#ifndef PROFILE
+		/* Remove "cpath" promise. */
+		if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+		    NULL) == -1)
+			err(1, "pledge");
+#endif
 		error = apply_unveil(got_repo_get_path(repo), 1, NULL);
 		if (error)
 			goto done;
@@ -11960,7 +11966,7 @@ cmd_info(int argc, char *argv[])
 	argv += optind;
 
 #ifndef PROFILE
-	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+	if (pledge("stdio rpath wpath cpath flock proc exec sendfd unveil",
 	    NULL) == -1)
 		err(1, "pledge");
 #endif
@@ -11977,6 +11983,12 @@ cmd_info(int argc, char *argv[])
 		goto done;
 	}
 
+#ifndef PROFILE
+	/* Remove "cpath" promise. */
+	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+	    NULL) == -1)
+		err(1, "pledge");
+#endif
 	error = apply_unveil(NULL, 0, got_worktree_get_root_path(worktree));
 	if (error)
 		goto done;
blob - bd7343dcca754f398aaba30df74ecd27b90d2ee7
blob + 0f9c2137d554258de7f66a0bcdb8c4a945cea69a
--- gotadmin/gotadmin.c
+++ gotadmin/gotadmin.c
@@ -295,7 +295,7 @@ cmd_info(int argc, char *argv[])
 	argv += optind;
 
 #ifndef PROFILE
-	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+	if (pledge("stdio rpath wpath cpath flock proc exec sendfd unveil",
 	    NULL) == -1)
 		err(1, "pledge");
 #endif
@@ -307,7 +307,12 @@ cmd_info(int argc, char *argv[])
 	error = got_repo_open(&repo, repo_path, NULL);
 	if (error)
 		goto done;
-
+#ifndef PROFILE
+	/* Remove "cpath" promise. */
+	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+	    NULL) == -1)
+		err(1, "pledge");
+#endif
 	error = apply_unveil(got_repo_get_path_git_dir(repo), 1);
 	if (error)
 		goto done;
@@ -899,14 +904,19 @@ cmd_listpack(int argc, char *argv[])
 		return got_error_from_errno2("realpath", argv[0]);
 
 #ifndef PROFILE
-	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+	if (pledge("stdio rpath wpath cpath flock proc exec sendfd unveil",
 	    NULL) == -1)
 		err(1, "pledge");
 #endif
 	error = got_repo_open(&repo, packfile_path, NULL);
 	if (error)
 		goto done;
-
+#ifndef PROFILE
+	/* Remove "cpath" promise. */
+	if (pledge("stdio rpath wpath flock proc exec sendfd unveil",
+	    NULL) == -1)
+		err(1, "pledge");
+#endif
 	error = apply_unveil(got_repo_get_path_git_dir(repo), 1);
 	if (error)
 		goto done;
blob - e8fb373e287ee80486d50ed07964d9d39924308d
blob + 61f0c6f6be1ff738e75c196b532adeec0d286f6d
--- lib/got_lib_pack.h
+++ lib/got_lib_pack.h
@@ -21,6 +21,8 @@ struct got_pack {
 	uint8_t *map;
 	size_t filesize;
 	struct got_privsep_child *privsep_child;
+	int basefd;
+	int accumfd;
 	int child_has_tempfiles;
 	int child_has_delta_outfd;
 	struct got_delta_cache *delta_cache;
blob - 8218b2c4c146e832c63f84d4a9f6a43f8b3283f8
blob + aed88977b89e2cbc403071fd2d8fd81478ad6570
--- lib/object.c
+++ lib/object.c
@@ -172,7 +172,7 @@ static const struct got_error *
 pack_child_send_tempfiles(struct imsgbuf *ibuf, struct got_pack *pack)
 {
 	const struct got_error *err;
-	int basefd, accumfd;
+	int basefd = -1, accumfd = -1;
 
 	/* 
 	 * For performance reasons, the child will keep reusing the
@@ -183,23 +183,29 @@ pack_child_send_tempfiles(struct imsgbuf *ibuf, struct
 	if (pack->child_has_tempfiles)
 		return NULL;
 
-	basefd = got_opentempfd();
+	basefd = dup(pack->basefd);
 	if (basefd == -1)
-		return got_error_from_errno("got_opentempfd");
+		return got_error_from_errno("dup");
 
+	accumfd = dup(pack->accumfd);
+	if (accumfd == -1) {
+		err = got_error_from_errno("dup");
+		goto done;
+	}
+
 	err = got_privsep_send_tmpfd(ibuf, basefd);
 	if (err)
-		return err;
+		goto done;
 
-	accumfd = got_opentempfd();
-	if (accumfd == -1)
-		return got_error_from_errno("got_opentempfd");
-
 	err = got_privsep_send_tmpfd(ibuf, accumfd);
-	if (err)
-		return err;
-
-	pack->child_has_tempfiles = 1;
+done:
+	if (err) {
+		if (basefd != -1)
+			close(basefd);
+		if (accumfd != -1)
+			close(accumfd);
+	} else
+		pack->child_has_tempfiles = 1;
 	return NULL;
 }
 
blob - 4aa9467c86b6e0a5f571584951e24abda6d8f5f5
blob + efe03f932898f14a489edd1540f83f72b472413f
--- lib/pack.c
+++ lib/pack.c
@@ -751,6 +751,11 @@ got_pack_close(struct got_pack *pack)
 		pack->delta_cache = NULL;
 	}
 
+	/*
+	 * Leave accumfd and basefd alone. They are managed by the
+	 * repository layer and can be reused.
+	 */
+
 	return err;
 }
 
blob - d4175545865a20ba9a3cff43736298aaae5e8c37
blob + 487889ce6084b085c5bd704c6a005d435c1992de
--- lib/repository.c
+++ lib/repository.c
@@ -50,6 +50,7 @@
 #include "got_path.h"
 #include "got_cancel.h"
 #include "got_object.h"
+#include "got_opentemp.h"
 
 #include "got_lib_delta.h"
 #include "got_lib_inflate.h"
@@ -701,6 +702,19 @@ got_repo_open(struct got_repository **repop, const cha
 	repo->pack_cache_size = GOT_PACK_CACHE_SIZE;
 	if (repo->pack_cache_size > rl.rlim_cur / 8)
 		repo->pack_cache_size = rl.rlim_cur / 8;
+	for (i = 0; i < nitems(repo->packs); i++) {
+		if (i < repo->pack_cache_size) {
+			repo->packs[i].basefd = got_opentempfd();
+			if (repo->packs[i].basefd == -1)
+				return got_error_from_errno("got_opentempfd");
+			repo->packs[i].accumfd = got_opentempfd();
+			if (repo->packs[i].accumfd == -1)
+				return got_error_from_errno("got_opentempfd");
+		} else {
+			repo->packs[i].basefd = -1;
+			repo->packs[i].accumfd = -1;
+		}
+	}
 
 	repo_path = realpath(path, NULL);
 	if (repo_path == NULL) {
@@ -784,6 +798,16 @@ got_repo_close(struct got_repository *repo)
 		if (repo->packs[i].path_packfile == NULL)
 			break;
 		got_pack_close(&repo->packs[i]);
+		if (repo->packs[i].basefd != -1) {
+			if (close(repo->packs[i].basefd) == -1 && err == NULL)
+				err = got_error_from_errno("close");
+			repo->packs[i].basefd = -1;
+		}
+		if (repo->packs[i].accumfd != -1) {
+			if (close(repo->packs[i].accumfd) == -1 && err == NULL)
+				err = got_error_from_errno("close");
+			repo->packs[i].accumfd = -1;
+		}
 	}
 
 	free(repo->path);
@@ -1360,6 +1384,10 @@ got_repo_cache_pack(struct got_pack **packp, struct go
 		err = got_pack_close(&repo->packs[i - 1]);
 		if (err)
 			return err;
+		if (ftruncate(repo->packs[i - 1].basefd, 0L) == -1)
+			return got_error_from_errno("ftruncate");
+		if (ftruncate(repo->packs[i - 1].accumfd, 0L) == -1)
+			return got_error_from_errno("ftruncate");
 		memmove(&repo->packs[1], &repo->packs[0],
 		    sizeof(repo->packs) - sizeof(repo->packs[0]));
 		i = 0;