"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: change got_worktree_init, open_worktree to use fds
To:
Yang Zhong <yzhong@freebsdfoundation.org>
Cc:
Ed Maste <emaste@freebsd.org>, gameoftrees@openbsd.org
Date:
Fri, 4 Dec 2020 22:24:41 +0100

Download raw body.

Thread
On Fri, Dec 04, 2020 at 09:00:00AM -0800, Yang Zhong wrote:
> I'd like clarification on a few points: If I have a path that I know is
> relative to one of our three fds (the worktree, the repo, /tmp), as
> well as the fd itself, is that considered a relative or absolute path?

Absolute paths in this context are paths that begin with a slash.
I'll elaborate further below.

> Another question, sort of a continuation of the first:
> From what I know about the code, I think that files under these
> main directories will always stay underneath them. ie. if I have
> a path to a file in the worktree, the path to the worktree itself
> within that path won't ever change. Is this true?

When operating on a work tree, Got should never touch filesystem paths
outside this work tree. The same applies when operating on a repository.
Whether this is remains true at run-time depends on Got code being free
of bugs. Generally, tricking a version control system into using paths
outside of the directories it is managing would be considered a security
issue. Mechanisms like unveil and capsicum will act as a safety net in
case of such bugs. But the application code should still be correct, i.e.
never operate on a path outside the managed path space (or /tmp).

> I'm having some trouble articulating what I mean. My point is
> that it appears reasonable to replace truly absolute paths
> ("/one/two/three") with a pair of fd and relative path (an fd from
> open("/one/two"), "three").

Yes, these are indeed two different representations of the same thing.
In theory, we could replace every function that looks like this:

	func(const char *abspath);

with a function that looks like this:

	func(int dirfd, const char *relpath);

However, the problem we're trying to solve is to find a small set of
changes required to make capsicum work. Changing every function that accepts
a path into a function that accepts both an fd and a relpath is one solution,
but it is not an ideal solution because it is not a small change.
Functions such as got_path_is_child(), got_canonpath(), got_path_cmp() etc.
operate on absolute paths, not fds. As a hypothetical example, it would be
prohibitively awkward to force fd arguments onto such functions in order
to add capsicum support to the application.

As far as I understand, what capsicum will generally need are 3 file
descriptors for the 3 main directories which Got will be using:

	the work tree, the repository, and /tmp

For each of these, a corresponding absolute path can be obtained:

  worktree_fd : abspath is returned by got_worktree_get_root_path()
  repository_fd : abspath is returned by got_repo_get_path()
  tmpdir_fd: abspath is stored in the constant GOT_TMPDIR

What I hope is that you can keep most of the existing code dealing with
absolute paths, and only convert to relative paths on demand if required.

(I also hope that capsicum will be OK with multiple path components in the
path argument of openat(), like this: openat(open("/one"), "two/three");
If not, an fd for every intermediate directory would be required, which
would be very awkward to handle.)

Now, going back to your example: openat(open("/one/two"), "three")

We want to make the above call to openat() and we have the absolute path
"/one/two/tree". And as additional info we have a directory fd and the
absolute path to this directory.

This function call will store the relative path "three" in 'child':

  got_path_skip_common_ancestor(&child, "/one/two", "/one/two/three");

A more realistic example:

  err = got_path_skip_common_ancestor(&child,
      got_worktree_get_root_path(worktree), abspath_in_worktree);
  if (err)
	goto done;
  if (openat(worktree_fd, child, ...) == -1) {
	err = got_error_from_errno2("openat", abspath_in_worktree);
	free(child);
	goto done;
  }
  free(child);

We can derive relative paths from absolute ones in this way.
Any function which has access to struct got_worktree and an absolute path
that points inside this worktree should be able to use openat() to open the
corresponding file. Provided an fd and worktree root abspath are both stored
in struct got_worktree. Similarly for got_repository and the tempdir.

Does this make sense?

Regards,
Stefan