From: Stefan Sperling Subject: Re: change got_worktree_init, open_worktree to use fds To: Yang Zhong Cc: gameoftrees@openbsd.org Date: Wed, 2 Dec 2020 00:11:10 +0100 On Tue, Dec 01, 2020 at 02:39:06PM -0800, Yang Zhong wrote: > Here are updated changes. I've attached them this time. > diff_1 contains the revised changes from before. Thank you! The first patch is looking good now. But please bear with me; I'm still learning to wrap my head around the capsicum side of things and have another question: > > Since you're adding 'int root_fd' to struct got_worktree I don't see a > > problem with adding a corresponding argument to got_worktree_open() in > > order to initialize the root_fd struct member. > > diff_2 contains the changes needed for this. It's a lot of the same stuff, so > separating it out makes it easier to read. Reading this patch, and re-reading your proof-of-concept patch, I am now wondering if you're maybe entering the capsicum sandbox earlier than strictly necessary? You're trying to enable the capsicum sandbox in the same spots as Got enables pledge/unveil on OpenBSD. As a result, we cannot open the file descriptor for 'cwd' in got_worktree_open() internally, which looks a bit odd if we leave the capsicum parts out of the picture in upstream code. I wonder if perhaps the sweet spot for the capsicum sandbox could be a bit later than it is for pledge/unveil, given the differences in their respective operational models? What got_worktree_open() and got_repo_open() do is search the filesystem upwards and look for meta-data files that were created locally, never downloaded. A filesystem crawl involving data which is fully under control of the application doesn't seem like a big risk vector worth protecting. I wouldn't even worry if pledge/unveil weren't enabled at that point in time; they're just being enabled as early as is feasible. You'll definitely want capsicum enabled as soon as the code starts handling repository data which was downloaded from a network. But this is generally only done after work tree and repository have been identified and the corresponding data structures have been set up. Couldn't you apply capsicum to open file descriptors provided via newly initialized struct got_worktree and struct got_repository, and enter the sandbox before the main operation logic of the got command starts executing? If so, then we may be able to keep a lot of these early function calls as-is since e.g. got_worktree_open could search upwards for a work tree without being restricted by the sandbox, and present a file descriptor which capsicum could lock down before we proceed into real action. Would this work? We could then worry about the more interesting and exposed parts of the application next, where the additional protection is definitely desirable. Cheers, Stefan