
From: Stefan Sperling <stsp@stsp.name>
Subject: gotd(8) imported, and where this is going
To: gameoftrees@openbsd.org
Date: Sun, 23 Oct 2022 13:38:44 +0200


I imported gotd(8) last night, after working on it whenever I found
some spare time during the past couple of weeks. If you would like
to help with further development of gotd, please keep reading.

The commit message explains context:
https://git.gameoftrees.org/gitweb/?p=got.git;a=commitdiff;h=13b2bc37

As implied in the commit message, this is nowhere near done. It is just
an initial attempt at having a server code base we can start playing
around with. While this code lives on the main branch now, and I will
keep updating the devel/got port with releases from this branch, I will
*not* ship a gotd binary package until we are confident that the design
has settled and the code is ready for wider community testing.

I recommend that -portable releases follow the same approach, i.e.
ship the code but don't compile it by default. And provide a way to
build it out of the got-portable.git repository for interested parties.

Eventually, we should be confident that running gotd won't break anyone's
repositories (as a rule of thumb, if you break or lose someone's repository
just once, they will stop using your version control system).
And we must be confident that running gotd won't expose an interesting
attack surface, especially when serving anonymous fetches on the internet.
In the long term, the more people can confidently run gotd servers for
private or public small-scale use, the better the software will become.
I believe this software fits into a niche which the native Git ecosystem,
with all its focus on the web and big hosting, does not serve very well.

We can take as much time as we need to get it done. There is no need to
rush, no need to make promises, no need to ship it before we want to.
As far as I'm concerned, we can keep iterating on this for years. What
matters is the end result, not time-to-market (we are late already ;)

Regarding the future:

It is important to start on a regression test suite now, before we add
many changes on top of the basic implementation. This will be my next
step, and I would be happy to get help with this. If we can find a few
people who want to write test cases in parallel, it can be done quickly.

I received some questions and input from deraadt@, which is very helpful.
I have never written a privsep daemon before, and this shows. There are
some things the current design gets quite wrong. I am paraphrasing Theo's
suggestions below. It might not be 100% what Theo had in mind, but this
conversation is still fresh, and any misunderstandings on my part can
be cleared up in further discussion.

- gotsh(1) should not be required as a login shell.
This has already been fixed on the main branch. It is now possible to
invoke gotsh as git-receive-pack or git-upload-pack. Users who want to
keep their regular login shell can drop gotsh under those names somewhere
in $PATH where ssh can find them.
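
As a rough illustration of invoking one binary under several names, a
program can look at the last path component of argv[0] and pick a mode
from it. This is only a sketch and not the actual gotsh code; the enum
and the function name mode_from_argv0 are made up for this example.

```c
#include <string.h>

/* Hypothetical mode names, not taken from the gotsh source. */
enum serve_mode { MODE_UNKNOWN, MODE_UPLOAD_PACK, MODE_RECEIVE_PACK };

/*
 * Pick a mode from the name the program was invoked as. Only the last
 * path component is compared, so a copy or link placed anywhere in
 * $PATH is treated the same way.
 */
enum serve_mode
mode_from_argv0(const char *argv0)
{
	const char *name = strrchr(argv0, '/');

	name = (name != NULL) ? name + 1 : argv0;
	if (strcmp(name, "git-upload-pack") == 0)
		return MODE_UPLOAD_PACK;
	if (strcmp(name, "git-receive-pack") == 0)
		return MODE_RECEIVE_PACK;
	return MODE_UNKNOWN;
}
```

Any other name, including gotsh itself, maps to MODE_UNKNOWN here; what
the real program does in that case is outside the scope of this sketch.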

- combining the "sendfd" and "recvfd" pledge(2) promises is bad
This happened because I went back-and-forth on fd passing during development.
It should be possible to fix this by passing relevant fds differently.
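
For readers unfamiliar with the two promises: "sendfd" covers sending
file descriptors over a unix socket and "recvfd" covers receiving them.
The sketch below shows both directions with plain sendmsg(2)/recvmsg(2)
and SCM_RIGHTS, so that each side needs only one of the promises. It is
an assumption-laden illustration; gotd may pass descriptors through a
different mechanism, and the names send_fd and recv_fd are made up.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <string.h>
#include <unistd.h>

/*
 * Send one descriptor over a unix socket. On OpenBSD a process that
 * only does this could run under pledge("stdio sendfd", NULL).
 */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;	/* forces correct alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte = 0;

	memset(&msg, 0, sizeof(msg));
	memset(&cmsgbuf, 0, sizeof(cmsgbuf));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Receive one descriptor. On OpenBSD a process that only does this
 * could run under pledge("stdio recvfd", NULL). Returns the new
 * descriptor, or -1 on error.
 */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct iovec iov;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} cmsgbuf;
	char byte;
	int fd;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cmsgbuf.buf;
	msg.msg_controllen = sizeof(cmsgbuf.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return fd;
}
```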

- authentication should not rely on gotd.sock path permissions alone
There is currently no code which verifies the legitimacy of anyone who is
talking to the gotd unix socket.
I've left this out for now because it is not required to develop basic Git
functionality, but this issue is of great concern for multi-user systems.
We will need a separate process which listens on the socket and rate-limits
new connections. And a separate process which will allow/deny connections.
My plan is to require allowed users/groups to be explicitly listed in the
configuration file, on a per-repository basis to keep things simple (config
file macros can prevent excessive repetition).
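
To make the allow/deny idea concrete, here is a minimal default-deny
check against a per-repository list of user and group IDs. Everything
in it is hypothetical: struct repo_access and access_allowed are
invented for this sketch and say nothing about how gotd will store or
parse its configuration. It also compares a single group ID only and
ignores supplementary groups.

```c
#include <sys/types.h>
#include <stddef.h>

/* Hypothetical per-repository access rule; not gotd's data structure. */
struct repo_access {
	const uid_t	*uids;	/* explicitly allowed users */
	size_t		 nuids;
	const gid_t	*gids;	/* explicitly allowed groups */
	size_t		 ngids;
};

/*
 * Return 1 if the peer's uid or gid is listed for this repository,
 * 0 otherwise. Anything not listed is denied.
 */
int
access_allowed(const struct repo_access *ra, uid_t uid, gid_t gid)
{
	size_t i;

	for (i = 0; i < ra->nuids; i++) {
		if (ra->uids[i] == uid)
			return 1;
	}
	for (i = 0; i < ra->ngids; i++) {
		if (ra->gids[i] == gid)
			return 1;
	}
	return 0;
}
```

On OpenBSD, the uid and gid of the peer connected to a unix socket can
be obtained with getpeereid(3), which is one possible input for such a
check.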

- fork+exec needs to occur per session, not just once at startup
This is where my lack of privsep/ASLR experience shows. If every session uses
the same address space, any data leak becomes an oracle about address
space layout for subsequent attempts. We will need to fork+exec per session,
which ties in nicely with the authentication requirement above.
After authentication, we could spawn a fresh instance of gotd and a
corresponding reader or writer process to serve the session.
This also solves my problem of wanting to run multiple readers/writers to
serve clients in parallel, which is impossible in the initial implementation.
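
A minimal sketch of the fork+exec step, under the assumption that some
parent process starts one fresh program image per session. The helper
names spawn_session and wait_session are made up, and the real design
will also need to hand over sockets and drop privileges, which is left
out here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Start a fresh process image for one session. A forked child shares
 * the parent's address space layout; calling exec replaces it with a
 * newly laid out image. Returns the child's pid, or -1 on error.
 */
pid_t
spawn_session(const char *path, char *const argv[])
{
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		execv(path, argv);
		_exit(127);	/* only reached if exec failed */
	}
	return pid;
}

/* Wait for a session to end; returns its exit code, or -1. */
int
wait_session(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because each call returns immediately with a pid, the parent is free to
start several sessions before waiting on any of them.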

- chroot is unnecessary on OpenBSD and requires root
At startup, gotd needs root privileges to open its unix socket, and currently
also to move forked reader/writer processes into a per-repository chroot.
With on-demand fork+exec, putting new processes into a chroot will require
root not just at start-up, but also at run-time. On OpenBSD we can use unveil
as a root-less alternative. The current chroot mechanism will be moved to
-portable, where the process which forks children for a new session will
have to keep running as root, unless the target OS provides unveil or can
somehow emulate what unveil achieves on OpenBSD.
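
For illustration, restricting a session process to one repository with
unveil(2) could look roughly like this. The function name
restrict_to_repo and the choice of permission strings are assumptions
made for this sketch. On systems without unveil the function does
nothing, which is exactly the gap -portable would have to fill by other
means such as chroot.

```c
#include <unistd.h>

/*
 * Limit filesystem visibility to a single repository directory.
 * Readers get read-only access; writers may also write and create.
 * Returns 0 on success and -1 on error.
 */
int
restrict_to_repo(const char *repo_path, int writable)
{
#ifdef __OpenBSD__
	if (unveil(repo_path, writable ? "rwc" : "r") == -1)
		return -1;
	/* Lock the list: no further unveil calls are permitted. */
	if (unveil(NULL, NULL) == -1)
		return -1;
#else
	/* No unveil here; this sketch applies no restriction at all. */
	(void)repo_path;
	(void)writable;
#endif
	return 0;
}
```

Unlike chroot(2), these calls do not require root, so they can be made
by an unprivileged process after it has been forked for a session.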