From: Stefan Sperling <stsp@stsp.name>
Subject: Re: New library config parse.y start
To: Tracey Emery <tracey@traceyemery.net>
Cc: gameoftrees@openbsd.org
Date: Sun, 21 Jun 2020 01:21:31 +0200

On Sat, Jun 20, 2020 at 08:45:54AM -0600, Tracey Emery wrote:
> So, yes, it makes sense to not go the direction of these two diffs. Now,
> I know it's a segue, but, thoughts on the gotwebd idea?

I believe what we want to achieve is that gotweb can operate without
creating any files. The idea is that all creation of content which
gets displayed as part of the HTML page would be delegated to a
long-running process instead of the cgi itself.

This daemon can be called gotd or gotwebd, it doesn't matter.
Though I'd imagine that a finished gotd would have many parts, some of
which would be optional. One such optional part could do what gotweb needs,
and could potentially be kept generic enough to be re-usable by programs
other than gotweb. Or it could be a gotweb-specific component of 'gotd'.

What gotweb needs is a way to request the results of high-level operations
offered by got, such as cat, diff, and blame. The daemon would offer a unix
domain socket as its interface. In gotweb's case this socket would be
visible in /var/www somewhere.
Essentially, this "service" would wrap functionality offered by the got
library APIs that gotweb is using right now. I would imagine that we'd
create another collection of imsg data structures and functions for this
purpose. So gotweb could call a new set of functions that look similar to
the ones it is calling right now, but which obtain results from the daemon
instead of the library.
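
To make the marshalling part a bit more concrete, here is a rough sketch
of what one such request could look like on the wire. Everything in it is
hypothetical: the svc_ names, the operation codes, and the layout are
made up for illustration and are not part of got. It also skips imsg
entirely and packs bytes by hand, purely to show the kind of data that
would have to cross the socket.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical operation codes; none of these exist in got. */
enum svc_op { SVC_OP_CAT = 1, SVC_OP_DIFF = 2, SVC_OP_BLAME = 3 };

#define SVC_ID_LEN		20	/* length of a SHA1 digest in bytes */
#define SVC_REQ_WIRE_LEN	(4 + SVC_ID_LEN)

/* Hypothetical request: an operation and the object ID it applies to. */
struct svc_request {
	uint32_t op;
	uint8_t id[SVC_ID_LEN];
};

/*
 * Encode a request as a big-endian operation code followed by the raw ID.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
size_t
svc_request_pack(const struct svc_request *req, uint8_t *buf, size_t len)
{
	if (len < SVC_REQ_WIRE_LEN)
		return 0;
	buf[0] = (req->op >> 24) & 0xff;
	buf[1] = (req->op >> 16) & 0xff;
	buf[2] = (req->op >> 8) & 0xff;
	buf[3] = req->op & 0xff;
	memcpy(buf + 4, req->id, SVC_ID_LEN);
	return SVC_REQ_WIRE_LEN;
}

/*
 * Decode a request. Returns 0 on success, or -1 if the buffer is too
 * short or the operation code is unknown.
 */
int
svc_request_unpack(struct svc_request *req, const uint8_t *buf, size_t len)
{
	uint32_t op;

	if (len < SVC_REQ_WIRE_LEN)
		return -1;
	op = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
	    ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
	if (op < SVC_OP_CAT || op > SVC_OP_BLAME)
		return -1;
	req->op = op;
	memcpy(req->id, buf + 4, SVC_ID_LEN);
	return 0;
}
```

A real version would presumably put the operation into the imsg header
type field and carry only the ID as payload, which would remove most of
the manual packing shown here.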

It's not going to be very simple. That is a lot of stuff to wrap and marshal
through imsg. The daemon will have to keep track of operations that are in
progress and manage related resources such as temporary files.

Once we have this working in a basic fashion we could enforce rate limiting
in the daemon as well, so that gotweb would have to wait on the pipe for
results if there is already too much load (as determined by the daemon).
This could address the problem where attackers simply spawn a lot of
cgi programs to exhaust resources.
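
One simple way to picture this, as a sketch only: the daemon counts
operations in flight and refuses to start new ones above a limit. A
request that is not admitted just stays queued and unanswered, so the
client blocks in its read until capacity frees up. The svc_limiter names
and the idea of a fixed cap are assumptions made for this example; a real
daemon might well judge load differently.

```c
#include <stddef.h>

/* Hypothetical limiter: caps the number of operations in flight. */
struct svc_limiter {
	size_t inflight;	/* operations currently running */
	size_t max_inflight;	/* cap chosen by the daemon */
};

void
svc_limiter_init(struct svc_limiter *l, size_t max_inflight)
{
	l->inflight = 0;
	l->max_inflight = max_inflight;
}

/*
 * Returns 1 if a new operation may start now, in which case it is
 * counted as in flight. Returns 0 if the request has to stay queued.
 */
int
svc_limiter_admit(struct svc_limiter *l)
{
	if (l->inflight >= l->max_inflight)
		return 0;
	l->inflight++;
	return 1;
}

/* Called when an operation finishes; frees one slot. */
void
svc_limiter_release(struct svc_limiter *l)
{
	if (l->inflight > 0)
		l->inflight--;
}
```

The event loop would call svc_limiter_admit() before dispatching a
request and svc_limiter_release() once the reply has been sent.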

We could probably dock server-side fetch/push functionality onto the same
interface eventually. Recall the idea about server-side rebasing: We'll
eventually need a daemon which will rebase commits on behalf of clients
which are running 'got push'. Such 'commit/rebase' style operations would
also need to be communicated over a pipe, which is a similar problem.
And gotweb's UID wouldn't be allowed to create new commits via this pipe
(see getpeereid(3)). But some other program run by another dedicated user
account which is reachable via SSH would be allowed to do just that.
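
The access check itself could be as small as the sketch below. The
daemon would obtain the peer's effective UID by calling getpeereid(3) on
the accepted socket and then consult a policy like this one. The
svc_policy structure, its field names, and the choice to let the push
user read as well as write are all assumptions for illustration.

```c
#include <sys/types.h>

/* Hypothetical access classes for operations offered by the daemon. */
enum svc_access { SVC_READ, SVC_WRITE };

/* Hypothetical policy: one UID for gotweb, one for the SSH-side user. */
struct svc_policy {
	uid_t web_uid;		/* may only read (cat, diff, blame) */
	uid_t push_uid;		/* may also create commits */
};

/*
 * Decide whether a peer may perform the given class of operation.
 * peer_uid is expected to come from getpeereid(3) on the connection.
 * Returns 1 if allowed, 0 otherwise.
 */
int
svc_allowed(const struct svc_policy *p, uid_t peer_uid, enum svc_access a)
{
	if (peer_uid == p->push_uid)
		return 1;
	if (peer_uid == p->web_uid)
		return a == SVC_READ;
	return 0;	/* unknown users get nothing */
}
```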

That's my general idea of how we could make this work, though there are
many details missing.

I suppose a daemon that is able to reproduce basic 'got cat' behaviour
(object ID in, object data out) over a unix domain socket would be a great
starting point for a gotd or gotwebd.