"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: gotwebd website support
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Tue, 2 Dec 2025 09:27:28 +0100

On Mon, Dec 01, 2025 at 11:17:39PM +0100, Omar Polo wrote:
> I'm just leaving one tiny nitpick if you can fix it before merging this
> in main, as I think it was an unwanted change :p
> 
> While playing with it I've found a minor logic error, which is luckily
> innocuous, with `website "/"':
> 
> 	$ curl -sI http://localhost/ | grep Location
> 	Location: http://localhost//index.html

Yes, this is trivial to fix. I have added a call to trim trailing slashes
from the requested URL before adding "/index.html" for the redirect.
 
> in any case, let's get this in and improve in-tree, i think it's the
> best way forward :)
> 
> ok op@

Thanks!

> > [...]
> > --- gotwebd/gotwebd.conf.5
> > +++ gotwebd/gotwebd.conf.5
> > @@ -1,4 +1,4 @@
> > -.\"
> > +\"
> 
> nit: lost the "." :p

Oops, good catch. This was probably me mistyping something into vim.

> > @@ -463,6 +507,56 @@ parameter determines whether
> >  .Xr gotwebd 8
> >  will display the repository.
> >  .El
> > +.It Ic website Ar url-path Brq ...
> > +Show a web site when the browser visits the given
> > +.Ar url-path .
> > +The web site's content is composed of files in a Git repository.
> > +.Pp
> > +While the underlying repository is subject to authentication as usual,
> > +web site content is always public, and cannot be hidden via the
> > +.Ic hide repository
> > +or
> > +.Ic respect_exportok
> > +directives.
> 
> I think we should 'fix' this and at least enable auth for the website.
> I think it doesn't make much sense to protect the history but not the
> resulting tree at the tip :p

I am not sure what value authentication would add to this feature.

The reason we need authentication is either to protect a private repository
or to block web crawlers from hitting expensive pages like diff and blame
over and over, wasting resources.


Regarding private repositories:

The website feature is intended to be used for non-interactive web sites,
such as software project landing pages which introduce a project and link
to other resources related to it. Such pages are usually public and do not
process input beyond parsing GET requests. Putting authentication in front
of them reduces discoverability to a point where it does not really make
much sense to have a landing page in the first place.

The website files might even be stored in a private repo. If so, the admin is
making a conscious decision to expose part of that private repo as a web site.
I don't see a problem with that. The exposure is limited to a specific branch
and/or subdirectory, and must be explicitly enabled via 'website' statements
in the config file.
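For the sake of argument, such a setup could look roughly like this (a made-up gotwebd.conf fragment; the directives inside the braces are placeholders, the real ones are documented in gotwebd.conf.5):

```
server "example.com" {
	# Expose one branch/subdirectory of a (possibly private)
	# repository as a public web site under "/".
	website "/" {
		# repository, branch, and path selection goes here
	}
}
```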


Regarding crawlers:

Serving trees/blobs without further processing is not very expensive overall.
The most expensive part of this is decompression, both in terms of zlib and
deltas. This will usually be fast enough that we won't need to worry about
crawlers. Should this become a problem we can add a persistent in-memory
cache for website content to gotwebd and just copy any relatively small files
out directly from RAM, reducing overhead to the minimum possible with FastCGI.