
From:
"Omar Polo" <op@omarpolo.com>
Subject:
Re: improve gotwebd accept behaviour under load
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Mon, 22 Sep 2025 21:22:15 +0200

Stefan Sperling <stsp@stsp.name> wrote:
> The gotwebd sockets process no longer needs 6 file descriptors
> per open connection. I'm not even sure where this number came from.

the usual comment around this is something about how "libevent will
haunt us here too".  I guess it (used to?) allocate file descriptors, or
maybe the application this check originates from needed to make sure
there was enough room for fds.

a cursory read of libevent/kqueue.c suggests there's no fd allocation
besides the one in kqueue_init(), so my first idea is wrong.

> httpd's limit is higher than this, which means it will try to
> queue more requests than gotwebd will accept and start returning
> status 500 when it can no longer connect to gotwebd.
> 
> At some point I tweaked the accept timeout to retry exactly once
> per second, by adding a check for an already pending timeout.
> To match httpd's behaviour, reschedule the timeout for an entire
> second each time we hit EMFILE. Not sure if this is very important
> but it is nice to be consistent.
> 
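
the difference between the two timeout policies described above can be
shown without libevent.  this is a rough sketch under my own
assumptions: struct accept_pause, both pause_*() helpers and the
millisecond clock are hypothetical and only model the timer state, and
treating ENFILE like EMFILE is an addition that the quoted text does not
mention.

```c
#include <errno.h>
#include <stdbool.h>

#define ACCEPT_BACKOFF_MS 1000	/* one second, as discussed above */

/* Hypothetical model of the "resume accepting" timer. */
struct accept_pause {
	bool	pending;	/* a resume timer is armed */
	long	resume_ms;	/* time at which accepting resumes */
};

/* Out of descriptors?  ENFILE is an assumption added for this sketch. */
static bool
fd_exhausted(int err)
{
	return err == EMFILE || err == ENFILE;
}

/* Keep an already pending timer; only arm it if none is pending. */
static void
pause_keep_pending(struct accept_pause *p, long now_ms)
{
	if (p->pending)
		return;
	p->pending = true;
	p->resume_ms = now_ms + ACCEPT_BACKOFF_MS;
}

/* Re-arm the timer for a full second on every failure. */
static void
pause_rearm(struct accept_pause *p, long now_ms)
{
	p->pending = true;
	p->resume_ms = now_ms + ACCEPT_BACKOFF_MS;
}
```

if accept fails at t=0 and again at t=400ms, the first policy resumes at
t=1000ms while the second pushes the resume time out to t=1400ms.
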
> Testing this diff while got.g.o is being hammered with requests I see
> gotwebd handle requests slowly but gracefully under load.
> Without the diff it quickly falls into an endless EMFILE accept loop
> which results in httpd returning status 500 immediately.
> 
> I don't think this will be a 100% solution since our request timeout
> is quite long (120 seconds), giving httpd plenty of time to queue up
> too many requests. Unfortunately some requests can really take a long
> time (blaming huge files with deep history) so I do not want to lower
> this timeout. In the future, perhaps we should make this timeout
> configurable and/or adjust it based on the nature of the request.
> 
> Also drop the separate accept timeout handler. It is redundant since it
> does the same as the EV_TIMEOUT case at the top of the regular handler.
> 
> ok?

haven't tested (yet!) in a production environment, but if it builds,
ship it! :P

ok op@

> M  gotwebd/gotwebd.h  |  0+   1-
> M  gotwebd/sockets.c  |  3+  14-
> 
> 2 files changed, 3 insertions(+), 15 deletions(-)
> 
> commit - 12c1bbcab3809ec364d34a8280dfb318a2968da6
> commit + 8d39e68c89e9b071c7add3674a580b99e065e76d