"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: gotwebd: percent-encode querystrings
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Tue, 06 Sep 2022 23:52:16 +0200

Download raw body.

Thread
On 2022/09/06 18:50:03 +0200, Stefan Sperling <stsp@stsp.name> wrote:
> On Tue, Sep 06, 2022 at 05:12:14PM +0200, Omar Polo wrote:
> > here's a rebased and slightly improved diff.  it makes gotwebd gains a
> > few lines of code but I think it's better than revisit every function
> > where we print a link and allocate yet another local string that we
> > might forget to free and making the output functions even more complex
> > to follow.  it also centralize how we generate URLs, hopefully making
> > the life easier in the future if we want to change things.
> > 
> > the changes to the previous version are:
> > 
> >  - escape the `headref' parametr too
> >  - use a consistent ordering of the fields
> >  - add a comment before the gotweb_url struct
> 
> This seems much saner than what we had before. ok by me, though
> I have no time to test right now.
> 
> Should we also refuse attempts to encode control-characters into URLs
> (byte values between 1 and 31, excluding 9 which is '\t')?
> See https://daniel.haxx.se/blog/2022/09/05/a-bug-that-was-23-years-old-or-not/
> for reasons why we might want to block them.

Thanks for linking the post!  It was interesting to read, i wasn't
aware of that issue.

It doesn't directly touches us as the request is parsed by httpd(8)
(or whatever frontend people are using) that sends us the data parsed
via fastcgi params.  I'm expecting the web server to catch and deal
with this kinds of issues.

I don't think it's worth adding stuff to gotwebd to add this case --
we don't even use cookies :D -- but however what we could do is to:

 - reject the request if the provided hostname has funny characters in
   it

   don't really see a point in it, we just read the provided
   SERVER_NAME fastcgi parameter and match it against the list of
   servers.  but it's easy to do if wanted.

 - reject the request if the querystring has invalid characters in it

   this has nothing to do with the linked post, but I think that
   gotweb_urldecode should error out on invalid characters in the
   querystring.  I haven't thought about it when first writing the
   decoder but now I'm quite confident it should fail.  Will look into
   it.

> A similar issue might affect the decoder (I haven't looked); if that
> is the case then we could handle control chars in follow-up commits.

This issue as only to do with the decoder, the only headers we're
using in the reply are Content-Type, Content-Disposition and
Content-Security-Policy and none of those can ever end up having
non-hardcoded values.