Download raw body.
gotwebd: percent-encode querystrings
On 2022/09/06 18:50:03 +0200, Stefan Sperling <stsp@stsp.name> wrote: > On Tue, Sep 06, 2022 at 05:12:14PM +0200, Omar Polo wrote: > > here's a rebased and slightly improved diff. it makes gotwebd gains a > > few lines of code but I think it's better than revisit every function > > where we print a link and allocate yet another local string that we > > might forget to free and making the output functions even more complex > > to follow. it also centralize how we generate URLs, hopefully making > > the life easier in the future if we want to change things. > > > > the changes to the previous version are: > > > > - escape the `headref' parametr too > > - use a consistent ordering of the fields > > - add a comment before the gotweb_url struct > > This seems much saner than what we had before. ok by me, though > I have no time to test right now. > > Should we also refuse attempts to encode control-characters into URLs > (byte values between 1 and 31, excluding 9 which is '\t')? > See https://daniel.haxx.se/blog/2022/09/05/a-bug-that-was-23-years-old-or-not/ > for reasons why we might want to block them. Thanks for linking the post! It was interesting to read, i wasn't aware of that issue. It doesn't directly touches us as the request is parsed by httpd(8) (or whatever frontend people are using) that sends us the data parsed via fastcgi params. I'm expecting the web server to catch and deal with this kinds of issues. I don't think it's worth adding stuff to gotwebd to add this case -- we don't even use cookies :D -- but however what we could do is to: - reject the request if the provided hostname has funny characters in it don't really see a point in it, we just read the provided SERVER_NAME fastcgi parameter and match it against the list of servers. but it's easy to do if wanted. - reject the request if the querystring has invalid characters in it this has nothing to do with the linked post, but I think that gotweb_urldecode should error out on invalid characters in the querystring. I haven't thought about it when first writing the decoder but now I'm quite confident it should fail. Will look into it. > A similar issue might affect the decoder (I haven't looked); if that > is the case then we could handle control chars in follow-up commits. This issue as only to do with the decoder, the only headers we're using in the reply are Content-Type, Content-Disposition and Content-Security-Policy and none of those can ever end up having non-hardcoded values.
gotwebd: percent-encode querystrings