From: Stefan Sperling Subject: Re: gotwebd: urldecode querystring To: Omar Polo Cc: gameoftrees@openbsd.org Date: Sat, 3 Sep 2022 12:01:06 +0200 On Sat, Sep 03, 2022 at 11:27:57AM +0200, Omar Polo wrote: > As a safety guard I'm rejecting the %00 sequence which, while it's > perfectly valid way of encoding '\0', shouldn't be present on any > gotwebd query parameter. Could we be even stricter than this? For example, is there any reason for non-ASCII characters to be present in the decoded query string, unless the data is a path? > +/* percent-decode `val' in place. */ > static const struct got_error * > +gotweb_urldecode(char *val) > +{ > + for (; *val; ++val) { > + if (*val != '%') > + continue; > + > + if (!isxdigit((unsigned char)val[1]) || > + !isxdigit((unsigned char)val[2]) || > + (val[1] == '0' && val[2] == '0')) > + return got_error(GOT_ERR_BAD_QUERYSTRING); > + > + if (sscanf(val + 1, "%2hhx", val) != 1) > + return got_error(GOT_ERR_BAD_QUERYSTRING); > + memmove(val + 1, val + 3, strlen(val + 3) + 1); /* NUL */ Could we manually parse and translate the hex value instead of using scanf()? See url_decode() in usr.sbin/httpd/httpd.c for example. > +}