"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: gotwebd: urldecode querystring
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Sat, 3 Sep 2022 12:01:06 +0200

Download raw body.

Thread
On Sat, Sep 03, 2022 at 11:27:57AM +0200, Omar Polo wrote:
> As a safety guard I'm rejecting the %00 sequence which, while it's
> perfectly valid way of encoding '\0', shouldn't be present on any
> gotwebd query parameter.

Could we be even stricter than this?
For example, is there any reason for non-ASCII characters to be present
in the decoded query string, unless the data is a path?

> +/* percent-decode `val' in place. */
>  static const struct got_error *
> +gotweb_urldecode(char *val)
> +{
> +	for (; *val; ++val) {
> +		if (*val != '%')
> +			continue;
> +
> +		if (!isxdigit((unsigned char)val[1]) ||
> +		    !isxdigit((unsigned char)val[2]) ||
> +		    (val[1] == '0' && val[2] == '0'))
> +			return got_error(GOT_ERR_BAD_QUERYSTRING);
> +
> +		if (sscanf(val + 1, "%2hhx", val) != 1)
> +			return got_error(GOT_ERR_BAD_QUERYSTRING);
> +		memmove(val + 1, val + 3, strlen(val + 3) + 1); /* NUL */

Could we manually parse and translate the hex value instead of using scanf()?
See url_decode() in usr.sbin/httpd/httpd.c for example.

> +}