"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Omar Polo <op@omarpolo.com>
Subject:
Re: gotd protected references
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Tue, 04 Apr 2023 19:56:29 +0200

Download raw body.

Thread
On 2023/04/04 13:31:07 +0200, Stefan Sperling <stsp@stsp.name> wrote:
> On Fri, Mar 31, 2023 at 10:34:26PM +0200, Stefan Sperling wrote:
> > On Fri, Mar 31, 2023 at 10:06:23PM +0200, Stefan Sperling wrote:
> > > This patch adds a "protect" directive to gotd.conf which can be used
> > > to forbid 'got send -f' on selected branches/tags.
> > > 
> > > There will always be accidents with 'got send -f' at some point in a
> > > project's life cycle so gotd should provide safeguards against this.
> > 
> > Oh, there is stupid mistake in here. Disregard this for now.
> > 
> > The patch works fine for references which are not supposed to ever change,
> > like tags. But for branches we need a linear ancestry check and only forbid
> > the ref-update when the history being pushed is not linear. This will require
> > more work to get right.
> 
> Fixed diff below.
> 
> The config syntax has changed since branches and tags need to follow
> slightly different rules when protected.
> 
> We now also ensure that new refs created in a protected namspace are of
> the appropriate object type. E.g. it will not longer be possible to add
> non-tag objects under "refs/tags" if it is a protected tag namespace.

some comments inline.  Haven't tested yet, but it reads fine.  we can
address these points in-tree, it's ok op@ as-is.

> --- gotd/parse.y
> +++ gotd/parse.y
> [...]
> @@ -229,6 +239,44 @@ repository	: REPOSITORY STRING {
>  		}
>  		;
>  
> +protect		: PROTECT '{' optnl protectflags_l '}'
> +		| PROTECT protectflags
> +
> +protectflags_l	: protectflags optnl protectflags_l
> +		| protectflags optnl
> +		;
> +
> +protectflags	: TAG NAMESPACE STRING {
> +			if (gotd_proc_id == PROC_GOTD ||
> +			    gotd_proc_id == PROC_REPO_WRITE) {
> +				if (conf_protect_tag_namespace(new_repo, $3)) {
> +					free($3);
> +					YYERROR;
> +				}
> +			}

I think we should free($3) here to avoid leaking memory.

> +		}
> +		| BRANCH NAMESPACE STRING {
> +			if (gotd_proc_id == PROC_GOTD ||
> +			    gotd_proc_id == PROC_REPO_WRITE) {
> +				if (conf_protect_branch_namespace(new_repo,
> +				    $3)) {
> +					free($3);
> +					YYERROR;
> +				}
> +				free($3);

like this but moved outside the if (in all the three rule) to not leak
memory in the other processes.

> +			}
> +		}
> +		| BRANCH STRING {
> +			if (gotd_proc_id == PROC_GOTD ||
> +			    gotd_proc_id == PROC_REPO_WRITE) {
> +				if (conf_protect_branch(new_repo, $2)) {
> +					free($2);
> +					YYERROR;
> +				}
> +			}

and here too.

> +		}
> +		;
> +
> [...]
> @@ -839,6 +897,94 @@ int
>  	STAILQ_INSERT_TAIL(&repo->rules, rule, entry);
>  }
>  
> +static int
> +refname_is_valid(char *refname)
> +{
> +	if (strlen(refname) < 5 || strncmp(refname, "refs/", 5) != 0) {

I think we can avoid the strlen() here: if refname is shorter than
that, strncmp will return non zero.

I know this comes from protect_ref_namespace^W validate_namespace, but
still

> +		yyerror("reference name must begin with \"refs/\": %s",
> +		    refname);
> +		return 0;
> +	}
> +
> +	if (!got_ref_name_is_valid(refname)) {
> +		yyerror("invalid reference name: %s", refname);
> +		return 0;
> +	}
> +
> +	return 1;
> +}
> [...]
> +static int
> +conf_protect_branch(struct gotd_repo *repo, char *branchname)
> +{
> +	const struct got_error *error;
> +	char *refname;
> +
> +	if (strncmp(branchname, "refs/heads/", 11) != 0) {
> +		if (asprintf(&refname, "refs/heads/%s", branchname) == -1) {
> +			yyerror("asprintf: %s", strerror(errno));
> +			return -1;
> +		}
> +	} else {
> +		refname = strdup(branchname);
> +		if (refname == NULL) {
> +			yyerror("strdup: %s", strerror(errno));
> +			return -1;
> +		}
> +	}
> +
> +	if (!refname_is_valid(refname)) {
> +		free(refname);
> +		return -1;
> +	}
> +
> +	error = got_pathlist_insert(NULL, &repo->protected_branches,
> +	    refname, NULL);
> +	if (error) {
> +		yyerror("got_pathlist_insert: %s", error->msg);

nit: we could free(refname) here.

> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> [...]
> --- gotd/repo_imsg.c
> +++ gotd/repo_imsg.c
> [...]
> +static const struct got_error *
> +protect_ref_namespace(const char *refname, const char *namespace)
> +{
> +	const struct got_error *err;
> +
> +	err = validate_namespace(namespace);
> +	if (err)
> +		return err;

If I'm reading this right, in this function and in the others variant
below, `namespace' is the namespace provided in the configuration
file.  if so, I guess we could drop these calls since the one done at
configuration parsing time should be enough.  Or do you want to keep
these extra checks around for safety for a while?

> +	if (strncmp(namespace, refname, strlen(namespace)) == 0)
>  		return got_error_fmt(GOT_ERR_REFS_PROTECTED, "%s", namespace);
>  
>  	return NULL;
>  }
> [...]
> +static const struct got_error *
> +protect_require_yca(struct got_object_id *tip_id,
> +    size_t max_commits_to_traverse, struct got_pack *pack,
> +    struct got_packidx *packidx, struct got_reference *ref)
> +{
> [...]
> +		parent_ids = got_object_commit_get_parent_ids(commit);
> +		STAILQ_FOREACH(pid, parent_ids, entry) {
> +			err = check_cancelled(NULL);
> +			if (err)
> +				goto done;

just for curiosity, this loop doesn't seem heavy, so why
check_cancelled() here?  (asking to figure out if I got
check_cancelled() usage correct in my head.)