"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: introducing gotadmin dump
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Thu, 6 Jul 2023 16:59:13 +0200

Download raw body.

Thread
On Thu, Jul 06, 2023 at 04:29:43PM +0200, Omar Polo wrote:
> --- gotadmin/gotadmin.1
> +++ gotadmin/gotadmin.1
> @@ -337,7 +337,55 @@ work tree, use the repository path associated with thi
>  .Xr got 1
>  work tree, use the repository path associated with this work tree.
>  .El
> +.It Xo
> +.Cm dump
> +.Op Fl q
> +.Op Fl r Ar repository-path
> +.Op Fl x Ar reference
> +.Op Ar reference ...
> +.Xc
> +Dump the contents of the repository to standard output.
> +.Pp
> +If one or more
> +.Ar reference
> +argumenst is specified, only add objects which are reachable via the specified

Typo: argumenst -> arguments

> +	/* Ignore references in the refs/got/ namespace. */
> +	TAILQ_FOREACH_SAFE(re, &include_refs, entry, new) {
> +		refname = got_ref_get_name(re->ref);
> +		if (strncmp("refs/got/", refname, 9) != 0)
> +			continue;

Should we document this exception in the manual page?

> --- /dev/null
> +++ include/got_dump.h
> @@ -0,0 +1,22 @@
[...]
> +const struct got_error *
> +got_dump(FILE *out, struct got_reflist_head *include_refs,
> +    struct got_reflist_head *exclude_refs, struct got_repository *repo,
> +    got_pack_progress_cb progress_cb, void *progress_arg,
> +    got_cancel_cb cancel_cb, void *cancel_arg);

Could we move this file and the function into the got_repo_ namespace?

We would then have: got_repo_dump.h and got_repo_dump()
I think this would be slightly better because it shows what is being dumped.

> blob - /dev/null
> blob + 48e13ed85bfdc396804b57e89345093843499954 (mode 644)
> --- /dev/null
> +++ lib/dump.c

Regardless, dump.c is a fine name as-is.

> +#include "got_repository_admin.h" /* XXX for pack_progress */

As you pointed out, eventually we'll want a dump_progress callback
which would then hide the pack_progress callback from the caller of
the dump API.

> +struct idvec {
> +	struct got_object_id	**ids;
> +	size_t			  len;
> +	size_t			  cap;
> +};

'cap' sounds like "capability -- but it seems this variable is being
used to keep track of the allocated size of the array?
Could we name it 'nalloc' or 'size' like we do elsewhere?
For example, see struct got_pack_metavac in got_lib_pack_create.h
and struct got_delta_table in got_lib_deltify.h, which do pretty
much the same thing.

> +static const struct got_error *
> +idvec_push(struct idvec *v, struct got_object_id *id)
> +{
> +	size_t	 newcap;
> +	void	*t;
> +
> +	if (v->len == v->cap) {
> +		newcap = v->cap + 8;
> +		t = reallocarray(v->ids, newcap, sizeof(*v->ids));
> +		if (t == NULL)
> +			return got_error_from_errno("reallocarray");
> +		v->ids = t;
> +		v->cap = newcap;
> +	}
> +
> +	v->ids[v->len++] = id;
> +	return NULL;
> +}
> +
> +static void
> +idvec_free(struct idvec *v)
> +{
> +	size_t i;
> +
> +	for (i = 0; i < v->len; ++i)
> +		free(v->ids[i]);
> +	free(v->ids);
> +}

> +	(cd "$testroot/repo" && \
> +	    gotadmin dump -q -x master newbranch >$testroot/r.bundle)
> +	if [ $? -ne 0 ]; then
> +		echo "gotadmin dump failed unexpectedly" >&2
> +		test_done "$testroot" 1
> +		return 1
> +	fi
> +
> +	(cd "$testroot/r" && git checkout -q -b newbranch && \
> +	    git pull -q origin newbranch)

I understand this is how Git uses bundles but tihs look rather obscure.
The above 'git pull' actually reads from the update file $tesroot/r.bundle,
correct?  Could we make this explicit on the git command line somehowm
by passsing r.bundle to git fetch, making make this test easier to follow?
Or is that impossible in Git's UI? If so, please add a comment.

In any case, this is just test code so don't worry about this too much.

> +	if [ $? -ne 0 ]; then
> +		echo "git pull failed unexpectedly" >&2
> +		test_done "$testroot" 1
> +		return 1
> +	fi
> +
> +	# add a fake reference so that `got log' appears the same in
> +	# the cloned repository
> +	(cd "$testroot/repo" && got branch -c newbranch -n origin/newbranch)
> +
> +	(cd "$testroot/repo" && got log >$testroot/repo.log)

Would it make sense to run 'got log -p' to also capture and compare diffs,
just for the sake of more test coverage?