"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: first draft of gotadmin load
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 9 Jul 2023 14:41:17 +0200

Download raw body.

Thread
On Sun, Jul 09, 2023 at 01:55:57PM +0200, Omar Polo wrote:
> here's a first draft of the load command.  it is intended to be the
> counterpart of `gotadmin dump' and, just like dump, it's intended to
> handle fast-export stream in the future but at the moment only handles
> git bundles.
> 
> there's some code lifted from got.c and fetch.c since a git bundle is
> just a packfile with a plaintext header, so `gotadmin load' is very
> similar to fetch.

Please put this in with my ok. I will be easier to collaborate on future
changes in-tree. I have read the diff and did not see any blocking issues.

> There are still a few questions open in this initial design:
> 
>  - is fine for the default behaviour to load all the refs?  should I
>    add -a instead to load everything and make -a or -b mandatory?

Not sure. It will depend on the use case I suppose.

>  - I'd like to verify the pack file more.  I can for instance image a
>    case where the header of the pack advertises an object that's not
>    available in the pack.  We should catch this case.

You could check the existence of the loaded commits via the pack index,
before installing the pack file, and error out in case of such problems.

>  - It should only fast-forward the references by default?  at the
>    moment it happily overwrites the (selected) references with what
>    the dump advertizes.

In cases where the bundle is used as an offline replacement for
got fetch users will want to be able to update existing references.

In case of restoring a backup from a bundle the objects might be from
an older state so existing refs should better be left untouched.

Having the ability to select specific refs via -b is good but this
is probably not flexible enough for some use cases.
I wonder if we should add an option to load references into a new
reference namespace. This would allow loading the data without
touching any existing refs.
 
>  - I'm doing the parsing of the bundle header in the main process.
>    This is usually a no-go, but the header is very simple and adding a
>    libexec only for it felt overkill.  Will write one however if we
>    want to stick to 'no parsing in the main process'.

I think that is fine. The header is very simple.

> I'm happy to address these and other concers before committing, but
> since it's quite long already working on it in-tree is also an option.

I would prefer in-tree as noted above.

If you don't mind I will try to reword the docs a bit once the diff has
landed on the main branch.
Would you mind if we referred to the data being loaded as a "bundle"
instead of a "dump"?  This makes it easier for people to know what
format is being used. Once we add support for Git's fast import/export
streams we could revisit the terminology again. But we should be very
clear about the format because people reading the docs will need to
know what we are using, regarding interop with Git and other tools.