"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: got patch vs unchanged files
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Sat, 19 Mar 2022 11:50:17 +0100

Download raw body.

Thread
On Sat, Mar 19, 2022 at 11:11:13AM +0100, Omar Polo wrote:
> Stefan Sperling <stsp@stsp.name> wrote:
> > On Fri, Mar 18, 2022 at 06:47:06PM +0100, Omar Polo wrote:
> > > While testing a diff for an unrelated thing I noticed that `got patch'
> > > fails to handle the case of a patch that tries to add a file already
> > > versioned.
> > > 
> > > Long story short, I'm using got_worktree_status to check the status of a
> > > file before applying the patch.  got_worktree_status calls
> > > worktree_status with report_unchanged=0, so the `can_add' callback in
> > > lib/patch.c doesn't get called and check_file_status assumes that it's
> > > possible to add the file.
> > > 
> > > Patch below adds another argument to got_worktree_status so the caller
> > > can decide to have also unchanged files reported to the callback.  If we
> > > don't want to change got_worktree_status I have another idea to solve
> > > this issue locally in patch.c, but I thought this was the easiest way.
> > 
> > Alternatively, you could move worktree-specific functionality from patch.c
> > into worktree.c. Then you would get direct access to the static function
> > worktree_status() from your new worktree.c function which patch.c would call.
> > 
> > FWIW, this is also how other commands like rebase and histedit check
> > their work tree preconditions.
> 
> sure, that's better indeed.
> 
> i also took it as a chance to refactor a couple of things, I should have
> added some stuff into worktree.c from the start.

Great, thanks! Looks good to me.

One thing that is not very obvious, and can certainly be fixed later:

> +const struct got_error *
> +got_worktree_patch_prepare(const char *old, const char *new,
> +    char **oldpath, char **newpath, struct got_worktree *worktree,
> +    struct got_repository *repo)
> +{
> +	const struct got_error *err = NULL;
> +	struct got_fileindex *fileindex = NULL;
> +	char *fileindex_path = NULL;
> +	int file_renamed = 0;
> +	unsigned char status_old, staged_status_old;
> +	unsigned char status_new, staged_status_new;
> +
> +	*oldpath = NULL;
> +	*newpath = NULL;
> +
> +	err = open_fileindex(&fileindex, &fileindex_path, worktree);

Opening the file index is relatively expensive on large work trees.
This is why rebase, histedit, etc. return the fileindex to the caller
as an opaque pointer for re-use. It can speed things up a lot.

In your case, this would require splitting this function in two,
one for initialization, and one which performs the per-path checks.
You would also need a function that cleans up (ie. closes the fileindex),
called at the very end of the pack operation.
Restructuring the code in this way should improve performance with large
patches.

And if 'got patch' would ever want to track state which can be expressed
as references in refs/got/, such a design would provide the places to
create and clean up such references.