"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: got remove recursion
To:
Tracey Emery <tracey@traceyemery.net>
Cc:
gameoftrees@openbsd.org
Date:
Thu, 12 Dec 2019 19:02:08 +0100

Download raw body.

Thread
On Wed, Dec 11, 2019 at 03:36:15PM -0700, Tracey Emery wrote:
> Hello,
> 
> The diff below adds recursion to got remove. New test added and all
> tests pass. TODO and man page updated. Small fix for incorrect usage.
> 
> I'm not really keen on the ignore_nochange arg name. If someone has a
> better idea, let me know.

Maybe 'report_unchanged'?

> Comments? Ok?

There's a bit of an oddity in your patch which gets exposed by the regress
test you've added.

Your approach of adding this functionality seems to rely on non-obvious
side-effects. Which leaves me worried that this approach may be too brittle
in the long term.

At the top-level you're passing the work tree's root dir as relative
path "" to worktree_status (this seems fine):

(gdb) list
2995                    char *ondisk_path;
2996                    if (asprintf(&ondisk_path, "%s/%s", worktree->root_path,
2997                        pe->path) == -1)
2998                            return got_error_from_errno("asprintf");
2999                    sda.ondisk_path = ondisk_path;
3000                    err = worktree_status(worktree, pe->path, fileindex, repo,
3001                            schedule_for_deletion, &sda, NULL, NULL, 0, 1);
3002                    free(ondisk_path);
3003                    if (err)
3004                            break;
(gdb) p pe->path
$44 = 0xb97cb4b0ba0 ""
(gdb) p ondisk_path
$45 = 0xb982ec7f280 "/tmp/got-test-rm_directory-VOqpla1g/wt/"
(gdb)

Now ondisk_path is the absolute version of pe->path, and this absolute
path gets passed down via sda.ondisk_path.

Now, later on, schedule_for_deletion() is called for each file found
while traversing this directory. Then two get_file_status() calls happen;
one with the work tree's root path (ondisk_path) and another
get_file_status() call for the file which is about to be deleted:

#0  schedule_for_deletion (arg=0x7f7ffffcd818, status=126 '~',
    staged_status=32 ' ', relpath=0xb97cb4b0370 "alpha",
    blob_id=0x7f7ffffcd3f0, staged_blob_id=0x0, commit_id=0x7f7ffffcd3d8)
    at /home/stsp/src/got/got/../lib/worktree.c:2935
2935                        got_worktree_get_root_path(a->worktree),
(gdb) list 2910,2945
2910        struct got_object_id *commit_id)
2911    {
2912            struct schedule_deletion_args *a = arg;
2913            const struct got_error *err = NULL;
2914            struct got_fileindex_entry *ie = NULL;
2915            struct stat sb;
2916            char *path = NULL;
2917
2918            ie = got_fileindex_entry_get(a->fileindex, relpath, strlen(relpath));
2919            if (ie == NULL)
2920                    return got_error(GOT_ERR_BAD_PATH);
2921
2922            staged_status = get_staged_status(ie);
2923            if (staged_status != GOT_STATUS_NO_CHANGE) {
2924                    if (staged_status == GOT_STATUS_DELETE)
2925                            return NULL;
2926                    return got_error_path(relpath, GOT_ERR_FILE_STAGED);
2927            }
2928
2929            err = get_file_status(&status, &sb, ie, a->ondisk_path, a->repo);
2930            if (err)
2931                    return err;
2932
2933            if (S_ISDIR(sb.st_mode)) {
2934                    if (asprintf(&path, "%s/%s",
2935                        got_worktree_get_root_path(a->worktree),
2936                        relpath) == -1) {
2937                            err = got_error_from_errno("asprintf");
2938                            return err;
2939                    }
2940
2941                    err = get_file_status(&status, &sb, ie, path, a->repo);
2942                    if (err)
2943                            return err;
2944            } else
2945                    path = strdup(a->ondisk_path);
(gdb) p relpath
$40 = 0xb97cb4b0370 "alpha"
(gdb) p a->ondisk_path
$41 = 0xb982ec7f280 "/tmp/got-test-rm_directory-VOqpla1g/wt/"
(gdb) p a->worktree->root_path
$42 = 0xb984434e7c0 "/tmp/got-test-rm_directory-VOqpla1g/wt"
(gdb) p status
$43 = 126 '~'

Now here come the inconsistencies:

Note that status is '~' which means "obstructed"; generally this is used
to represent a situation where a regular file was expected at a given path
but some other thing was found, such as a directory, /dev device node, or
named socket. This value of status is not used. At some point, someone will
be looking at this and wonder what is going on.

It seems wrong to obtain a file index entry for a relpath "alpha", and
then call get_file_status() with this same file index entry and an
on-disk path that does not point at the corresponding file on disk.
In the above case, the on-disk path points at the work tree's root
directory instead of the ie's file's path. This is why status becomes '~'.

Technically, get_file_status() should not even be called with a path
that does not correspond to a file index entry (which by definition
contains files only, not directories).

We should try to somehow improve the clarity of this code but I don't
have a good suggestion right now.