From: Stefan Sperling Subject: Re: Loss of original errors To: Christian Weisgerber Cc: gameoftrees@openbsd.org Date: Wed, 12 Jul 2023 09:47:14 +0200 On Tue, Jul 11, 2023 at 08:22:20PM +0200, Christian Weisgerber wrote: > There are a few places in got(1) where, if an error occurs, a > clean-up action is executed, whose potential error is ignored, and > the original error is returned. > > Or at least that's the intent. In practice, the original error can > be lost because the custom_errors[] ring buffer is too small. > > Specifically, there are three instances in cmd_histedit(): > > error = histedit_...(...); > if (error) { > got_worktree_histedit_abort(...); > print_merge_progress_stats(...); > goto done; > } > > Here got_worktree_histedit_abort() creates a number of ignored > errors that overrun the ring buffer. I first tried increasing the > buffer size, but a mkdir(2) is run on every directory in the worktree, > filling the buffer with ignored EEXIST errors, so that doesn't > scale. > > There is similar-ish code in cmd_integrate() with calls to > got_worktree_integrate_abort(), but that routine probably doesn't > do enough work to cause problems. > > The obvious solution would be to save *error into a variable before > calling the clean-up action, but I haven't found a precedent in the > code, and it can't be a local variable... > > Suggestions? I don't have good suggestions to solve this fundamental issue in the error handling design. The current design avoids allocation on error, and it avoids pointers between error objects to form chains, as they exist in the SVN error system which inspired the error system used by Got. I would really like to get away without extra allocations and pointers. The downside are scaling issues like this. We can try being more careful about the way we are using the system. That will hopefully be good enough. In this particular case we can stop sending EEXIST up the call chain. It is not really an error condition since the directory the caller wants to create already exists. I only found one caller that relies on seeing EEXIST passed up. That bit of code could easily be convered to expect success instead. Does this help? Tests are still passing. diff /home/stsp/src/got commit - ad6dd0bb6c5ebeafaa57204a04330df6658c3861 path + /home/stsp/src/got blob - a43c93bf01b5cb12693933595ad320101db7fc5b file + lib/path.c --- lib/path.c +++ lib/path.c @@ -329,7 +329,7 @@ got_path_mkdir(const char *abspath) goto done; if (mkdir(abspath, GOT_DEFAULT_DIR_MODE) == -1) err = got_error_from_errno2("mkdir", abspath); - } else + } else if (errno != EEXIST) err = got_error_from_errno2("mkdir", abspath); } blob - 20e319355137e3ec594f1a28760412c19ab58905 file + lib/worktree.c --- lib/worktree.c +++ lib/worktree.c @@ -396,21 +396,22 @@ add_dir_on_disk(struct got_worktree *worktree, const c { const struct got_error *err = NULL; char *abspath; + struct stat sb; if (asprintf(&abspath, "%s/%s", worktree->root_path, path) == -1) return got_error_from_errno("asprintf"); err = got_path_mkdir(abspath); - if (err && err->code == GOT_ERR_ERRNO && errno == EEXIST) { - struct stat sb; - err = NULL; - if (lstat(abspath, &sb) == -1) { - err = got_error_from_errno2("lstat", abspath); - } else if (!S_ISDIR(sb.st_mode)) { - /* TODO directory is obstructed; do something */ - err = got_error_path(abspath, GOT_ERR_FILE_OBSTRUCTED); - } + if (err) + goto done; + + if (lstat(abspath, &sb) == -1) { + err = got_error_from_errno2("lstat", abspath); + } else if (!S_ISDIR(sb.st_mode)) { + /* TODO directory is obstructed; do something */ + err = got_error_path(abspath, GOT_ERR_FILE_OBSTRUCTED); } +done: free(abspath); return err; }