From: Stefan Sperling Subject: Re: gotwebd crash To: Omar Polo Cc: gameoftrees@openbsd.org Date: Sun, 26 Nov 2023 13:32:46 +0100 On Sat, Nov 25, 2023 at 11:16:14PM +0100, Omar Polo wrote: > On 2023/11/25 21:39:16 +0100, Stefan Sperling wrote: > > It looks like the problem is a call to refresh_packidx_paths() within > > functions called while looping over repo->pathidx_paths: > > > > got_object_get_type() -> got_object_open() -> got_object_open_packed() -> > > got_repo_search_packidx() -> refresh_packidx_paths() > > > > The patch below should fix it. This is a quick fix just for this code path. > > Other places looping over this list don't seem to be doing anything dangerous. > > This could be improved upon later but I think it is good enough for now. > > > > ok? > > ok op@ as a quick measure. Here is an alternative fix, with two improvements: We can use the 'objects/pack' directory timestamp as a state flag. We should retry the entire loop if pack files have changed. Otherwise we might end up trying to open up a pack file which has been deleted, or we might miss a new pack file which has been added. I've set an arbitrary limit on the number of retries to avoid endless looping caused by constant concurrent modifications. diff /home/stsp/src/got commit - 2bde3e78a5bd6619af838df19eec530e23783c0b path + /home/stsp/src/got blob - b3b8dfbf8803c64c39fe81209b84b6fa8fd4d79a file + lib/repository.c --- lib/repository.c +++ lib/repository.c @@ -1769,6 +1769,9 @@ match_packed_object(struct got_object_id **unique_id, const struct got_error *err = NULL; struct got_object_id_queue matched_ids; struct got_pathlist_entry *pe; + struct timespec tv; + int retries = 0; + const int max_retries = 10; STAILQ_INIT(&matched_ids); @@ -1776,11 +1779,42 @@ match_packed_object(struct got_object_id **unique_id, if (err) return err; + /* + * Opening objects while iterating over the pack-index path + * list is racy. If the set of pack files in the repository + * changes during loop iteration, refresh_packidx_paths() will + * be called again, via got_object_get_type(), invalidating + * the packidx_paths list we are iterating over. + * To work around this we keep track of the current modification + * time and retry the entire loop if it changes. + */ +retry: + tv.tv_sec = repo->pack_path_mtime.tv_sec; + tv.tv_nsec = repo->pack_path_mtime.tv_nsec; + TAILQ_FOREACH(pe, &repo->packidx_paths, entry) { - const char *path_packidx = pe->path; + const char *path_packidx; struct got_packidx *packidx; struct got_object_qid *qid; + /* + * If the modification time of the 'objects/pack' directory + * has changed then 'pe' could now be an invalid pointer. + */ + if (tv.tv_sec != repo->pack_path_mtime.tv_sec || + tv.tv_nsec != repo->pack_path_mtime.tv_nsec) { + if (++retries > max_retries) { + err = got_error_msg(GOT_ERR_TIMEOUT, + "too many concurrent pack file " + "modifications"); + goto done; + } + got_object_id_queue_free(&matched_ids); + goto retry; + } + + path_packidx = pe->path; + err = got_packidx_open(&packidx, got_repo_get_fd(repo), path_packidx, 0); if (err)