"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: gotwebd crash
To:
Omar Polo <op@omarpolo.com>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 26 Nov 2023 13:32:46 +0100

Download raw body.

Thread
On Sat, Nov 25, 2023 at 11:16:14PM +0100, Omar Polo wrote:
> On 2023/11/25 21:39:16 +0100, Stefan Sperling <stsp@stsp.name> wrote:
> > It looks like the problem is a call to refresh_packidx_paths() within
> > functions called while looping over repo->pathidx_paths:
> > 
> >  got_object_get_type() -> got_object_open() -> got_object_open_packed() ->
> >  got_repo_search_packidx() -> refresh_packidx_paths()
> > 
> > The patch below should fix it. This is a quick fix just for this code path.
> > Other places looping over this list don't seem to be doing anything dangerous.
> > This could be improved upon later but I think it is good enough for now.
> > 
> > ok?
> 
> ok op@ as a quick measure.

Here is an alternative fix, with two improvements:

We can use the 'objects/pack' directory timestamp as a state flag.

We should retry the entire loop if pack files have changed. Otherwise
we might end up trying to open up a pack file which has been deleted,
or we might miss a new pack file which has been added.
I've set an arbitrary limit on the number of retries to avoid endless
looping caused by constant concurrent modifications.

diff /home/stsp/src/got
commit - 2bde3e78a5bd6619af838df19eec530e23783c0b
path + /home/stsp/src/got
blob - b3b8dfbf8803c64c39fe81209b84b6fa8fd4d79a
file + lib/repository.c
--- lib/repository.c
+++ lib/repository.c
@@ -1769,6 +1769,9 @@ match_packed_object(struct got_object_id **unique_id,
 	const struct got_error *err = NULL;
 	struct got_object_id_queue matched_ids;
 	struct got_pathlist_entry *pe;
+	struct timespec tv;
+	int retries = 0;
+	const int max_retries = 10;
 
 	STAILQ_INIT(&matched_ids);
 
@@ -1776,11 +1779,42 @@ match_packed_object(struct got_object_id **unique_id,
 	if (err)
 		return err;
 
+	/*
+	 * Opening objects while iterating over the pack-index path
+	 * list is racy. If the set of pack files in the repository
+	 * changes during loop iteration, refresh_packidx_paths() will
+	 * be called again, via got_object_get_type(), invalidating
+	 * the packidx_paths list we are iterating over.
+	 * To work around this we keep track of the current modification
+	 * time and retry the entire loop if it changes.
+	 */
+retry:
+	tv.tv_sec = repo->pack_path_mtime.tv_sec;
+	tv.tv_nsec = repo->pack_path_mtime.tv_nsec;
+
 	TAILQ_FOREACH(pe, &repo->packidx_paths, entry) {
-		const char *path_packidx = pe->path;
+		const char *path_packidx;
 		struct got_packidx *packidx;
 		struct got_object_qid *qid;
 
+		/*
+		 * If the modification time of the 'objects/pack' directory
+		 * has changed then 'pe' could now be an invalid pointer.
+		 */
+		if (tv.tv_sec != repo->pack_path_mtime.tv_sec ||
+		    tv.tv_nsec != repo->pack_path_mtime.tv_nsec) {
+			if (++retries > max_retries) {
+				err = got_error_msg(GOT_ERR_TIMEOUT,
+				    "too many concurrent pack file "
+				    "modifications");
+				goto done;
+			}
+			got_object_id_queue_free(&matched_ids);
+			goto retry;
+		}
+
+		path_packidx = pe->path;
+
 		err = got_packidx_open(&packidx, got_repo_get_fd(repo),
 		    path_packidx, 0);
 		if (err)