From: Stefan Sperling Subject: Re: tog: fix race condition between main and log thread To: Mark Jamsek Cc: gameoftrees@openbsd.org Date: Mon, 30 Dec 2024 16:48:16 +0100 On Mon, Dec 30, 2024 at 11:38:53PM +1100, Mark Jamsek wrote: > This got lost on IRC the other day when I should have sent it to the > list: stsp reported that the tog test_log_mark_keymap regression test > would randomly hang, which I was able to reproduce. > > This is due to a race condition between the main thread checking the > log_complete flag and the log thread setting it. When the F keymap is > entered to toggle fullscreen, the main thread might need more commits > queued to fully populate the log view when redrawing the screen. So it > first checks the log_complete flag to see if the commit graph has > already been traversed and, if so, doesn't bother waking the log thread > to request any more commits. But if the log_complete flag isn't set, it > does wake the log thread and than waits for it to signal when a commit > is loaded. > > In the log thread, after we have finished traversing the commit graph, > we briefly release the mutex before setting the log_complete flag and > this causes the race: the main thread checks and the flag is unset so > it wakes the log thread and then waits on the log thread in > trigger_log_thread() to signal when a new commit is loaded. > In between the main thread checking the flag, the log thread exits > the loop and sets the flag. > > The fix sets the flag in the loop before releasing the mutex. > And because we can exit the loop on error or receipt of a fatal signal, > we also keep the existing log_complete assignment outside the loop. ok, thanks for tracking this down!