"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: patch: regress: Work around glitchy clocks in vmm guests
To:
Steven McDonald <steven@steven-mcdonald.id.au>
Cc:
gameoftrees@openbsd.org
Date:
Sun, 29 Sep 2019 11:53:18 +0200

Download raw body.

Thread
On Sun, Sep 29, 2019 at 12:33:13AM +0200, Steven McDonald wrote:
> Some tests fail intermittently in vmm because the clock does not
> advance between steps, resulting in files that should have a newer
> mtime than the repo metadata but don't. Example failure:
> 
>   test_status_shows_local_mods_after_update --- /tmp/got-test-status_shows_local_mods_after_update-duAj87aL/stdout.expected       Sun Sep 29 00:18:36 2019
>   +++ /tmp/got-test-status_shows_local_mods_after_update-duAj87aL/stdout  Sun Sep 29 00:18:36 2019
>   @@ -1 +0,0 @@
>   -M  numbers
>   test failed; leaving test data in /tmp/got-test-status_shows_local_mods_after_update-duAj87aL
> 
> Comparing the mtime on the 'numbers' file with that in .got/file-index
> shows that the timestamps are identical, which is why got does not
> treat this file as modified.
> 
> This intermittent behaviour can be observed outside the context of got
> (here, x is a small program that just calls stat(2) on foo and displays
> the mtime):
> 
>   $ echo a >foo; ./x; echo b >foo; ./x 
>   1569708831.911759175
>   1569708831.931539267
>   $ echo a >foo; ./x; echo b >foo; ./x 
>   1569708832.741842903
>   1569708832.741842903
> 
> This is unlikely to be a problem in real-world usage because humans
> aren't as fast as these regression tests, so I think working around it
> in the tests is sufficient.
> 
> The below diff makes this work consistently; sleeping for a nanosecond
> is sufficient because we just need the clock to advance by any amount
> at all.
> 
> Better suggestions/OKs?

Subversion handles this race in C code by sleeping for a short amount
of time to ensure that files modified after it exits have a different
time stamp from the one which was recorded in meta data.

Docs: https://svn.apache.org/viewvc/subversion/tags/1.10.6/subversion/include/svn_io.h?revision=1821224&view=markup#l580

Implementation: https://svn.apache.org/viewvc/subversion/tags/1.10.6/subversion/libsvn_subr/io.c?revision=1865520&view=markup#l1321

Note that SVN has to worry about filesystems without sub-second timestamp
resolution, e.g. VFAT which has a modified timestamp resolution of 2 seconds.
We can probably assume nanosecond timestamps on OpenBSD? Note the comments
in SVN's code regarding Linux ext3/4, though. Apparently 10ms would be safe?

SVN disables sleep for timestamps in its regress test suite so that tests can
be run on VFAT in a reasonable amount of time. This means SVN's tests have to
always change the size of modified files to avoid the timestamp race, which
test authors (myself included) occasionally forget to do...

Writing reliable tests should be an easy thing to do, and your suggested
approach does not make it any easier. Also consider that the scope of this
problem is wider than our regress tests. Anyone writing shell scripts with
got will eventually hit this issue and would end up debugging it the hard way.

There's already a test which ensures that file size is changed to work around
spurious test failures, see below. I did this as a quick fix but should have
written a proper fix instead. This hack could be removed if a better solution
were implemented in C.

A good place to sleep might be at the bottom of worktree.c:sync_fileindex().
Do you want to try this?

FWIW, the problem fixed below was seen on bare iron, not in a VM.

-----------------------------------------------
commit 6c6b73bb412843c96a7dca913c41a91827d9018c
from: Stefan Sperling <stsp@stsp.name>
date: Sat Aug 10 15:24:59 2019 UTC
 
 fix race condition in test_revert_patch_one_change
 
diff ede14c87a98ea1c97ce15353442b532ac77a8770 9acaee36224e921ff553189e5a437768de635992
blob - d66beb7f0760252ff97e00a9b4415b40cf50049b
blob + 170bc3b29f399c15ed51b80c19922e0d5c6ba58b
--- regress/cmdline/revert.sh
+++ regress/cmdline/revert.sh
@@ -809,7 +809,10 @@ function test_revert_patch_one_change {
 		return 1
 	fi
 
-	sed -i -e 's/^2$/a/' $testroot/wt/numbers
+	# Ensure file size is changed. Avoids race condition causing test
+	# failures where 'got revert' does not see changes to revert if
+	# timestamps and size in stat info remain unchanged.
+	sed -i -e 's/^2$/aa/' $testroot/wt/numbers
 
 	# revert change with -p
 	printf "y\n" > $testroot/patchscript
@@ -826,7 +829,7 @@ function test_revert_patch_one_change {
 @@ -1,5 +1,5 @@
  1
 -2
-+a
++aa
  3
  4
  5
@@ -834,6 +837,13 @@ function test_revert_patch_one_change {
 M  numbers (change 1 of 1)
 revert this change? [y/n/q] y
 EOF
+	ret="$?"
+	if [ "$ret" != "0" ]; then
+		echo "got revert command failed unexpectedly" >&2
+		test_done "$testroot" "1"
+		return 1
+	fi
+
 	cmp -s $testroot/stdout.expected $testroot/stdout
 	ret="$?"
 	if [ "$ret" != "0" ]; then