From: Mark Jamsek Subject: Re: Poor fetching from gotd To: Christian Weisgerber , gameoftrees@openbsd.org Date: Thu, 9 Mar 2023 01:11:04 +1100 On 23-03-08 02:33PM, Stefan Sperling wrote: > On Wed, Mar 08, 2023 at 07:40:54AM +0100, Stefan Sperling wrote: > > On Mon, Mar 06, 2023 at 11:36:49AM +0100, Christian Weisgerber wrote: > > > Christian Weisgerber: > > > > > > > I have a mirror mode got.git repository sitting here that I switched > > > > from stsp_gitolite@git.gameoftrees.org to anonymous@got.gameoftrees.org > > > > a few days ago: > > > > > > I think that was a red herring. I have another repository... > > > > > > remote "origin" { > > > server stsp_gitolite@git.gameoftrees.org > > > protocol ssh > > > repository "/got-portable.git" > > > fetch_all_branches yes > > > } > > > > > > ... and it's showing the same thing. I noticed that the last few > > > fetches, although not slow, all downloaded suspiciously similar > > > amounts of data: > > > > > > -rw-r--r-- 1 naddy naddy 540647 Mar 3 22:52 pack-9e1a7580959dbe276a0c140fc4 > > > 7a9860866f813f.pack > > > -rw-r--r-- 1 naddy naddy 530179 Mar 4 00:38 pack-1218978dbc2a61745df48c0086 > > > 1a6bd0e5059c6b.pack > > > -rw-r--r-- 1 naddy naddy 505780 Mar 6 00:00 pack-80a1745fc7993cb8e5e9c280c1 > > > cfd2802f395be3.pack > > > -rw-r--r-- 1 naddy naddy 530300 Mar 6 01:50 pack-9fb8e2f88076f36acf3fc4fd0b > > > 16709ddf2de651.pack > > > > > > Comparing "gotadmin ls | cut -d' ' -f1" of those pack files shows > > > a lot of repeated hashes. > > > > That would suggest we have a bad bug in the client-side fetch code. > > But with the reproducer you sent me I can reproduce this problem against > > gotd, yet not against git-daemon/gitolite. > > > > fetch main and 0.85 tag from gotd: > > -rw-r--r-- 1 stsp stsp - 160K Mar 8 07:33 objects/pack/pack-e0eb4f7a0245f0ee2d3e71cbd9786edda115ffd7.idx > > -rw-r--r-- 1 stsp stsp - 3.4M Mar 8 07:33 objects/pack/pack-e0eb4f7a0245f0ee2d3e71cbd9786edda115ffd7.pack > > > > fetch main and 0.85 tag from git-daemon: > > -rw-r--r-- 1 stsp stsp - 5.7K Mar 8 07:36 objects/pack/pack-a8e14828f29e45b40c363ae55fe63ad901f3317b.idx > > -rw-r--r-- 1 stsp stsp - 302K Mar 8 07:36 objects/pack/pack-a8e14828f29e45b40c363ae55fe63ad901f3317b.pack > > The patch below fixes it. The root cause of this problem is in gotsh. > > If a common ancestor commit is found based on the first have-line > the client has sent, gotsh only passes on the first have-line on > to gotd's repo_read process. Which then ignores known commits listed > on all the other have-lines while building the pack file. > > By passing all have-lines we arrive at a reasonble pack file size. > > Clients are supposed to send a 'done' eventually so the protocol exchange > should still work as expected. Moving to state 'done' as soon as a common > ancestor was found was just an "optizmiation" and a bad idea in hindsight. > > server: 5202 commits colored, 169 objects found, deltify 100% > 343K fetched; indexing 100%; resolving deltas 100% > Fetched b6f6da5d12372b1479121e68b0b7f138db7babe0.pack > Updated refs/heads/main: cee3836880940ba0d1203e0682a43a680132bc27 > Created reference refs/tags/0.85: 6929bad98706048c725b9dbcd19c0ef7989c349c > $ ls -l objects/pack/pack-b6f6da5d12372b1479121e68b0b7f138db7babe0.pack > -rw-r--r-- 1 stsp stsp - 343K Mar 8 14:25 objects/pack/pack-b6f6da5d12372b1479121e68b0b7f138db7babe0.pack > $ > > 'git fetch' is likewise still happy with this change in place. > > ok? Nice! ok > diff /home/stsp/src/got > commit - cee3836880940ba0d1203e0682a43a680132bc27 > path + /home/stsp/src/got > blob - 78941b351a05889c4da728c53e1bc922f5d5123e > file + lib/serve.c > --- lib/serve.c > +++ lib/serve.c > @@ -964,8 +964,6 @@ serve_read(int infd, int outfd, int gotd_sock, const c > if (err) > goto done; > seen_have = 1; > - if (have_ack) > - curstate = STATE_EXPECT_DONE; > } > } else if (n == 5 && strncmp(buf, "done\n", 5) == 0) { > if (curstate != STATE_EXPECT_HAVE && > -- Mark Jamsek GPG: F2FF 13DE 6A06 C471 CA80 E6E2 2930 DC66 86EE CF68