On Sun, Nov 20, 2022 at 10:45:14AM +0100, Omar Polo wrote:
> just scratching an itch; there are plenty of web forges online and I
> don't have an account for every one of them (not that I want to). I
> can use git to clone, but then I'd miss all the niceties :)
> Diff below adds an initial read-only HTTP/S support for got fetch and
> clone. The code is incomplete, wip, etc... use it at your own risk.
> Sharing just in case somebody wants to play along.
> This is done with a new libexec helper "got-http" (an alternative name
> could be "got-dial-http"?) To minimize the changes needed to the dial
I would prefer got-http-fetch. We're probably never going to implement
send via HTTP. But even if we did, we could add a got-http-send program
and share some code between the two.
> and fetch_pack API I decided to write a helper that behaves like
> ssh(1) as far as got is concerned. Under the hood, it transforms what
> got asks into HTTP requests. Only the "smart" HTTP protocol is
> supported, the "dumb" one is not (as of now at least).
> The "smart" HTTP protocol behaves almost as git over ssh, but needs
> two HTTP requests:
> - a first one to do the "discovery" (see if the remote server is
> "smart") and fetch the refs
> - a POST where we send our have/want line and fetch the packfile
> The dumb one is just a bare git repo served via a web server (could be
> httpd(8)) and needs us to fetch all the objects manually and do the
> resolving by ourselves. To be fair I'm not thrilled at the idea of
> implementing it.
While big forges tend to implement the smart protocol for efficiency,
many self-hosted HTTP server setups use the dumb protocol because
this is much easier to set up on the server side.
This includes git.gameoftrees.org, which uses dumb HTTP only ;)
Feel free to use this server for manual testing.
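As a rough illustration of how the two flavours differ on the wire, here is a minimal sketch (Python, for brevity only; got itself is C). It assumes the endpoint layout described in Git's HTTP protocol documentation: a smart server answers the parameterized info/refs request with the content type application/x-git-upload-pack-advertisement, while a plain file server will typically ignore the query string and return the static info/refs file. The function names and the example.org URL are hypothetical, and a real dumb client would also need objects/info/packs and the pack files, which are not shown.

```python
# Sketch only: helper names are hypothetical, not part of got.
SMART_ADVERT = "application/x-git-upload-pack-advertisement"


def discovery_url(repo_url):
    """URL of the first ("discovery") GET request for a fetch."""
    return repo_url.rstrip("/") + "/info/refs?service=git-upload-pack"


def upload_pack_url(repo_url):
    """URL of the POST that carries the want/have lines (smart only)."""
    return repo_url.rstrip("/") + "/git-upload-pack"


def is_smart(content_type):
    """Decide smart vs. dumb from the discovery response's Content-Type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    return content_type.split(";")[0].strip().lower() == SMART_ADVERT


def loose_object_url(repo_url, object_id):
    """Where a dumb client would look for a loose object."""
    return "%s/objects/%s/%s" % (
        repo_url.rstrip("/"), object_id[:2], object_id[2:])


if __name__ == "__main__":
    base = "https://example.org/repo.git"  # hypothetical repository URL
    print(discovery_url(base))
    print(upload_pack_url(base))
    print(is_smart("text/plain; charset=utf-8"))
```

If is_smart() is false, the helper has to fall back to walking refs and objects itself, which is the part that cannot be hidden behind an ssh-like pipe.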
> got-http is pledged "stdio inet dns" and is not unveiled by default
> unlike the other libexec helpers. It also can't be sandboxed with
> capsicum(4) on FreeBSD and I don't want to go thru the pain of trying
> to sandbox it with landlock on linux (needs to access certs.pem and
> probably more stuff there?)
Don't worry too much about this, just do what is feasible.
We want to push people towards SSH anyway. I hope that big forges will
eventually make anonymous SSH possible, though they probably don't care.
> At the moment it "works." I managed to clone repos from github
> (including ports.git) and from sr.ht. Incremental fetches also seem
> to work, in part at least. There's still some bits of how the server
> replies that I'm not following. For example, here's an excerpt of a
> partial fetch:
> 00000000 30 30 30 38 4e 41 4b 0a 30 30 32 39 02 45 6e 75 |0008NAK.0029.Enu|
> 00000010 6d 65 72 61 74 69 6e 67 20 6f 62 6a 65 63 74 73 |merating objects|
> 00000020 3a 20 31 37 37 33 39 32 34 2c 20 64 6f 6e 65 2e |: 1773924, done.|
> 00000030 0a 30 30 32 36 02 43 6f 75 6e 74 69 6e 67 20 6f |.0026.Counting o|
> 000021a0 30 25 20 28 37 39 34 30 2f 37 39 34 30 29 2c 20 |0% (7940/7940), |
> 000021b0 64 6f 6e 65 2e 0a 30 30 31 30 01 50 41 43 4b 00 |done..0010.PACK.|
> 000021c0 00 00 02 00 1b 11 32 30 30 35 01 64 93 16 78 9c |......2005.d..x.|
> We get a NAK and then side-band info, which seems to confuse
> got-fetch-pack that expects at least one ACK. (see the XXX below.)
ACK means the server has found a common ancestor commit, usually based on
"have" lines the client has sent. But a server may also generate an ACK
just to let the client know it can stop sending "have" lines,
e.g. if the server is using the multi_ack capability (which got-fetch-pack
does not support). You should ensure that Git-protocol capability
announcements are exchanged properly between the client and the server.
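To make the excerpt easier to read, here is a minimal sketch of pkt-line framing and side-band demultiplexing, assuming protocol v0/v1 framing (4 hex digits of length that include the length field itself, "0000" as flush) and the usual band numbering (1 = pack data, 2 = progress, 3 = error). It is Python for brevity only and is not how got-fetch-pack is written; the function names are made up. The sample bytes are the first two packets from the dump above.

```python
# Sketch only: function names are hypothetical.

def read_pkt_lines(buf):
    """Yield pkt-line payloads from buf; None stands for a flush-pkt."""
    pos = 0
    while pos + 4 <= len(buf):
        length = int(buf[pos:pos + 4], 16)  # length counts its own 4 bytes
        if length == 0:
            yield None                      # "0000" flush-pkt
            pos += 4
            continue
        if length < 4 or pos + length > len(buf):
            raise ValueError("bad or truncated pkt-line")
        yield buf[pos + 4:pos + length]
        pos += length


def demux(payload):
    """Split a side-band payload into (band number, data)."""
    return payload[0], payload[1:]


if __name__ == "__main__":
    # First two packets of the excerpt: a NAK, then a side-band packet.
    sample = b"0008NAK\n0029\x02Enumerating objects: 1773924, done.\n"
    pkts = list(read_pkt_lines(sample))
    print(pkts[0])         # the NAK line, not side-band framed
    print(demux(pkts[1]))  # band 2 carries progress text
```

Read this way, the excerpt is a plain NAK followed by side-band progress packets and then pack data on band 1, so the reader has to accept a NAK with no ACK before the side-band stream starts.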
Git's HTTP protocol is a beast compared to the simpler Git TCP protocol.
I am not sure if your approach of implementing this as an ssh-style helper
will work. I suspect you'll want more control over HTTP protocol specifics,
given that we should support the dumb protocol, and that some http-specific
Git-protocol capabilities exist. The got-http-fetch helper will likely need
to replace got-fetch-pack entirely during http fetches, rather than wrap it.
Otherwise, it will be too difficult to debug and fix issues we will encounter.