Issue with gotd and cloning large repos
Stefan Sperling <stsp@stsp.name> wrote:
> On Fri, Jul 18, 2025 at 09:50:37AM -0700, jrmu@ircnow.org wrote:
> [..]
>
> The problem is that we are reusing a temporary file without seeking
> back to the beginning of the file. As a result, the file gets extended,
> rather than rewritten, and will contain data from multiple objects
> rather than just the object we want to open. Therefore, the file size
> on disk reported by stat(2) does not match what is expected, which gets
> reported as "raw object has unexpected size".
>
> This only triggers on large files because smaller files use a different
> code path which keeps all the data in memory, avoiding temp files entirely.
>
> I haven't been able to trigger this specific problem by writing a regression
> test. Can anyone else manage to do that?
>
> I ended up finding another bug with the regression test I wrote, however.
> More on that soon.
ouch
> M lib/repository.c | 2+ 0-
>
> 1 file changed, 2 insertions(+), 0 deletions(-)
>
> commit - 529f16d393fbab504621b40449967a0a33f4041c
> commit + 5344003c15642e6213a5e80e2d1359a9d1c564bf
> blob - 8d2c4a28815971860618446c5701073d301b1ee6
> blob + 175996e19e6d5ed9c58c09fb0722aef4dab5cd75
> --- lib/repository.c
> +++ lib/repository.c
> @@ -399,6 +399,8 @@ got_repo_temp_fds_get(int *fd, int *idx, struct got_re
>  		if (repo->tempfiles[i] != -1) {
>  			if (ftruncate(repo->tempfiles[i], 0L) == -1)
>  				return got_error_from_errno("ftruncate");
> +			if (lseek(repo->tempfiles[i], 0L, SEEK_SET) == -1)
> +				return got_error_from_errno("lseek");
>  			*fd = repo->tempfiles[i];
>  			*idx = i;
>  			repo->tempfile_use_mask |= (1 << i);
ok op@
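
For anyone who wants to see the size mismatch outside of got, here is a
minimal standalone sketch. It is not got code: size_after_reuse() and its
parameters are made up for illustration. It relies only on the fact that
ftruncate(2) does not move the file offset, so the next write lands at
the old offset and the file size becomes the old offset plus the new data
length unless the descriptor is rewound with lseek(2) first.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of got.
 * Write first_len bytes to a fresh temp file, recycle the descriptor
 * with ftruncate(2), write second_len bytes, and return the size that
 * fstat(2) reports.  If do_seek is non-zero the offset is rewound
 * before the second write.  Returns -1 on error.
 */
static off_t
size_after_reuse(size_t first_len, size_t second_len, int do_seek)
{
	char path[] = "/tmp/reuse-demo.XXXXXX";
	char buf[64];
	struct stat sb;
	off_t ret = -1;
	int fd;

	if (first_len > sizeof(buf) || second_len > sizeof(buf))
		return -1;
	memset(buf, 'x', sizeof(buf));

	fd = mkstemp(path);
	if (fd == -1)
		return -1;
	unlink(path);	/* file goes away once fd is closed */

	/* First use of the temp file; offset ends up at first_len. */
	if (write(fd, buf, first_len) != (ssize_t)first_len)
		goto done;

	/* Recycle: size becomes 0, but the offset stays at first_len. */
	if (ftruncate(fd, 0) == -1)
		goto done;
	if (do_seek && lseek(fd, 0, SEEK_SET) == -1)
		goto done;

	/* Second use; without the lseek this write starts at first_len. */
	if (write(fd, buf, second_len) != (ssize_t)second_len)
		goto done;

	if (fstat(fd, &sb) == -1)
		goto done;
	ret = sb.st_size;
done:
	close(fd);
	return ret;
}
```

With 40 bytes written first and 10 bytes second, size_after_reuse(40, 10, 0)
should come back as 50, while size_after_reuse(40, 10, 1) should come back
as 10, which is the size a caller expecting only the second object would
look for.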