Issue with gotd and cloning large repos
Stefan Sperling <stsp@stsp.name> wrote:
> On Fri, Jul 18, 2025 at 09:50:37AM -0700, jrmu@ircnow.org wrote:
> [..]
> 
> The problem is that we are reusing a temporary file without seeking
> back to the beginning of the file. As a result, the file gets extended,
> rather than rewritten, and will contain data from multiple objects
> rather than just the object we want to open. Therefore, the file size
> on disk reported by stat(2) does not match what is expected, which gets
> reported as "raw object has unexpected size".
> 
> This only triggers on large files because smaller files use a different
> code path which keeps all the data in memory, avoiding temp files entirely.
> 
> I haven't been able to trigger this specific problem by writing a regression
> test. Can anyone else manage to do that?
> 
> I ended up finding another bug with the regression test I wrote, however.
> More on that soon.
Ouch.
> M  lib/repository.c  |  2+  0-
> 
> 1 file changed, 2 insertions(+), 0 deletions(-)
> 
> commit - 529f16d393fbab504621b40449967a0a33f4041c
> commit + 5344003c15642e6213a5e80e2d1359a9d1c564bf
> blob - 8d2c4a28815971860618446c5701073d301b1ee6
> blob + 175996e19e6d5ed9c58c09fb0722aef4dab5cd75
> --- lib/repository.c
> +++ lib/repository.c
> @@ -399,6 +399,8 @@ got_repo_temp_fds_get(int *fd, int *idx, struct got_re
>  		if (repo->tempfiles[i] != -1) {
>  			if (ftruncate(repo->tempfiles[i], 0L) == -1)
>  				return got_error_from_errno("ftruncate");
> +			if (lseek(repo->tempfiles[i], 0L, SEEK_SET) == -1)
> +				return got_error_from_errno("lseek");
>  			*fd = repo->tempfiles[i];
>  			*idx = i;
>  			repo->tempfile_use_mask |= (1 << i);
ok op@