"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Stefan Sperling <stsp@stsp.name>
Subject:
Re: reuse deltas while packing
To:
Christian Weisgerber <naddy@mips.inka.de>
Cc:
gameoftrees@openbsd.org
Date:
Mon, 7 Feb 2022 15:32:10 +0100

Download raw body.

Thread
On Mon, Feb 07, 2022 at 12:49:56PM +0100, Christian Weisgerber wrote:
> Christian Weisgerber:
> 
> > I cloned a pristine copy of FreeBSD src.git, and I'm currently
> > exploding the pack file with "git unpack-objects" into a few million
> > loose objects.  That should give me something to test.
> 
> Sorry, I didn't get as far as testing the diff, since the comparison
> run with the old code already suffered problems:
> 
> * My first attempt ran out of /tmp space.  4.8G was not enough.
> 
> * Related to that: With more /tmp space, "gotadmin pack" eventually
>   wrote a 4.7G pack file.  The original git clone of the repository
>   has only a 1.4G pack file.
> 
> * got-index-pack hung.

This patch fixes got-index-pack getting stuck.

Unfortunately, zlib uses an unsigned int to represent the number of
input bytes available in its input buffer, instead of a size_t.

Large mapped files can cause truncation during our assignment from
size_t to unsigned int when we update zlib's input length counter.
This results in z->avail_in being set to zero, even though plenty
of bytes are available for zlib to consume. We then loop forever
asking zlib to decompress another zero bytes during every iteration.

In gdb, the situation looks like this:

(gdb) n
252                             z->avail_in = len - *consumed;
(gdb) p len
$29 = 4294967296
(gdb) p *consumed
$30 = 0
(gdb) p z->avail_in
$31 = 0

Huh ?!? :-O

(gdb) p len - *consumed
$33 = 4294967296
(gdb) p /x len
$36 = 0x100000000
(gdb) p /x (unsigned int)len
$37 = 0x0
(gdb) p /x (unsigned long)len
$38 = 0x100000000

I have also spotted wrong updates of zb.outlen after growing the
output buffer. This is unrelated to the problem at hand but should
still be fixed. I would create a separate commit for those changes.

ok?


diff 0cc5a39811be39e9f2b47098ec3a38e8817ba0bc /home/stsp/src/got
blob - 6a151f9c54a8c0f4fc4fea9a4c91ecc7dec71aed
file + lib/deflate.c
--- lib/deflate.c
+++ lib/deflate.c
@@ -153,7 +153,10 @@ got_deflate_read_mmap(struct got_deflate_buf *zb, uint
 		size_t last_total_in = z->total_in;
 		if (z->avail_in == 0) {
 			z->next_in = map + offset + *consumed;
-			z->avail_in = len - *consumed;
+			if (len < (size_t)UINT_MAX)
+				z->avail_in = len - *consumed;
+			else
+				z->avail_in = UINT_MAX - *consumed;
 			if (z->avail_in == 0) {
 				/* EOF */
 				ret = deflate(z, Z_FINISH);
blob - b0426141094227c79e5dd1d6e3cd0e5c2a0b28f0
file + lib/inflate.c
--- lib/inflate.c
+++ lib/inflate.c
@@ -249,7 +249,10 @@ got_inflate_read_mmap(struct got_inflate_buf *zb, uint
 				break;
 			}
 			z->next_in = map + offset + *consumed;
-			z->avail_in = len - *consumed;
+			if (len < (size_t)UINT_MAX)
+				z->avail_in = len - *consumed;
+			else
+				z->avail_in = UINT_MAX - *consumed;
 		}
 		if (zb->csum) {
 			csum_in = z->next_in;
@@ -322,7 +325,6 @@ got_inflate_to_mem(uint8_t **outbuf, size_t *outlen,
 		if (zb.flags & GOT_INFLATE_F_HAVE_MORE) {
 			if (outbuf == NULL)
 				continue;
-			zb.outlen = (nbuf * GOT_INFLATE_BUFSIZE) - *outlen;
 			newbuf = reallocarray(*outbuf, ++nbuf,
 			    GOT_INFLATE_BUFSIZE);
 			if (newbuf == NULL) {
@@ -334,6 +336,7 @@ got_inflate_to_mem(uint8_t **outbuf, size_t *outlen,
 			}
 			*outbuf = newbuf;
 			zb.outbuf = newbuf + *outlen;
+			zb.outlen = (nbuf * GOT_INFLATE_BUFSIZE) - *outlen;
 		}
 	} while (zb.flags & GOT_INFLATE_F_HAVE_MORE);
 
@@ -382,7 +385,6 @@ got_inflate_to_mem_fd(uint8_t **outbuf, size_t *outlen
 		if (zb.flags & GOT_INFLATE_F_HAVE_MORE) {
 			if (outbuf == NULL)
 				continue;
-			zb.outlen = (nbuf * GOT_INFLATE_BUFSIZE) - *outlen;
 			newbuf = reallocarray(*outbuf, ++nbuf,
 			    GOT_INFLATE_BUFSIZE);
 			if (newbuf == NULL) {
@@ -394,6 +396,7 @@ got_inflate_to_mem_fd(uint8_t **outbuf, size_t *outlen
 			}
 			*outbuf = newbuf;
 			zb.outbuf = newbuf + *outlen;
+			zb.outlen = (nbuf * GOT_INFLATE_BUFSIZE) - *outlen;
 		}
 	} while (zb.flags & GOT_INFLATE_F_HAVE_MORE);