From: Christian Weisgerber Subject: Re: deltify addblk() fixes To: gameoftrees@openbsd.org Date: Wed, 9 Jun 2021 17:15:35 +0200 Stefan Sperling: > Fix deltify addblk() list size confusion: We only have 'old_size' number > of blocks to rehash after growing the array. That fix is correct. > And only extend the array slightly when we outgrow it, instead of doubling > the array's size each time Isn't the underlying idea that the array is somewhat sparsely populated to that h % dt->nalloc is likely to hit an empty slot on insert and the right slot on lookup? Anyway, this doesn't look right: > dt->nblocks++; > - if (dt->nalloc < 2 * dt->nblocks) { > + if (dt->nalloc < dt->nblocks + 64) { > struct got_delta_block *db; > - nalloc = dt->nalloc * 2; > + size_t old_size = dt->nalloc; > db = dt->blocks; > - dt->blocks = calloc(nalloc, sizeof(struct got_delta_block)); > + dt->blocks = calloc(dt->nblocks + 64, sizeof(struct got_delta_block)); > if (dt->blocks == NULL) { > err = got_error_from_errno("calloc"); > dt->blocks = db; > return err; > } > - dt->nalloc = nalloc; > + dt->nalloc = dt->nblocks + 64; Assume that dt->nalloc is 100. Then as soon as dt->nblock hits 37, the if triggers (100 < 101). We then allocate 101 blocks... Therefore, each addblk() call now extends the array by 1 block. I think you meant calloc(dt->nalloc + 64, ...) dt->nalloc += 64 While we're here, I'm also puzzled by this part: * Avoid adding duplicate blocks. * NB: A matching hash is insufficient for detecting equality. * The hash can only detect inequality, so only check 'len'. */ if (len == dt->blocks[i].len) { ... if (memcmp(buf, buf2, len) == 0) return NULL; } If the hash can only detect inequality, shouldn't we still check it if (len == dt->blocks[i].len && h == dt->blocks[i].hash) { to skip expensive compares? -- Christian "naddy" Weisgerber naddy@mips.inka.de