optimise reading the file index
Mark Jamsek:
> The main change is that we memory map the file index to avoid lots of
> small reads that can be costly in large repos, for example, with many
> files.
I'd think that "got status" in /usr/src is dominated by the 92,300
fstatat() calls...
I tried this (cd /usr/src; got st) on an APU2, which qualifies as
slow for amd64 purposes:
0.92-current, three runs:
0m19.97s real 0m08.82s user 0m11.16s system
0m19.95s real 0m08.56s user 0m10.93s system
0m19.99s real 0m08.74s user 0m10.96s system
+ patch, three runs:
0m19.28s real 0m08.21s user 0m10.96s system
0m19.14s real 0m08.02s user 0m10.76s system
0m19.33s real 0m07.72s user 0m11.42s system
I think we need more measurements to be certain that there is any
effect at all. :->
I still have a code comment:
> +static const struct got_error *
> +mread_fileindex_path(char **path, struct got_hash *ctx, const uint8_t *map,
> + size_t mapsz, size_t *offset)
> +{
> + const uint8_t *p, *nul;
> + size_t len, pad, pathlen;
> +
> + if (mapsz < *offset)
> + return got_error(GOT_ERR_FILEIDX_BAD);
> +
> + p = map + *offset;
> +
> + nul = memchr(p, '\0', mapsz);
> + len = nul - p;
> + pad = 8 - len % 8;
> +
> + pathlen = len + pad;
> +
> + if (mapsz < *offset + pathlen)
> + return got_error(GOT_ERR_FILEIDX_BAD);
If the file index is corrupt, the memchr() can overrun the mapped
area and trigger a segfault. The max length should be something
like map + mapsz - p.
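
To make the suggestion concrete, here is a minimal sketch of a bounded
search. The helper name bounded_path_len() and its int return convention
are hypothetical and not part of the patch; the real function would
return got_error(GOT_ERR_FILEIDX_BAD) instead of -1. It assumes map
points at mapsz readable bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: find the length of the
 * NUL-terminated path that starts at map[offset] without reading past
 * the end of the mapping.  Returns 0 and stores the length (excluding
 * the NUL) in *len, or -1 if offset is out of range or no NUL byte
 * exists in the remaining bytes.
 */
static int
bounded_path_len(const uint8_t *map, size_t mapsz, size_t offset, size_t *len)
{
	const uint8_t *p, *nul;

	/* offset == mapsz would leave zero bytes to search. */
	if (offset >= mapsz)
		return -1;

	p = map + offset;

	/* Search only the bytes that remain: map + mapsz - p. */
	nul = memchr(p, '\0', mapsz - offset);
	if (nul == NULL)
		return -1;	/* corrupt index: unterminated path */

	*len = (size_t)(nul - p);
	return 0;
}
```

With the search limited this way, a missing terminator shows up as a
NULL return from memchr() rather than a read beyond the mapping. The
later check against *offset + pathlen is still needed, because the
padding can extend past the NUL byte that was found.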
--
Christian "naddy" Weisgerber naddy@mips.inka.de