"GOT", but the "O" is a cute, smiling pufferfish. Index | Thread | Search

From:
Mark Jamsek <mark@jamsek.com>
Subject:
Re: optimise reading the file index
To:
Stefan Sperling <stsp@stsp.name>
Cc:
gameoftrees@openbsd.org
Date:
Fri, 28 Jul 2023 00:09:53 +1000

Download raw body.

Thread
  • Mark Jamsek:

    optimise reading the file index

  • Christian Weisgerber:

    optimise reading the file index

  • Stefan Sperling <stsp@stsp.name> wrote:
    > On Thu, Jul 27, 2023 at 06:04:21PM +1000, Mark Jamsek wrote:
    > > This diff is borne from a discussion with stsp on irc, and attempts to
    > > make reading the file index faster, which should improve many operations.
    > > 
    > > The main change is that we memory map the file index to avoid lots of
    > > small reads that can be costly in large repos, for example, with many
    > > files. And we also reduce multiple small allocations down to one
    > > allocation for the entire padded path length when building each index
    > > entry. If memory mapping the file fails, however, we fall back to the
    > > file io status quo.
    > > 
    > > Regress is still happy, but I haven't properly profiled this yet. Some
    > > trivial measurements show a marked improvement, and it appears to be
    > > perceptibly faster; however, my machine is already quite fast even with
    > > 'got status' in src.git so I'm looking for wider testing to see if this
    > > is indeed worth it. In theory it should be, but I'd rather have some
    > > good measurements.
    > 
    > I am not opposed to using mmap to read the file index if that is indeed
    > the fastest approach. However, keep in mind that OpenBSD has no unified
    > buffer cache, which means we cannot mix mmap with regular read/write
    > operations on the same file. Probably not an issue in this case since
    > the file index is read completely before being modified and then written
    > out again. But this also suggests that mmap might not be the best approach.
    > 
    > mmap is most useful for us when running a series of random access reads,
    > e.g. while reading pack index and pack files. If we compare the performance
    > benefits of your diff to an operation which reads pack files a lot, such
    > as gotadmin pack -a or gotadmin indexpack, between a -DGOT_PACK_NO_MMAP build
    > and a regular build of Got, I would expect the relative difference that mmap
    > makes in your proposed diff to be smaller (in theory; I haven't actually tried
    > this for lack of time).
    > 
    > The file index is not very large. Currently about 12MB for src, 8MB for ports.
    > Even on memory-constrained machines we could probably read the whole file into
    > memory in one or a few read() calls and parse the data in a temporary malloc
    > buffer. I suspect this would yield higher speed than small reads via mmap.
    > mmap is probably faster than the current code only because with mmap we are
    > asking the kernel to read page-sized blocks from the file, instead of tiny
    > amounts like sizeof(uint32_t), sizeof(uint64_t) sized-chunks. (Not sure what
    > the minimun size of a read in the kernel is, probably a 512 bytes sector?
    > Could be less than a page.)
    > In any case, the main overhead of the current code should be the amount of
    > read syscalls it triggers, with the buffer cache compensating somewhat for
    > our tiny read requests.
    
    After looking at the size of the index in src.git, that was my first
    instinct too, but then I thought there may well be got users with much
    larger indexes. Assuming there is, though, I think it's safe to assume
    that they still wouldn't be "that" big? So, sure, it should be fine to
    allocate the entire file. It's a trivial change in any case.
    
    Interestingly, in my profiling it is about the same as mmap, though
    still faster than the status quo.
    
    
    Mark Jamsek:
    >
    >> The main change is that we memory map the file index to avoid lots of
    >> small reads that can be costly in large repos, for example, with many
    >> files.
    >
    >I'd think that "got status" in /usr/src is dominated by the 92,300
    >fstatat() calls...
    >
    >I tried this (cd /usr/src; got st) on an APU2, which qualifies as
    >slow for amd64 purposes:
    >
    >0.92-current, three runs:
    >    0m19.97s real     0m08.82s user     0m11.16s system
    >    0m19.95s real     0m08.56s user     0m10.93s system
    >    0m19.99s real     0m08.74s user     0m10.96s system
    >
    >+ patch, three runs:
    >    0m19.28s real     0m08.21s user     0m10.96s system
    >    0m19.14s real     0m08.02s user     0m10.76s system
    >    0m19.33s real     0m07.72s user     0m11.42s system
    >
    >I think we need more measurements to be certain that there is any
    >effect at all. :->
    
    heh! :D
    
    I'm showing an average of 30-40% faster with an -02 release build when
    compared to main; op is seeing about the same as me both on ssd and
    spinning rust. It _should_ be a little faster but how much, I guess it
    depends?
    
    It would be interesting if the below diff is the same on your APU2 as
    the memory mapped file.
    
    >
    >I still have a code comment:
    >
    >> +static const struct got_error *
    >> +mread_fileindex_path(char **path, struct got_hash *ctx, const uint8_t *map,
    >> +    size_t mapsz, size_t *offset)
    >> +{
    >> +	const uint8_t	*p, *nul;
    >> +	size_t		 len, pad, pathlen;
    >> +
    >> +	if (mapsz < *offset)
    >> +		return got_error(GOT_ERR_FILEIDX_BAD);
    >> +
    >> +	p = map + *offset;
    >> +
    >> +	nul = memchr(p, '\0', mapsz);
    >> +	len = nul - p;
    >> +	pad = 8 - len % 8;
    >> +
    >> +	pathlen = len + pad;
    >> +
    >> +	if (mapsz < *offset + pathlen)
    >> +		return got_error(GOT_ERR_FILEIDX_BAD);
    >
    >If the file index is corrupt, the memchr() can overrun the mapped
    >area and trigger a segfault.  The max length should be something
    >like map + mapz - p.
    
    Nice catch! You are right; I've fixed it with mapsz - *offset (which is
    the same N). Thanks, naddy :)
    
    
    diffstat c2ba7aa6808a2583895c024f5c85fff03948494e 1f5993e02a319fbcf02f9dbcd2926397fc7b3462
     M  lib/fileindex.c          |  226+  23-
     M  lib/got_lib_fileindex.h  |    2+   1-
     M  lib/worktree.c           |   36+   5-
     M  lib/worktree_cvg.c       |   37+   5-
    
    4 files changed, 301 insertions(+), 34 deletions(-)
    
    diff c2ba7aa6808a2583895c024f5c85fff03948494e 1f5993e02a319fbcf02f9dbcd2926397fc7b3462
    commit - c2ba7aa6808a2583895c024f5c85fff03948494e
    commit + 1f5993e02a319fbcf02f9dbcd2926397fc7b3462
    blob - e989c6b7cd01f55aa6173b8bbf04441802b11fe6
    blob + 1cf015d945052cca6c3b80265eed6285ce7bfa73
    --- lib/fileindex.c
    +++ lib/fileindex.c
    @@ -21,6 +21,7 @@
     #include <errno.h>
     #include <dirent.h>
     #include <fcntl.h>
    +#include <stddef.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <string.h>
    @@ -537,6 +538,187 @@ read_fileindex_val64(uint64_t *val, struct got_hash *c
     }
     
     static const struct got_error *
    +mread_fileindex_val64(uint64_t *val, struct got_hash *ctx, const uint8_t *p,
    +    size_t psz, size_t *offset)
    +{
    +	if (psz < *offset + sizeof(*val))
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	memcpy(val, p + *offset, sizeof(*val));
    +	got_hash_update(ctx, val, sizeof(*val));
    +	*val = be64toh(*val);
    +	*offset += sizeof(*val);
    +
    +	return NULL;
    +}
    +
    +static const struct got_error *
    +mread_fileindex_val32(uint32_t *val, struct got_hash *ctx, const uint8_t *p,
    +    size_t psz, size_t *offset)
    +{
    +	if (psz < *offset + sizeof(*val))
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	memcpy(val, p + *offset, sizeof(*val));
    +	got_hash_update(ctx, val, sizeof(*val));
    +	*val = be32toh(*val);
    +	*offset += sizeof(*val);
    +
    +	return NULL;
    +}
    +
    +static const struct got_error *
    +mread_fileindex_val16(uint16_t *val, struct got_hash *ctx, const uint8_t *p,
    +    size_t psz, size_t *offset)
    +{
    +	if (psz < *offset + sizeof(*val))
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	memcpy(val, p + *offset, sizeof(*val));
    +	got_hash_update(ctx, val, sizeof(*val));
    +	*val = be16toh(*val);
    +	*offset += sizeof(*val);
    +
    +	return NULL;
    +}
    +
    +
    +static const struct got_error *
    +mread_fileindex_sha1bin(uint8_t *val, struct got_hash *ctx, const uint8_t *p,
    +    size_t psz, size_t *offset)
    +{
    +	if (psz < *offset + SHA1_DIGEST_LENGTH)
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	memcpy(val, p + *offset, SHA1_DIGEST_LENGTH);
    +	got_hash_update(ctx, val, SHA1_DIGEST_LENGTH);
    +	*offset += SHA1_DIGEST_LENGTH;
    +
    +	return NULL;
    +}
    +
    +static const struct got_error *
    +mread_fileindex_path(char **path, struct got_hash *ctx, const uint8_t *p,
    +    size_t psz, size_t *offset)
    +{
    +	const uint8_t	*ptr, *nul;
    +	size_t		 len, pad, pathlen;
    +
    +	if (psz < *offset)
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	ptr = p + *offset;
    +
    +	nul = memchr(ptr, '\0', psz - *offset);
    +	len = nul - ptr;
    +	pad = 8 - len % 8;
    +
    +	pathlen = len + pad;
    +
    +	if (psz < *offset + pathlen)
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	*path = malloc(pathlen);
    +	if (*path == NULL)
    +		return got_error_from_errno("malloc");
    +
    +	memcpy(*path, ptr, pathlen);
    +	got_hash_update(ctx, *path, pathlen);
    +
    +	*offset += pathlen;
    +
    +	return NULL;
    +}
    +
    +static const struct got_error *
    +mread_fileindex_entry(struct got_fileindex_entry **iep, struct got_hash *ctx,
    +    const uint8_t *ptr, size_t ptrsz, uint32_t version, size_t *offset)
    +{
    +	const struct got_error		*err;
    +	struct got_fileindex_entry	*ie;
    +	const uint8_t			*p;
    +	size_t				 n = 0;
    +
    +	*iep = NULL;
    +
    +	if (ptrsz < *offset)
    +		return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +	p = ptr + *offset;
    +
    +	ie = calloc(1, sizeof(*ie));
    +	if (ie == NULL)
    +		return got_error_from_errno("calloc");
    +
    +	err = mread_fileindex_val64(&ie->ctime_sec, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +	err = mread_fileindex_val64(&ie->ctime_nsec, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +	err = mread_fileindex_val64(&ie->mtime_sec, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +	err = mread_fileindex_val64(&ie->mtime_nsec, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +
    +	err = mread_fileindex_val32(&ie->uid, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +	err = mread_fileindex_val32(&ie->gid, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +	err = mread_fileindex_val32(&ie->size, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +
    +	err = mread_fileindex_val16(&ie->mode, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +
    +	err = mread_fileindex_sha1bin(ie->blob_sha1, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +	err = mread_fileindex_sha1bin(ie->commit_sha1, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +
    +	err = mread_fileindex_val32(&ie->flags, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +
    +	err = mread_fileindex_path(&ie->path, ctx, p, ptrsz, &n);
    +	if (err != NULL)
    +		goto done;
    +
    +	if (version >= 2) {
    +		uint32_t stage;
    +
    +		stage = got_fileindex_entry_stage_get(ie);
    +		if (stage == GOT_FILEIDX_STAGE_MODIFY ||
    +		    stage == GOT_FILEIDX_STAGE_ADD) {
    +			err = mread_fileindex_sha1bin(ie->staged_blob_sha1,
    +			    ctx, p, ptrsz, &n);
    +			if (err != NULL)
    +				goto done;
    +		}
    +	} else {
    +		/* GOT_FILE_INDEX_VERSION 1 does not support staging. */
    +		ie->flags &= ~GOT_FILEIDX_F_STAGE;
    +	}
    +
    +	*offset += n;
    +
    +done:
    +	if (err != NULL)
    +		got_fileindex_entry_free(ie);
    +	else
    +		*iep = ie;
    +	return err;
    +}
    +
    +static const struct got_error *
     read_fileindex_val64(uint64_t *val, struct got_hash *ctx, FILE *infile)
     {
     	size_t n;
    @@ -702,7 +884,8 @@ got_fileindex_read(struct got_fileindex *fileindex, FI
     }
     
     const struct got_error *
    -got_fileindex_read(struct got_fileindex *fileindex, FILE *infile)
    +got_fileindex_read(struct got_fileindex *fileindex, FILE *infile,
    +    const uint8_t *p, size_t psz)
     {
     	const struct got_error *err = NULL;
     	struct got_fileindex_hdr hdr;
    @@ -710,29 +893,38 @@ got_fileindex_read(struct got_fileindex *fileindex, FI
     	struct got_fileindex_entry *ie;
     	uint8_t sha1_expected[SHA1_DIGEST_LENGTH];
     	uint8_t sha1[SHA1_DIGEST_LENGTH];
    -	size_t n;
    +	size_t n, offset;
     	int i;
     
     	got_hash_init(&ctx, GOT_HASH_SHA1);
     
    -	n = fread(&hdr.signature, 1, sizeof(hdr.signature), infile);
    -	if (n != sizeof(hdr.signature)) {
    -		if (n == 0) /* EOF */
    -			return NULL;
    -		return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    +	if (p != NULL) {
    +		offset = offsetof(struct got_fileindex_hdr, sha1);
    +		if (psz < offset)
    +			return got_error(GOT_ERR_FILEIDX_BAD);
    +
    +		/* copy hdr up to and including nentries */
    +		memcpy(&hdr, p, offset);
    +	} else {
    +		n = fread(&hdr.signature, 1, sizeof(hdr.signature), infile);
    +		if (n != sizeof(hdr.signature)) {
    +			if (n == 0) /* EOF */
    +				return NULL;
    +			return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    +		}
    +		n = fread(&hdr.version, 1, sizeof(hdr.version), infile);
    +		if (n != sizeof(hdr.version)) {
    +			if (n == 0) /* EOF */
    +				return NULL;
    +			return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    +		}
    +		n = fread(&hdr.nentries, 1, sizeof(hdr.nentries), infile);
    +		if (n != sizeof(hdr.nentries)) {
    +			if (n == 0) /* EOF */
    +				return NULL;
    +			return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    +		}
     	}
    -	n = fread(&hdr.version, 1, sizeof(hdr.version), infile);
    -	if (n != sizeof(hdr.version)) {
    -		if (n == 0) /* EOF */
    -			return NULL;
    -		return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    -	}
    -	n = fread(&hdr.nentries, 1, sizeof(hdr.nentries), infile);
    -	if (n != sizeof(hdr.nentries)) {
    -		if (n == 0) /* EOF */
    -			return NULL;
    -		return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    -	}
     
     	got_hash_update(&ctx, &hdr.signature, sizeof(hdr.signature));
     	got_hash_update(&ctx, &hdr.version, sizeof(hdr.version));
    @@ -748,7 +940,12 @@ got_fileindex_read(struct got_fileindex *fileindex, FI
     		return got_error(GOT_ERR_FILEIDX_VER);
     
     	for (i = 0; i < hdr.nentries; i++) {
    -		err = read_fileindex_entry(&ie, &ctx, infile, hdr.version);
    +		if (p != NULL)
    +			err = mread_fileindex_entry(&ie, &ctx, p, psz,
    +			    hdr.version, &offset);
    +		else
    +			err = read_fileindex_entry(&ie, &ctx, infile,
    +			    hdr.version);
     		if (err)
     			return err;
     		err = add_entry(fileindex, ie);
    @@ -758,9 +955,15 @@ got_fileindex_read(struct got_fileindex *fileindex, FI
     		}
     	}
     
    -	n = fread(sha1_expected, 1, sizeof(sha1_expected), infile);
    -	if (n != sizeof(sha1_expected))
    -		return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    +	if (p != NULL) {
    +		if (psz < offset + sizeof(sha1_expected))
    +			return got_error(GOT_ERR_FILEIDX_BAD);
    +		memcpy(sha1_expected, p + offset, sizeof(sha1_expected));
    +	} else {
    +		n = fread(sha1_expected, 1, sizeof(sha1_expected), infile);
    +		if (n != sizeof(sha1_expected))
    +			return got_ferror(infile, GOT_ERR_FILEIDX_BAD);
    +	}
     	got_hash_final(&ctx, sha1);
     	if (memcmp(sha1, sha1_expected, SHA1_DIGEST_LENGTH) != 0)
     		return got_error(GOT_ERR_FILEIDX_CSUM);
    blob - 7045733009e3c274bef6f25b589a001bff3de260
    blob + c2e4498c52366b3ce1d284297d23ff4006158aed
    --- lib/got_lib_fileindex.h
    +++ lib/got_lib_fileindex.h
    @@ -123,7 +123,8 @@ const struct got_error *got_fileindex_read(struct got_
         struct got_fileindex_entry *);
     struct got_fileindex_entry *got_fileindex_entry_get(struct got_fileindex *,
         const char *, size_t);
    -const struct got_error *got_fileindex_read(struct got_fileindex *, FILE *);
    +const struct got_error *got_fileindex_read(struct got_fileindex *, FILE *,
    +    const uint8_t *, size_t);
     typedef const struct got_error *(*got_fileindex_cb)(void *,
         struct got_fileindex_entry *);
     const struct got_error *got_fileindex_for_each_entry_safe(
    blob - 2cdafab5a89da7e31a55bc887af42b14226637d8
    blob + f5ae973cdf877ffbf50ff9e4d4db224a5344ae36
    --- lib/worktree.c
    +++ lib/worktree.c
    @@ -20,8 +20,8 @@
     
     #include <dirent.h>
     #include <limits.h>
    -#include <stddef.h>
     #include <string.h>
    +#include <stdint.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <time.h>
    @@ -2508,6 +2508,8 @@ open_fileindex(struct got_fileindex **fileindex, char 
     {
     	const struct got_error *err = NULL;
     	FILE *index = NULL;
    +	struct stat st;
    +	uint8_t *p = NULL;
     
     	*fileindex_path = NULL;
     	*fileindex = got_fileindex_alloc();
    @@ -2522,12 +2524,41 @@ open_fileindex(struct got_fileindex **fileindex, char 
     	if (index == NULL) {
     		if (errno != ENOENT)
     			err = got_error_from_errno2("fopen", *fileindex_path);
    -	} else {
    -		err = got_fileindex_read(*fileindex, index);
    -		if (fclose(index) == EOF && err == NULL)
    -			err = got_error_from_errno("fclose");
    +		goto done;
     	}
    +
    +	if (fstat(fileno(index), &st) == -1) {
    +		err = got_error_from_errno2("fstat", *fileindex_path);
    +		goto done;
    +	}
    +
    +	/* XXX if the file is > 1 GB, fallback to file io  */
    +	if (st.st_size > 0 && st.st_size <= 0x3fffffffL) {
    +		ssize_t r;
    +
    +		p = malloc(st.st_size);
    +		if (p == NULL) {
    +			err = got_error_from_errno("malloc");
    +			goto done;
    +		}
    +
    +		r = read(fileno(index), p, st.st_size);
    +		if (r != st.st_size) {
    +			if (r == -1)
    +				err = got_error_from_errno("read");
    +			else
    +				err = got_error(GOT_ERR_IO);
    +			goto done;
    +		}
    +	}
    +
    +	err = got_fileindex_read(*fileindex, index, p, st.st_size);
    +
     done:
    +	if (index != NULL && fclose(index) == EOF && err == NULL)
    +		err = got_error_from_errno("fclose");
    +	if (p != NULL)
    +		free(p);
     	if (err) {
     		free(*fileindex_path);
     		*fileindex_path = NULL;
    blob - f488cd9dcd98b68c647973cb84a54e18daefced4
    blob + 0c0f118bbe3ebca17cf76bf44f243b7a83ff31b4
    --- lib/worktree_cvg.c
    +++ lib/worktree_cvg.c
    @@ -20,8 +20,8 @@
     
     #include <dirent.h>
     #include <limits.h>
    -#include <stddef.h>
     #include <string.h>
    +#include <stdint.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <time.h>
    @@ -567,6 +567,8 @@ open_fileindex(struct got_fileindex **fileindex, char 
     {
     	const struct got_error *err = NULL;
     	FILE *index = NULL;
    +	struct stat st;
    +	uint8_t *p = NULL;
     
     	*fileindex_path = NULL;
     	*fileindex = got_fileindex_alloc();
    @@ -581,12 +583,41 @@ open_fileindex(struct got_fileindex **fileindex, char 
     	if (index == NULL) {
     		if (errno != ENOENT)
     			err = got_error_from_errno2("fopen", *fileindex_path);
    -	} else {
    -		err = got_fileindex_read(*fileindex, index);
    -		if (fclose(index) == EOF && err == NULL)
    -			err = got_error_from_errno("fclose");
    +		goto done;
     	}
    +
    +	if (fstat(fileno(index), &st) == -1) {
    +		err = got_error_from_errno2("fstat", *fileindex_path);
    +		goto done;
    +	}
    +
    +	/* XXX if the file is > 1 GB, fallback to file io  */
    +	if (st.st_size > 0 && st.st_size <= 0x3fffffffL) {
    +		ssize_t r;
    +
    +		p = malloc(st.st_size);
    +		if (p == NULL) {
    +			err = got_error_from_errno("malloc");
    +			goto done;
    +		}
    +
    +		r = read(fileno(index), p, st.st_size);
    +		if (r != st.st_size) {
    +			if (r == -1)
    +				err = got_error_from_errno("read");
    +			else
    +				err = got_error(GOT_ERR_IO);
    +			goto done;
    +		}
    +	}
    +
    +	err = got_fileindex_read(*fileindex, index, p, st.st_size);
    +
     done:
    +	if (index != NULL && fclose(index) == EOF && err == NULL)
    +		err = got_error_from_errno("fclose");
    +	if (p != NULL)
    +		free(p);
     	if (err) {
     		free(*fileindex_path);
     		*fileindex_path = NULL;
    @@ -596,6 +627,7 @@ static const struct got_error *
     	return err;
     }
     
    +
     static const struct got_error *
     sync_fileindex(struct got_fileindex *fileindex, const char *fileindex_path)
     {
    
    
    -- 
    Mark Jamsek <https://bsdbox.org>
    GPG: F2FF 13DE 6A06 C471 CA80  E6E2 2930 DC66 86EE CF68
    
  • Mark Jamsek:

    optimise reading the file index

  • Christian Weisgerber:

    optimise reading the file index