From: Stefan Sperling Subject: Re: Got is kinda slow To: Christian Weisgerber , gameoftrees@openbsd.org Date: Mon, 17 Apr 2023 13:07:33 +0200 On Sun, Apr 16, 2023 at 05:02:52PM +0200, Stefan Sperling wrote: > On Sun, Apr 16, 2023 at 04:45:40PM +0200, Christian Weisgerber wrote: > > > The OpenBSD README explains how to do > > > a profile build and get a graph of the result. Not sure about -portable. > > > > I'll just copy the repository over to OpenBSD and try a profile > > build there. > > Thanks! Not having to run the profiling myself is a big time saver for me. > > It can be a bit tricky. Make sure to only build one of the binaries with > profiling enabled at a time. Otherwise the gmon.out file will be useless. The profile graph you sent me off-list shows that got-read-pack is spending quite some time in got_object_parse_tree_entry(). This happens because you are specifying a path to 'got log' which then has to filter commits based on changes made to the specified path. So opening and diffing trees becomes necessary. Running 'got log' on the root directory and filtering irrevelant commits externally somehow might be faster. The profiler breaks got_object_parse_tree_entry() run-time down as follows: 80% commit_traversal_request 70% open_tree 37% got_packfile_extract_object_to_mem # boils down to inflate(), nothing can be done here 44% got_object_parse_tree # optimization possible, maybe 20% mergesort -> 16% pte_cmp -> 20% got_path_cmp 16% parse_tree_entry() -> 4% memchr + 4% strnlen Optimizing parse_tree_entry() a little bit more might be possible. The patch below attempts to parse tree entires with just one pass over the buffer instead of calling strnlen and memchr in addition to parsing. But does this really make a significant difference? The mergesort case is too tricky to fix in one patch. The fundamental problem is that the tree entry order Got uses internally is subtly different to the order in which Git expects tree entries to appear in tree objects. For some reason, when a tree entry represents a directory, Git appends an invisible slash ('/') to the tree entry's name before sorting with strncmp order. This results in a situation where e.g. the directory 'got/' and the file got-version.mk will appear in a different order, depending on whether the entries are sorted following Git and Got (pathlist) semantics. At present, Got code which relies on Got's tree parser depends on entries being returned in pure strncmp() sort order, without imaginary trailing slashes. This order affects how trees are traversed while comparing them to on-disk directories in the work tree, for example. If tree entry order doesn't match entries of on-disk directories then this will easily lead to misbehaviour during 'got update' and any other commands which compare the work tree to an in-repository tree. This is why the got_pathlist data structure exists. It took quite some time to reach stable behaviour in this aspect of the code. I'd rather avoid having to touch this again. But I'm afraid the only reasonable way to avoid sorting in the tree parser would be to adapt Git's trailing-slash-on-directories everywhere and enforce this order on all trees and on on-disk directories as well (which are kind of arbitrary I suspect, though usually appear in strncmp order). This could be done. But it is more complicated than what we do now because a got_pathlist would need to know whether or not an entry represents a directory to insert it in the correct position. So this would merely move the problem out of the tree parser and compliate things elsewhere. The basic issue that Git trees do not agree with readdir and basic strncmp() will always remain. And Git is likely stuck with its weird ordering because the order of entries factors into tree object IDs... There are also callers which use binary search to find tree entries, which also rely on strncmp() order and would need to be modified if this assumption about ordering is broken. It would all be doable, just quite a lot of work. Alternatively, as quick fix you could try whether qsort is faster than mergesort on average. But I don't have much hope that this helps much. ----------------------------------------------- rewrite parse_tree_entry() to avoid use of memchr and strnlen diff c736b84ab8efb53399d58afe57a2e40c4c7dd1b5 06d4189c5b19b096a27e312083f2650cd81b611f commit - c736b84ab8efb53399d58afe57a2e40c4c7dd1b5 commit + 06d4189c5b19b096a27e312083f2650cd81b611f blob - 28536b3cc7243ce070909cda1331a2b7750713e0 blob + 6def3e6d9efbb9faf9e0dcc0c112a9b104839833 --- lib/object_parse.c +++ lib/object_parse.c @@ -765,36 +765,44 @@ parse_tree_entry(struct got_parsed_tree_entry *pte, si parse_tree_entry(struct got_parsed_tree_entry *pte, size_t *elen, char *buf, size_t maxlen) { - char *p, *space; + char *p; *elen = 0; - *elen = strnlen(buf, maxlen) + 1; - if (*elen > maxlen) + if (maxlen < SHA1_DIGEST_LENGTH + 3) return got_error(GOT_ERR_BAD_OBJ_DATA); - space = memchr(buf, ' ', *elen); - if (space == NULL || space <= buf) - return got_error(GOT_ERR_BAD_OBJ_DATA); - pte->mode = 0; p = buf; - while (p < space) { + while (p - buf < maxlen && *p != '\0' && *p != ' ') { if (*p < '0' || *p > '7') return got_error(GOT_ERR_BAD_OBJ_DATA); pte->mode <<= 3; pte->mode |= *p - '0'; p++; + (*elen)++; } + if (*elen >= maxlen || *p != ' ') + return got_error(GOT_ERR_BAD_OBJ_DATA); + p++; /* skip space */ + (*elen)++; - if (*elen > maxlen || maxlen - *elen < SHA1_DIGEST_LENGTH) + pte->name = p; + while (*p != '\0' && p - buf < maxlen) + p++; + if (p - buf >= maxlen || *p != '\0') return got_error(GOT_ERR_BAD_OBJ_DATA); - pte->name = space + 1; - pte->namelen = strlen(pte->name); - buf += *elen; - pte->id = buf; - *elen += SHA1_DIGEST_LENGTH; + pte->namelen = p - pte->name; + *elen += pte->namelen; + + p++; /* skip \0 */ + (*elen)++; + + if (*elen + SHA1_DIGEST_LENGTH > maxlen) + return got_error(GOT_ERR_BAD_OBJ_DATA); + pte->id = p; + (*elen) += SHA1_DIGEST_LENGTH; return NULL; }