Download raw body.
tog: keymaps to navigate to prev/next file/hunk in the diff
On 22-07-27 02:16am, Mark Jamsek wrote: > On 22-07-26 02:24pm, Stefan Sperling wrote: > > Sounds good. The more meta-data we can get from the diff library > > "for free", the better. Callers should ideally not have to munge > > the data again to figure out what is where. > > I agree! I'll start on this tomorrow. > > I had a little time tonight so thought I'd take some measurements first > to get an idea of the impact any changes have on performance, and was > surprised with the results. > > I instrumented create_diff() from main [595228385f], my local ezdiffnav > branch with the changes in the above diff, and version [2bea860227] with > the get_filestream_info() function stsp referenced. I invoked `tog > diff` using the patience algorithm on a commit of 675,402 lines (23MB), > and also diffed the same commit via the log view with `tog log`. > > `tog diff` and the diff loaded via `tog log` from both main and my local > branch are comparable (+/- 0.05), but 2bea860227 takes 2x (tog diff) and > 3x (tog log) as long as the others! > > I still think this change should be done in the lib when building the > diff because we can map each line to its type as we save each line > offset thus we collect more data and the second pass isn't required; but > it really surprised me. I haven't looked at the code yet, and I'm > assuming there have been other changes made to the diff since > 2bea860227--the difference can't solely be attributed to > get_filestream_info()--but I didn't expect main and ezdiffnav to be > roughly the same. I might profile tog tomorrow for a better picture. > > tog log (avg) > main: 5269.42 ms (cpu) 6381.70 ms (wall) > ezdiffnav: 5588.65 ms (cpu) 6799.38 ms (wall) > 2bea860227: 19337.42 ms (cpu) 20214.06 ms (wall) > > tog diff (avg) > main: 5229.16 ms (cpu) 6332.12 ms (wall) > ezdiffnav: 5323.60 ms (cpu) 6506.54 ms (wall) > 2bea860227: 11445.84 ms (cpu) 12231.52 ms (wall) I finally had some time this weekend to get to this :) The below diff adds the diff line index mentioned up thread. This is built while the diff is generated so we don't need a second pass. The change is simple: where we previously saved each line's offset, we now also save each line's type. So we have an array of offsets and types to index. The diff also includes some changes to tog where we can now identify lines to colour in the diff view without regex. I included this here for the sake of demonstrating another benefit of having the line type index besides the hunk/file navigation, but I can separate this now for review if you prefer. If ok this would be a couple separate commits, plus the diff code committed to diff.git and then the sync to got. Regarding performance, below the first diff, I've added a second small diff that can be applied on top of the first (or even on to main to test independently); it's a little ARRAY_LIST optimisation that produces a 3x speedup of large diffs. Previously, we would realloc in fixed chunks of 8 or alloc_blocksize--in the relevant case this was 128--while the proposed change uses the 1.5(allocated) scheme, which results in far fewer allocations and a considerable improvement in diff loading times. For example, when using tog built from main or the branch with the proposed line type change, the largest commit I have loads in ~6 seconds--but with this ARRAY_LIST diff, it loads in ~2. diff refs/heads/main fa40c6ac5abc8ce998ce2ab3f3defa0baa10f264 commit - 86b4b772a2c289053be29f6df2ad411dc853e05a commit + fa40c6ac5abc8ce998ce2ab3f3defa0baa10f264 blob - 52a212802c5d2d319bd43f2fa914eb2f48d10b90 blob + 15d9ab77ce628dc6b1939507cdf0e236943d6b82 --- got/got.c +++ got/got.c @@ -3701,7 +3701,7 @@ diff_trees(struct got_object_id *tree_id1, struct got_ arg.force_text_diff = force_text_diff; arg.diff_algo = GOT_DIFF_ALGORITHM_PATIENCE; arg.outfile = outfile; - arg.line_offsets = NULL; + arg.lines = NULL; arg.nlines = 0; while (path[0] == '/') path++; blob - 4243be1c5479449f6d3623e760d1979bd58682b8 blob + 40bc967971c1183c19c91185e153f01017bd3d00 --- include/got_diff.h +++ include/got_diff.h @@ -20,6 +20,31 @@ enum got_diff_algorithm { }; /* + * List of all line types in a diff (including '{got,tog} log' lines). + * XXX GOT_DIFF_LINE_HUNK to GOT_DIFF_LINE_NONE inclusive must map to the + * DIFF_LINE_* macro counterparts defined in lib/diff_output.h (200-203,255). + */ +enum got_diff_line_type { + GOT_DIFF_LINE_LOGMSG, + GOT_DIFF_LINE_AUTHOR, + GOT_DIFF_LINE_DATE, + GOT_DIFF_LINE_CHANGES, + GOT_DIFF_LINE_META, + GOT_DIFF_LINE_BLOB_MIN, + GOT_DIFF_LINE_BLOB_PLUS, + GOT_DIFF_LINE_HUNK = 200, + GOT_DIFF_LINE_MINUS, + GOT_DIFF_LINE_PLUS, + GOT_DIFF_LINE_CONTEXT, + GOT_DIFF_LINE_NONE = 255 +}; + +struct got_diff_line { + off_t offset; + uint8_t type; +}; + +/* * Compute the differences between two blobs and write unified diff text * to the provided output file. Two open temporary files must be provided * for internal use; these files can be obtained from got_opentemp() and @@ -34,7 +59,7 @@ enum got_diff_algorithm { * If not NULL, the two initial output arguments will be populated with an * array of line offsets for, and the number of lines in, the unidiff text. */ -const struct got_error *got_diff_blob(off_t **, size_t *, +const struct got_error *got_diff_blob(struct got_diff_line **, size_t *, struct got_blob_object *, struct got_blob_object *, FILE *, FILE *, const char *, const char *, enum got_diff_algorithm, int, int, int, FILE *); @@ -84,18 +109,18 @@ struct got_diff_blob_output_unidiff_arg { /* * The number of lines contained in produced unidiff text output, - * and an array of byte offsets to each line. May be initialized to - * zero and NULL to ignore line offsets. If not NULL, then the line - * offsets array will be populated. Optionally, the array can be - * pre-populated with line offsets, with nlines > 0 indicating - * the length of the pre-populated array. This is useful if the - * output file already contains some lines of text. - * The array will be grown as needed to accomodate additional line - * offsets, and the last offset found in a pre-populated array will - * be added to all subsequent offsets. + * and an array of got_diff_lines with byte offset and line type to + * each line. May be initialized to zero and NULL to ignore line + * metadata. If not NULL, then the array of line offsets and types will + * be populated. Optionally, the array can be pre-populated with line + * offsets and types, with nlines > 0 indicating the length of the + * pre-populated array. This is useful if the output file already + * contains some lines of text. The array will be grown as needed to + * accomodate additional offsets and types, and the last offset found + * in a pre-populated array will be added to all subsequent offsets. */ size_t nlines; - off_t *line_offsets; /* Dispose of with free(3) when done. */ + struct got_diff_line *lines; /* Dispose of with free(3) when done. */ }; const struct got_error *got_diff_blob_output_unidiff(void *, struct got_blob_object *, struct got_blob_object *, FILE *, FILE *, @@ -155,10 +180,10 @@ const struct got_error *got_diff_tree_collect_changed_ * If not NULL, the two initial output arguments will be populated with an * array of line offsets for, and the number of lines in, the unidiff text. */ -const struct got_error *got_diff_objects_as_blobs(off_t **, size_t *, - FILE *, FILE *, int, int, struct got_object_id *, struct got_object_id *, - const char *, const char *, enum got_diff_algorithm, int, int, int, - struct got_repository *, FILE *); +const struct got_error *got_diff_objects_as_blobs(struct got_diff_line **, + size_t *, FILE *, FILE *, int, int, struct got_object_id *, + struct got_object_id *, const char *, const char *, enum got_diff_algorithm, + int, int, int, struct got_repository *, FILE *); struct got_pathlist_head; @@ -177,10 +202,11 @@ struct got_pathlist_head; * If not NULL, the two initial output arguments will be populated with an * array of line offsets for, and the number of lines in, the unidiff text. */ -const struct got_error *got_diff_objects_as_trees(off_t **, size_t *, - FILE *, FILE *, int, int, struct got_object_id *, struct got_object_id *, - struct got_pathlist_head *, const char *, const char *, - enum got_diff_algorithm, int, int, int, struct got_repository *, FILE *); +const struct got_error *got_diff_objects_as_trees(struct got_diff_line **, + size_t *, FILE *, FILE *, int, int, struct got_object_id *, + struct got_object_id *, struct got_pathlist_head *, const char *, + const char *, enum got_diff_algorithm, int, int, int, + struct got_repository *, FILE *); /* * Diff two objects, assuming both objects are commits. @@ -194,9 +220,9 @@ const struct got_error *got_diff_objects_as_trees(off_ * If not NULL, the two initial output arguments will be populated with an * array of line offsets for, and the number of lines in, the unidiff text. */ -const struct got_error *got_diff_objects_as_commits(off_t **, size_t *, - FILE *, FILE *, int, int, struct got_object_id *, struct got_object_id *, - struct got_pathlist_head *, enum got_diff_algorithm, int, int, int, - struct got_repository *, FILE *); +const struct got_error *got_diff_objects_as_commits(struct got_diff_line **, + size_t *, FILE *, FILE *, int, int, struct got_object_id *, + struct got_object_id *, struct got_pathlist_head *, enum got_diff_algorithm, + int, int, int, struct got_repository *, FILE *); #define GOT_DIFF_MAX_CONTEXT 64 blob - 7268a714345f26eb33869d80941662fdd7688b13 blob + f671a4f105400e52465bc32e860ae0b0b1e71afd --- lib/diff.c +++ lib/diff.c @@ -39,21 +39,24 @@ #include "got_lib_object.h" static const struct got_error * -add_line_offset(off_t **line_offsets, size_t *nlines, off_t off) +add_line_metadata(struct got_diff_line **lines, size_t *nlines, + off_t off, uint8_t type) { - off_t *p; + struct got_diff_line *p; - p = reallocarray(*line_offsets, *nlines + 1, sizeof(off_t)); + p = reallocarray(*lines, *nlines + 1, sizeof(**lines)); if (p == NULL) return got_error_from_errno("reallocarray"); - *line_offsets = p; - (*line_offsets)[*nlines] = off; + *lines = p; + (*lines)[*nlines].offset = off; + (*lines)[*nlines].type = type; (*nlines)++; + return NULL; } static const struct got_error * -diff_blobs(off_t **line_offsets, size_t *nlines, +diff_blobs(struct got_diff_line **lines, size_t *nlines, struct got_diffreg_result **resultp, struct got_blob_object *blob1, struct got_blob_object *blob2, FILE *f1, FILE *f2, const char *label1, const char *label2, mode_t mode1, mode_t mode2, @@ -69,10 +72,10 @@ diff_blobs(off_t **line_offsets, size_t *nlines, off_t outoff = 0; int n; - if (line_offsets && *line_offsets && *nlines > 0) - outoff = (*line_offsets)[*nlines - 1]; - else if (line_offsets) { - err = add_line_offset(line_offsets, nlines, 0); + if (lines && *lines && nlines > 0) + outoff = (*lines)[*nlines - 1].offset; + else if (lines) { + err = add_line_metadata(lines, nlines, 0, GOT_DIFF_LINE_NONE); if (err) goto done; } @@ -141,8 +144,9 @@ diff_blobs(off_t **line_offsets, size_t *nlines, if (n < 0) goto done; outoff += n; - if (line_offsets) { - err = add_line_offset(line_offsets, nlines, outoff); + if (lines) { + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_BLOB_MIN); if (err) goto done; } @@ -152,8 +156,9 @@ diff_blobs(off_t **line_offsets, size_t *nlines, if (n < 0) goto done; outoff += n; - if (line_offsets) { - err = add_line_offset(line_offsets, nlines, outoff); + if (lines) { + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_BLOB_PLUS); if (err) goto done; } @@ -167,7 +172,7 @@ diff_blobs(off_t **line_offsets, size_t *nlines, goto done; if (outfile) { - err = got_diffreg_output(line_offsets, nlines, result, + err = got_diffreg_output(lines, nlines, result, blob1 != NULL, blob2 != NULL, label1 ? label1 : idstr1, label2 ? label2 : idstr2, @@ -196,19 +201,19 @@ got_diff_blob_output_unidiff(void *arg, struct got_blo { struct got_diff_blob_output_unidiff_arg *a = arg; - return diff_blobs(&a->line_offsets, &a->nlines, NULL, + return diff_blobs(&a->lines, &a->nlines, NULL, blob1, blob2, f1, f2, label1, label2, mode1, mode2, a->diff_context, a->ignore_whitespace, a->force_text_diff, a->outfile, a->diff_algo); } const struct got_error * -got_diff_blob(off_t **line_offsets, size_t *nlines, +got_diff_blob(struct got_diff_line **lines, size_t*nlines, struct got_blob_object *blob1, struct got_blob_object *blob2, FILE *f1, FILE *f2, const char *label1, const char *label2, enum got_diff_algorithm diff_algo, int diff_context, int ignore_whitespace, int force_text_diff, FILE *outfile) { - return diff_blobs(line_offsets, nlines, NULL, blob1, blob2, f1, f2, + return diff_blobs(lines, nlines, NULL, blob1, blob2, f1, f2, label1, label2, 0, 0, diff_context, ignore_whitespace, force_text_diff, outfile, diff_algo); } @@ -726,7 +731,7 @@ got_diff_tree(struct got_tree_object *tree1, struct go } const struct got_error * -got_diff_objects_as_blobs(off_t **line_offsets, size_t *nlines, +got_diff_objects_as_blobs(struct got_diff_line **lines, size_t *nlines, FILE *f1, FILE *f2, int fd1, int fd2, struct got_object_id *id1, struct got_object_id *id2, const char *label1, const char *label2, @@ -750,9 +755,9 @@ got_diff_objects_as_blobs(off_t **line_offsets, size_t if (err) goto done; } - err = got_diff_blob(line_offsets, nlines, blob1, blob2, f1, f2, - label1, label2, diff_algo, diff_context, ignore_whitespace, - force_text_diff, outfile); + err = got_diff_blob(lines, nlines, blob1, blob2, f1, f2, label1, label2, + diff_algo, diff_context, ignore_whitespace, force_text_diff, + outfile); done: if (blob1) got_object_blob_close(blob1); @@ -890,24 +895,26 @@ done: } static const struct got_error * -show_object_id(off_t **line_offsets, size_t *nlines, const char *obj_typestr, - int ch, const char *id_str, FILE *outfile) +show_object_id(struct got_diff_line **lines, size_t *nlines, + const char *obj_typestr, int ch, const char *id_str, FILE *outfile) { const struct got_error *err; int n; off_t outoff = 0; n = fprintf(outfile, "%s %c %s\n", obj_typestr, ch, id_str); - if (line_offsets != NULL && *line_offsets != NULL) { + if (lines != NULL && *lines != NULL) { if (*nlines == 0) { - err = add_line_offset(line_offsets, nlines, 0); + err = add_line_metadata(lines, nlines, 0, + GOT_DIFF_LINE_META); if (err) return err; } else - outoff = (*line_offsets)[*nlines - 1]; + outoff = (*lines)[*nlines - 1].offset; outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_META); if (err) return err; } @@ -916,7 +923,7 @@ show_object_id(off_t **line_offsets, size_t *nlines, c } static const struct got_error * -diff_objects_as_trees(off_t **line_offsets, size_t *nlines, +diff_objects_as_trees(struct got_diff_line **lines, size_t *nlines, FILE *f1, FILE *f2, int fd1, int fd2, struct got_object_id *id1, struct got_object_id *id2, struct got_pathlist_head *paths, const char *label1, const char *label2, @@ -927,7 +934,7 @@ diff_objects_as_trees(off_t **line_offsets, size_t *nl const struct got_error *err; struct got_tree_object *tree1 = NULL, *tree2 = NULL; struct got_diff_blob_output_unidiff_arg arg; - int want_lineoffsets = (line_offsets != NULL && *line_offsets != NULL); + int want_linemeta = (lines != NULL && *lines != NULL); if (id1 == NULL && id2 == NULL) return got_error(GOT_ERR_NO_OBJ); @@ -948,11 +955,11 @@ diff_objects_as_trees(off_t **line_offsets, size_t *nl arg.ignore_whitespace = ignore_whitespace; arg.force_text_diff = force_text_diff; arg.outfile = outfile; - if (want_lineoffsets) { - arg.line_offsets = *line_offsets; + if (want_linemeta) { + arg.lines = *lines; arg.nlines = *nlines; } else { - arg.line_offsets = NULL; + arg.lines = NULL; arg.nlines = 0; } if (paths == NULL || TAILQ_EMPTY(paths)) { @@ -963,8 +970,8 @@ diff_objects_as_trees(off_t **line_offsets, size_t *nl err = diff_paths(tree1, tree2, f1, f2, fd1, fd2, paths, repo, got_diff_blob_output_unidiff, &arg); } - if (want_lineoffsets) { - *line_offsets = arg.line_offsets; /* was likely re-allocated */ + if (want_linemeta) { + *lines = arg.lines; /* was likely re-allocated */ *nlines = arg.nlines; } done: @@ -976,7 +983,7 @@ done: } const struct got_error * -got_diff_objects_as_trees(off_t **line_offsets, size_t *nlines, +got_diff_objects_as_trees(struct got_diff_line **lines, size_t *nlines, FILE *f1, FILE *f2, int fd1, int fd2, struct got_object_id *id1, struct got_object_id *id2, struct got_pathlist_head *paths, const char *label1, const char *label2, @@ -993,15 +1000,14 @@ got_diff_objects_as_trees(off_t **line_offsets, size_t err = got_object_id_str(&idstr, id1); if (err) goto done; - err = show_object_id(line_offsets, nlines, "tree", '-', - idstr, outfile); + err = show_object_id(lines, nlines, "tree", '-', idstr, outfile); if (err) goto done; free(idstr); idstr = NULL; } else { - err = show_object_id(line_offsets, nlines, "tree", '-', - "/dev/null", outfile); + err = show_object_id(lines, nlines, "tree", '-', "/dev/null", + outfile); if (err) goto done; } @@ -1010,21 +1016,20 @@ got_diff_objects_as_trees(off_t **line_offsets, size_t err = got_object_id_str(&idstr, id2); if (err) goto done; - err = show_object_id(line_offsets, nlines, "tree", '+', - idstr, outfile); + err = show_object_id(lines, nlines, "tree", '+', idstr, outfile); if (err) goto done; free(idstr); idstr = NULL; } else { - err = show_object_id(line_offsets, nlines, "tree", '+', - "/dev/null", outfile); + err = show_object_id(lines, nlines, "tree", '+', "/dev/null", + outfile); if (err) goto done; } - err = diff_objects_as_trees(line_offsets, nlines, f1, f2, fd1, fd2, - id1, id2, paths, label1, label2, diff_context, ignore_whitespace, + err = diff_objects_as_trees(lines, nlines, f1, f2, fd1, fd2, id1, id2, + paths, label1, label2, diff_context, ignore_whitespace, force_text_diff, repo, outfile, diff_algo); done: free(idstr); @@ -1032,7 +1037,7 @@ done: } const struct got_error * -got_diff_objects_as_commits(off_t **line_offsets, size_t *nlines, +got_diff_objects_as_commits(struct got_diff_line **lines, size_t *nlines, FILE *f1, FILE *f2, int fd1, int fd2, struct got_object_id *id1, struct got_object_id *id2, struct got_pathlist_head *paths, enum got_diff_algorithm diff_algo, @@ -1053,15 +1058,15 @@ got_diff_objects_as_commits(off_t **line_offsets, size err = got_object_id_str(&idstr, id1); if (err) goto done; - err = show_object_id(line_offsets, nlines, "commit", '-', - idstr, outfile); + err = show_object_id(lines, nlines, "commit", '-', idstr, + outfile); if (err) goto done; free(idstr); idstr = NULL; } else { - err = show_object_id(line_offsets, nlines, "commit", '-', - "/dev/null", outfile); + err = show_object_id(lines, nlines, "commit", '-', "/dev/null", + outfile); if (err) goto done; } @@ -1073,12 +1078,11 @@ got_diff_objects_as_commits(off_t **line_offsets, size err = got_object_id_str(&idstr, id2); if (err) goto done; - err = show_object_id(line_offsets, nlines, "commit", '+', - idstr, outfile); + err = show_object_id(lines, nlines, "commit", '+', idstr, outfile); if (err) goto done; - err = diff_objects_as_trees(line_offsets, nlines, f1, f2, fd1, fd2, + err = diff_objects_as_trees(lines, nlines, f1, f2, fd1, fd2, commit1 ? got_object_commit_get_tree_id(commit1) : NULL, got_object_commit_get_tree_id(commit2), paths, "", "", diff_context, ignore_whitespace, force_text_diff, repo, outfile, blob - e44ac0a4e3f7fe9caf5b350a89c7e5c5beba00be blob + f9f748fd298e5716a0fbfa2a0ff0277abc7a331e --- lib/diff_output.c +++ lib/diff_output.c @@ -64,6 +64,7 @@ diff_output_lines(struct diff_output_info *outinfo, FI { struct diff_atom *atom; off_t outoff = 0, *offp; + uint8_t *typep; int rc; if (outinfo && outinfo->line_offsets.len > 0) { @@ -122,6 +123,12 @@ diff_output_lines(struct diff_output_info *outinfo, FI return ENOMEM; outoff += outlen; *offp = outoff; + ARRAYLIST_ADD(typep, outinfo->line_types); + if (typep == NULL) + return ENOMEM; + *typep = *prefix == ' ' ? DIFF_LINE_CONTEXT : + *prefix == '-' ? DIFF_LINE_MINUS : + *prefix == '+' ? DIFF_LINE_PLUS : DIFF_LINE_NONE; } } @@ -207,7 +214,9 @@ diff_output_trailing_newline_msg(struct diff_output_in unsigned int atom_count; int rc, ch; off_t outoff = 0, *offp; + uint8_t *typep; + if (chunk_type == CHUNK_MINUS || chunk_type == CHUNK_SAME) { start_atom = c->left_start; atom_count = c->left_count; @@ -240,6 +249,10 @@ diff_output_trailing_newline_msg(struct diff_output_in return ENOMEM; outoff += rc; *offp = outoff; + ARRAYLIST_ADD(typep, outinfo->line_types); + if (typep == NULL) + return ENOMEM; + *typep = DIFF_LINE_NONE; } } @@ -316,6 +329,7 @@ diff_output_info_alloc(void) { struct diff_output_info *output_info; off_t *offp; + uint8_t *typep; output_info = malloc(sizeof(*output_info)); if (output_info != NULL) { @@ -326,6 +340,13 @@ diff_output_info_alloc(void) return NULL; } *offp = 0; + ARRAYLIST_INIT(output_info->line_types, 128); + ARRAYLIST_ADD(typep, output_info->line_types); + if (typep == NULL) { + diff_output_info_free(output_info); + return NULL; + } + *typep = DIFF_LINE_NONE; } return output_info; } @@ -334,6 +355,7 @@ void diff_output_info_free(struct diff_output_info *output_info) { ARRAYLIST_FREE(output_info->line_offsets); + ARRAYLIST_FREE(output_info->line_types); free(output_info); } blob - 6cd959e4159e40e5fcc467e4f69c7318fd54d3d5 blob + 3da0da5e6cb2063bdc8dfeb0b688ad087b1c9053 --- lib/diff_output.h +++ lib/diff_output.h @@ -32,6 +32,16 @@ struct diff_output_info { * The last offset in this array corresponds to end-of-file. */ ARRAYLIST(off_t) line_offsets; + /* + * Type (i.e., context, minus, plus) of each line generated by the diff. + * nb. 0x00 to 0xc7 reserved for client-defined line types. + */ + ARRAYLIST(uint8_t) line_types; +#define DIFF_LINE_HUNK 0xc8 +#define DIFF_LINE_MINUS 0xc9 +#define DIFF_LINE_PLUS 0xca +#define DIFF_LINE_CONTEXT 0xcb +#define DIFF_LINE_NONE 0xff /* binary or no EOF newline msg, etc. */ }; void diff_output_info_free(struct diff_output_info *output_info); blob - f3f4ad74c807d77b28b7b0af68f0d0ef2435d5c1 blob + 18bff74063081228c1c525cce2b35ad26a527bf8 --- lib/diff_output_unidiff.c +++ lib/diff_output_unidiff.c @@ -229,6 +229,7 @@ output_unidiff_chunk(struct diff_output_info *outinfo, { int rc, left_start, left_len, right_start, right_len; off_t outoff = 0, *offp; + uint8_t *typep; if (diff_range_empty(&cc->left) && diff_range_empty(&cc->right)) return DIFF_RC_OK; @@ -249,7 +250,10 @@ output_unidiff_chunk(struct diff_output_info *outinfo, return ENOMEM; outoff += rc; *offp = outoff; - + ARRAYLIST_ADD(typep, outinfo->line_types); + if (typep == NULL) + return ENOMEM; + *typep = DIFF_LINE_MINUS; } rc = fprintf(dest, "+++ %s\n", diff_output_get_label_right(info)); @@ -261,7 +265,10 @@ output_unidiff_chunk(struct diff_output_info *outinfo, return ENOMEM; outoff += rc; *offp = outoff; - + ARRAYLIST_ADD(typep, outinfo->line_types); + if (typep == NULL) + return ENOMEM; + *typep = DIFF_LINE_PLUS; } state->header_printed = true; } @@ -319,7 +326,10 @@ output_unidiff_chunk(struct diff_output_info *outinfo, return ENOMEM; outoff += rc; *offp = outoff; - + ARRAYLIST_ADD(typep, outinfo->line_types); + if (typep == NULL) + return ENOMEM; + *typep = DIFF_LINE_HUNK; } /* Got the absolute line numbers where to start printing, and the index @@ -426,6 +436,7 @@ diff_output_unidiff(struct diff_output_info **output_i bool force_text = (flags & DIFF_FLAG_FORCE_TEXT_DATA); bool have_binary = (atomizer_flags & DIFF_ATOMIZER_FOUND_BINARY_DATA); off_t outoff = 0, *offp; + uint8_t *typep; int rc, i; if (!result) @@ -463,7 +474,10 @@ diff_output_unidiff(struct diff_output_info **output_i return ENOMEM; outoff += rc; *offp = outoff; - + ARRAYLIST_ADD(typep, outinfo->line_types); + if (typep == NULL) + return ENOMEM; + *typep = DIFF_LINE_NONE; } break; } blob - 30c7388ad74a1c5e2b958cf79a9f5c1790de245d blob + ab75c9fa52c19c7f3190c186c3002f46ee627c7e --- lib/diffreg.c +++ lib/diffreg.c @@ -251,7 +251,7 @@ done: } const struct got_error * -got_diffreg_output(off_t **line_offsets, size_t *nlines, +got_diffreg_output(struct got_diff_line **lines, size_t *nlines, struct got_diffreg_result *diff_result, int f1_exists, int f2_exists, const char *path1, const char *path2, enum got_diff_output_format output_format, int context_lines, FILE *outfile) @@ -272,13 +272,13 @@ got_diffreg_output(off_t **line_offsets, size_t *nline switch (output_format) { case GOT_DIFF_OUTPUT_UNIDIFF: rc = diff_output_unidiff( - line_offsets ? &output_info : NULL, outfile, &info, + lines ? &output_info : NULL, outfile, &info, diff_result->result, context_lines); if (rc != DIFF_RC_OK) return got_error_set_errno(rc, "diff_output_unidiff"); break; case GOT_DIFF_OUTPUT_EDSCRIPT: - rc = diff_output_edscript(line_offsets ? &output_info : NULL, + rc = diff_output_edscript(lines ? &output_info : NULL, outfile, &info, diff_result->result); if (rc != DIFF_RC_OK) return got_error_set_errno(rc, "diff_output_edscript"); @@ -286,29 +286,34 @@ got_diffreg_output(off_t **line_offsets, size_t *nline } - if (line_offsets && *line_offsets) { + if (lines && *lines) { if (output_info->line_offsets.len > 0) { - off_t prev_offset = 0, *p, *o; + struct got_diff_line *p; + off_t prev_offset = 0, *o; + uint8_t *o2; int i, len; if (*nlines > 0) { - prev_offset = (*line_offsets)[*nlines - 1]; + prev_offset = (*lines)[*nlines - 1].offset; /* * First line offset is always zero. Skip it * when appending to a pre-populated array. */ o = &output_info->line_offsets.head[1]; + o2 = &output_info->line_types.head[1]; len = output_info->line_offsets.len - 1; } else { o = &output_info->line_offsets.head[0]; + o2 = &output_info->line_types.head[0]; len = output_info->line_offsets.len; } - p = reallocarray(*line_offsets, *nlines + len, - sizeof(off_t)); + p = reallocarray(*lines, *nlines + len, sizeof(**lines)); if (p == NULL) return got_error_from_errno("calloc"); - for (i = 0; i < len; i++) - p[*nlines + i] = o[i] + prev_offset; - *line_offsets = p; + for (i = 0; i < len; i++) { + p[*nlines + i].offset = o[i] + prev_offset; + p[*nlines + i].type = o2[i]; + } + *lines = p; *nlines += len; } diff_output_info_free(output_info); blob - 3ff8d96891722f9a89a5245a2fb9b5e5c8424e83 blob + 43c7096d0a3fe2bb611660d822341235f3223a17 --- lib/got_lib_diff.h +++ lib/got_lib_diff.h @@ -44,7 +44,7 @@ const struct got_error *got_diff_prepare_file(FILE *, struct diff_data *, const struct diff_config *, int, int); const struct got_error *got_diffreg(struct got_diffreg_result **, FILE *, FILE *, enum got_diff_algorithm, int, int); -const struct got_error *got_diffreg_output(off_t **, size_t *, +const struct got_error *got_diffreg_output(struct got_diff_line **, size_t *, struct got_diffreg_result *, int, int, const char *, const char *, enum got_diff_output_format, int, FILE *); const struct got_error *got_diffreg_result_free(struct got_diffreg_result *); blob - c09dbe11bb6e0224ed0138a35da047947c82f41d blob + 21f349fab0529b88773681976e3bb44121f1de09 --- tog/tog.1 +++ tog/tog.1 @@ -290,6 +290,14 @@ Scroll up N half pages (default: 1). Scroll to the top of the view. .It Cm End, G Scroll to the bottom of the view. +.It Cm \&( +Navigate to the Nth previous file in the diff (default: 1). +.It Cm \&) +Navigate to the Nth next file in the diff (default: 1). +.It Cm \&{ +Navigate to the Nth previous hunk in the diff (default: 1). +.It Cm \&} +Navigate to the Nth next hunk in the diff (default: 1). .It Cm \&[ Reduce diff context by N lines (default: 1). .It Cm \&] blob - 58990f76b31337bb2ba1c56ebcce5b3b13a3cb78 blob + af073ba6a6657ef99e64a176574997926ff63e2e --- tog/tog.c +++ tog/tog.c @@ -313,12 +313,12 @@ get_color_value(const char *envvar) return default_color_value(envvar); } - struct tog_diff_view_state { struct got_object_id *id1, *id2; const char *label1, *label2; FILE *f, *f1, *f2; int fd1, fd2; + int lineno; int first_displayed_line; int last_displayed_line; int eof; @@ -326,9 +326,8 @@ struct tog_diff_view_state { int ignore_whitespace; int force_text_diff; struct got_repository *repo; - struct tog_colors colors; + struct got_diff_line *lines; size_t nlines; - off_t *line_offsets; int matched_line; int selected_line; @@ -3849,14 +3848,15 @@ draw_file(struct tog_view *view, const char *header) char *line; size_t linesize = 0; ssize_t linelen; - struct tog_color *tc; wchar_t *wline; int width; int max_lines = view->nlines; int nlines = s->nlines; off_t line_offset; + enum got_diff_line_type linetype; - line_offset = s->line_offsets[s->first_displayed_line - 1]; + s->lineno = s->first_displayed_line - 1; + line_offset = s->lines[s->first_displayed_line - 1].offset; if (fseeko(s->f, line_offset, SEEK_SET) == -1) return got_error_from_errno("fseek"); @@ -3892,6 +3892,8 @@ draw_file(struct tog_view *view, const char *header) view->maxx = 0; line = NULL; while (max_lines > 0 && nprinted < max_lines) { + attr_t attr = 0; + linelen = getline(&line, &linesize, s->f); if (linelen == -1) { if (feof(s->f)) { @@ -3902,6 +3904,8 @@ draw_file(struct tog_view *view, const char *header) return got_ferror(s->f, GOT_ERR_IO); } + if (++s->lineno < s->first_displayed_line) + continue; /* Set view->maxx based on full line length. */ err = format_line(&wline, &width, NULL, line, 0, INT_MAX, 0, view->x ? 1 : 0); @@ -3913,10 +3917,12 @@ draw_file(struct tog_view *view, const char *header) free(wline); wline = NULL; - tc = match_color(&s->colors, line); - if (tc) - wattr_on(view->window, - COLOR_PAIR(tc->colorpair), NULL); + linetype = s->lines[s->lineno].type; + if (linetype > GOT_DIFF_LINE_LOGMSG && + linetype < GOT_DIFF_LINE_CONTEXT) + attr = COLOR_PAIR(linetype); + if (attr) + wattron(view->window, attr); if (s->first_displayed_line + nprinted == s->matched_line && regmatch->rm_so >= 0 && regmatch->rm_so < regmatch->rm_eo) { err = add_matched_line(&width, line, view->ncols, 0, @@ -3937,9 +3943,8 @@ draw_file(struct tog_view *view, const char *header) free(wline); wline = NULL; } - if (tc) - wattr_off(view->window, - COLOR_PAIR(tc->colorpair), NULL); + if (attr) + wattroff(view->window, attr); if (width <= view->ncols - 1) waddch(view->window, '\n'); nprinted++; @@ -4043,21 +4048,24 @@ done: } static const struct got_error * -add_line_offset(off_t **line_offsets, size_t *nlines, off_t off) +add_line_metadata(struct got_diff_line **lines, size_t *nlines, + off_t off, uint8_t type) { - off_t *p; + struct got_diff_line *p; - p = reallocarray(*line_offsets, *nlines + 1, sizeof(off_t)); + p = reallocarray(*lines, *nlines + 1, sizeof(**lines)); if (p == NULL) return got_error_from_errno("reallocarray"); - *line_offsets = p; - (*line_offsets)[*nlines] = off; + *lines = p; + (*lines)[*nlines].offset = off; + (*lines)[*nlines].type = type; (*nlines)++; + return NULL; } static const struct got_error * -write_commit_info(off_t **line_offsets, size_t *nlines, +write_commit_info(struct got_diff_line **lines, size_t *nlines, struct got_object_id *commit_id, struct got_reflist_head *refs, struct got_repository *repo, FILE *outfile) { @@ -4091,7 +4099,7 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } - err = add_line_offset(line_offsets, nlines, 0); + err = add_line_metadata(lines, nlines, 0, GOT_DIFF_LINE_NONE); if (err) goto done; @@ -4102,7 +4110,7 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, GOT_DIFF_LINE_META); if (err) goto done; @@ -4113,7 +4121,7 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, GOT_DIFF_LINE_AUTHOR); if (err) goto done; @@ -4126,7 +4134,8 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_DATE); if (err) goto done; } @@ -4139,7 +4148,8 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_AUTHOR); if (err) goto done; } @@ -4158,7 +4168,8 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_META); if (err) goto done; free(id_str); @@ -4177,7 +4188,8 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_LOGMSG); if (err) goto done; } @@ -4193,7 +4205,8 @@ write_commit_info(off_t **line_offsets, size_t *nlines goto done; } outoff += n; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, + GOT_DIFF_LINE_CHANGES); if (err) goto done; free((char *)pe->path); @@ -4202,7 +4215,7 @@ write_commit_info(off_t **line_offsets, size_t *nlines fputc('\n', outfile); outoff++; - err = add_line_offset(line_offsets, nlines, outoff); + err = add_line_metadata(lines, nlines, outoff, GOT_DIFF_LINE_NONE); done: got_pathlist_free(&changed_paths); free(id_str); @@ -4210,8 +4223,8 @@ done: free(refs_str); got_object_commit_close(commit); if (err) { - free(*line_offsets); - *line_offsets = NULL; + free(*lines); + *lines = NULL; *nlines = 0; } return err; @@ -4224,9 +4237,9 @@ create_diff(struct tog_diff_view_state *s) FILE *f = NULL; int obj_type; - free(s->line_offsets); - s->line_offsets = malloc(sizeof(off_t)); - if (s->line_offsets == NULL) + free(s->lines); + s->lines = malloc(sizeof(*s->lines)); + if (s->lines == NULL) return got_error_from_errno("malloc"); s->nlines = 0; @@ -4250,13 +4263,13 @@ create_diff(struct tog_diff_view_state *s) switch (obj_type) { case GOT_OBJ_TYPE_BLOB: - err = got_diff_objects_as_blobs(&s->line_offsets, &s->nlines, + err = got_diff_objects_as_blobs(&s->lines, &s->nlines, s->f1, s->f2, s->fd1, s->fd2, s->id1, s->id2, s->label1, s->label2, tog_diff_algo, s->diff_context, s->ignore_whitespace, s->force_text_diff, s->repo, s->f); break; case GOT_OBJ_TYPE_TREE: - err = got_diff_objects_as_trees(&s->line_offsets, &s->nlines, + err = got_diff_objects_as_trees(&s->lines, &s->nlines, s->f1, s->f2, s->fd1, s->fd2, s->id1, s->id2, NULL, "", "", tog_diff_algo, s->diff_context, s->ignore_whitespace, s->force_text_diff, s->repo, s->f); @@ -4273,17 +4286,17 @@ create_diff(struct tog_diff_view_state *s) refs = got_reflist_object_id_map_lookup(tog_refs_idmap, s->id2); /* Show commit info if we're diffing to a parent/root commit. */ if (s->id1 == NULL) { - err = write_commit_info(&s->line_offsets, &s->nlines, - s->id2, refs, s->repo, s->f); + err = write_commit_info(&s->lines, &s->nlines, s->id2, + refs, s->repo, s->f); if (err) goto done; } else { parent_ids = got_object_commit_get_parent_ids(commit2); STAILQ_FOREACH(pid, parent_ids, entry) { if (got_object_id_cmp(s->id1, &pid->id) == 0) { - err = write_commit_info( - &s->line_offsets, &s->nlines, - s->id2, refs, s->repo, s->f); + err = write_commit_info(&s->lines, + &s->nlines, s->id2, refs, s->repo, + s->f); if (err) goto done; break; @@ -4292,7 +4305,7 @@ create_diff(struct tog_diff_view_state *s) } got_object_commit_close(commit2); - err = got_diff_objects_as_commits(&s->line_offsets, &s->nlines, + err = got_diff_objects_as_commits(&s->lines, &s->nlines, s->f1, s->f2, s->fd1, s->fd2, s->id1, s->id2, NULL, tog_diff_algo, s->diff_context, s->ignore_whitespace, s->force_text_diff, s->repo, s->f); @@ -4302,8 +4315,6 @@ create_diff(struct tog_diff_view_state *s) err = got_error(GOT_ERR_OBJ_TYPE); break; } - if (err) - goto done; done: if (s->f && fflush(s->f) != 0 && err == NULL) err = got_error_from_errno("fflush"); @@ -4365,7 +4376,7 @@ search_next_diff_view(struct tog_view *view) lineno = s->nlines; } - offset = s->line_offsets[lineno - 1]; + offset = s->lines[lineno - 1].offset; if (fseeko(s->f, offset, SEEK_SET) != 0) { free(line); return got_error_from_errno("fseeko"); @@ -4425,9 +4436,8 @@ close_diff_view(struct tog_view *view) if (s->fd2 != -1 && close(s->fd2) == -1 && err == NULL) err = got_error_from_errno("close"); s->fd2 = -1; - free_colors(&s->colors); - free(s->line_offsets); - s->line_offsets = NULL; + free(s->lines); + s->lines = NULL; s->nlines = 0; return err; } @@ -4511,43 +4521,25 @@ open_diff_view(struct tog_view *view, struct got_objec s->parent_view = parent_view; s->repo = repo; - STAILQ_INIT(&s->colors); if (has_colors() && getenv("TOG_COLORS") != NULL) { - err = add_color(&s->colors, - "^-", TOG_COLOR_DIFF_MINUS, - get_color_value("TOG_COLOR_DIFF_MINUS")); - if (err) - goto done; - err = add_color(&s->colors, "^\\+", - TOG_COLOR_DIFF_PLUS, - get_color_value("TOG_COLOR_DIFF_PLUS")); - if (err) - goto done; - err = add_color(&s->colors, - "^@@", TOG_COLOR_DIFF_CHUNK_HEADER, - get_color_value("TOG_COLOR_DIFF_CHUNK_HEADER")); - if (err) - goto done; - - err = add_color(&s->colors, - "^(commit [0-9a-f]|parent [0-9]|" - "(blob|file|tree|commit) [-+] |" - "[MDmA] [^ ])", TOG_COLOR_DIFF_META, - get_color_value("TOG_COLOR_DIFF_META")); - if (err) - goto done; - - err = add_color(&s->colors, - "^(from|via): ", TOG_COLOR_AUTHOR, - get_color_value("TOG_COLOR_AUTHOR")); - if (err) - goto done; - - err = add_color(&s->colors, - "^date: ", TOG_COLOR_DATE, - get_color_value("TOG_COLOR_DATE")); - if (err) - goto done; + init_pair(GOT_DIFF_LINE_MINUS, + get_color_value("TOG_COLOR_DIFF_MINUS"), -1); + init_pair(GOT_DIFF_LINE_PLUS, + get_color_value("TOG_COLOR_DIFF_PLUS"), -1); + init_pair(GOT_DIFF_LINE_HUNK, + get_color_value("TOG_COLOR_DIFF_CHUNK_HEADER"), -1); + init_pair(GOT_DIFF_LINE_META, + get_color_value("TOG_COLOR_DIFF_META"), -1); + init_pair(GOT_DIFF_LINE_CHANGES, + get_color_value("TOG_COLOR_DIFF_META"), -1); + init_pair(GOT_DIFF_LINE_BLOB_MIN, + get_color_value("TOG_COLOR_DIFF_META"), -1); + init_pair(GOT_DIFF_LINE_BLOB_PLUS, + get_color_value("TOG_COLOR_DIFF_META"), -1); + init_pair(GOT_DIFF_LINE_AUTHOR, + get_color_value("TOG_COLOR_AUTHOR"), -1); + init_pair(GOT_DIFF_LINE_DATE, + get_color_value("TOG_COLOR_DATE"), -1); } if (parent_view && parent_view->type == TOG_VIEW_LOG && @@ -4643,6 +4635,36 @@ reset_diff_view(struct tog_view *view) return create_diff(s); } +static void +diff_prev_index(struct tog_diff_view_state *s, enum got_diff_line_type type) +{ + int i = s->first_displayed_line - 1; + + while (s->lines[i].type != type) { + if (i == 0) + i = s->nlines - 1; + --i; + } + + s->selected_line = 1; + s->first_displayed_line = i; +} + +static void +diff_next_index(struct tog_diff_view_state *s, enum got_diff_line_type type) +{ + int i = s->first_displayed_line + 1; + + while (s->lines[i].type != type) { + if (i == s->nlines - 1) + i = 0; + ++i; + } + + s->selected_line = 1; + s->first_displayed_line = i; +} + static struct got_object_id *get_selected_commit_id(struct tog_blame_line *, int, int, int); static struct got_object_id *get_annotation_for_line(struct tog_blame_line *, @@ -4660,6 +4682,8 @@ input_diff_view(struct tog_view **new_view, struct tog ssize_t linelen; int i, nscroll = view->nlines - 1, up = 0; + s->lineno = s->first_displayed_line - 1 + s->selected_line; + switch (ch) { case '0': view->x = 0; @@ -4760,6 +4784,18 @@ input_diff_view(struct tog_view **new_view, struct tog } free(line); break; + case '(': + diff_prev_index(s, GOT_DIFF_LINE_BLOB_MIN); + break; + case ')': + diff_next_index(s, GOT_DIFF_LINE_BLOB_MIN); + break; + case '{': + diff_prev_index(s, GOT_DIFF_LINE_HUNK); + break; + case '}': + diff_next_index(s, GOT_DIFF_LINE_HUNK); + break; case '[': if (s->diff_context > 0) { s->diff_context--; ----------------------------------------------- commit 97249a619ed52d94faed9ac1e48448bd8cfd50bc (dev/dlines) from: Mark Jamsek <mark@jamsek.dev> date: Sun Jul 31 12:13:46 2022 UTC ARRAY_LIST allocation optimisation (3x speedup on large diffs) diff fa40c6ac5abc8ce998ce2ab3f3defa0baa10f264 97249a619ed52d94faed9ac1e48448bd8cfd50bc commit - fa40c6ac5abc8ce998ce2ab3f3defa0baa10f264 commit + 97249a619ed52d94faed9ac1e48448bd8cfd50bc blob - 112da0c2e234000d38b2dea5c3b1112fa045b7a0 blob + 8b503d2f97ddd5ce74010697dc9c40b1e8a4b7af --- lib/arraylist.h +++ lib/arraylist.h @@ -63,14 +63,18 @@ (ARRAY_LIST).p = recallocarray((ARRAY_LIST).head, \ (ARRAY_LIST).len, \ (ARRAY_LIST).allocated + \ - ((ARRAY_LIST).alloc_blocksize ? : 8), \ + ((ARRAY_LIST).allocated ? \ + (ARRAY_LIST).allocated / 2 : \ + (ARRAY_LIST).alloc_blocksize ? : 8), \ sizeof(*(ARRAY_LIST).head)); \ if ((ARRAY_LIST).p == NULL) { \ NEW_ITEM_P = NULL; \ break; \ } \ (ARRAY_LIST).allocated += \ - (ARRAY_LIST).alloc_blocksize ? : 8; \ + (ARRAY_LIST).allocated ? \ + (ARRAY_LIST).allocated / 2 : \ + (ARRAY_LIST).alloc_blocksize ? : 8, \ (ARRAY_LIST).head = (ARRAY_LIST).p; \ (ARRAY_LIST).p = NULL; \ }; \ -- Mark Jamsek <fnc.bsdbox.org> GPG: F2FF 13DE 6A06 C471 CA80 E6E2 2930 DC66 86EE CF68
tog: keymaps to navigate to prev/next file/hunk in the diff