From: Mark Jamsek Subject: Re: merge chokes, creates bogus conflicts To: Christian Weisgerber , gameoftrees@openbsd.org Date: Sun, 19 Feb 2023 21:31:18 +1100 On 23-02-19 10:17AM, Stefan Sperling wrote: > On Sun, Feb 19, 2023 at 06:38:41PM +1100, Mark Jamsek wrote: > > On 23-02-18 04:45PM, Stefan Sperling wrote: > > > On Sun, Feb 19, 2023 at 01:57:23AM +1100, Mark Jamsek wrote: > > - if (blob) { > > - err = got_object_blob_dump_to_file(&sz1, NULL, NULL, > > - f1, blob); > > + diff_chunk_context_load_change(&cc, &nchunks_parsed, r, n, 0); > > + > > + while (ln < cc.right.start) { > > + err = skip_one_line(ondisk_file); > > It is a bit disappointing that we need to keep reading lines from the > file to skip to the start of a hunk, instead of being able to fseek to a > line via a byte offset which corresponds to line number cc.right.start. > For very large files (e.g. in OpenBSD sys/pci/drm) this can be expensive. > > Maybe the diff library does not yet expose enough information? > diff_atoms do have a 'pos' member but it is apparently only filled if the > file is not memory mapped. Should add an API that exposes byte offsets of > all chunks? Or does something like this already exist? I did look to see if we could do that [seek to the first hunk with added lines] and came to the same conclusion; that is, no, we can't do it (unless GOT_DIFF_NO_MMAP is defined) so I think we should add hunk offsets to the result irrespective of whether the files are mmap'd. But I didn't want to mix such changes in with this diff, and then when I discovered the couple bugs I just wanted to push this out quickly, so I instead added it to my todo list with the other diff API change (i.e., reuse diff_result). -- Mark Jamsek GPG: F2FF 13DE 6A06 C471 CA80 E6E2 2930 DC66 86EE CF68