Download raw body.
expose chunk offsets in the diff API
On Mon, Feb 20, 2023 at 12:18:00AM +1100, Mark Jamsek wrote:
> On 23-02-20 12:05AM, Mark Jamsek wrote:
> > As discussed with stsp, if we expose each chunk's offset, we can more
> > efficiently seek to where we need to start parsing lines rather than
> > skipping each line till we get there. The below diff makes this change.
>
> I completely forgot to explain that atom->pos is indeed set whether
> mmap'd or not--it is just that atom->at is _only_ set when mmap'd.
> This is done in the atomize routines in diff_atomize_text.c.
> Perhaps the comment in the diff_main.h header could be more precise?
Indeed. The code is good but the comment is misleading. We should
fix the comment in diff.git and eventually sync over the change.
> > This also requires tightening the check for chunks with only added lines
> > so we don't grab the offset to unchanged lines (i.e., right_count ==
> > left_count). On that note, should we also expose the diff_chunk_type()
> > routine?
I would say yes. Having this in the public diff APi will be useful as
we expose more of the internals to improve caller efficiency.
> > Regress is obviously unaffected by this change.
> > - while (ln < cc.right.start) {
> > - err = skip_one_line(ondisk_file);
The pre-declaration for skip_one_line() which was added in the original
diff could be deleted again.
ok for the diff.
expose chunk offsets in the diff API