From: Mark Jamsek Subject: Re: got rebase or diff library bug To: Mark Jamsek Cc: Christian Weisgerber , gameoftrees@openbsd.org Date: Mon, 05 Aug 2024 01:53:22 +1000 Mark Jamsek wrote: > Christian Weisgerber wrote: > > Weird things happen when you commit a change that adds a line without > > terminating '\n' at the end of a file, and then rebase or histedit > > that change. > > > > $ cd src # OpenBSD src checkout > > $ got br -c master dummy > > Switching work tree from refs/heads/master to refs/heads/dummy > > $ tar xzvf ~/llvm18.tar.gz '*/GCNRegPressure.cpp' > > gnu/llvm/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp > > $ hexdump -C gnu/llvm/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp | tail -3 > > 000053d0 20 66 61 6c 73 65 3b 0a 0a 23 75 6e 64 65 66 20 | false;..#undef | > > 000053e0 50 46 58 0a 7d |PFX.}| > > 000053e5 > > $ got ci -m whatever > > M gnu/llvm/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp > > Created commit 28d598d2212490682514aca54d70befb8e86e176 > > $ got up -c master > > U gnu/llvm/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp > > Updated to refs/heads/dummy: 141c93e2a20bfc66591c104e112b9ad964e23e22 > > $ got he > > [... pick commit...] > > $ hexdump -C gnu/llvm/llvm/lib/Target/AMDGPU/GCNRegPressure.cpp | tail -3 > > 000053e0 6e 20 66 61 6c 73 65 3b 0a 0a 23 75 6e 64 65 66 |n false;..#undef| > > 000053f0 20 50 46 58 0a 7d 2e 0a 34 37 31 63 0a | PFX.}..471c.| > > 000053fd > > > > In this case, ".\n471c\n" is reproducibly appended. Some sort of > > internal leakage from the diff library? > > > > > > I just tried a simpler case in a dummy repository with a file "hello" > > > > hello > > world > > > > then "echo -n fubar >>hello", and after commit and histedit... the > > file is empty. > > > > Commit diff before, from the histedit backup: > > > > --- hello > > +++ hello > > @@ -1,2 +1,3 @@ > > hello > > world > > +fubar > > \ No newline at end of file > > > > Commit diff after histedit: > > > > --- hello > > +++ hello > > @@ -1,2 +0,0 @@ > > -hello > > -world > > > > Ugh. > > This looks to be caused by how we are currently mishandling malformed > text files (i.e., that aren't terminated with a newline) when generating > our ed script for the 3-way merge. We don't propagate the error either > so the failure goes unreported. > > I don't think we want to do what ed(1) does and add the terminating > newline character to text files while leaving binary files unchanged; > we shouldn't be modifying tracked files at all. > > The below diff is minimally invasive and very simple. We ensure a proper > ed script is generated in the case of diffs involving a malformed line > but when writing the merged file, we elide the added terminating newline. Updated diff checks the buffer for the terminating newline--not the line returned from getline(3). This fixes an OOB read in the previous diff. I still want to write a test for this case though. diffstat f997e889eb0517a24a28f2f4000d6915683fef86 b5ac15ce9816489710c3698d89d6a7ef21b4d2fe M lib/diff3.c | 12+ 1- 1 file changed, 12 insertions(+), 1 deletion(-) diff f997e889eb0517a24a28f2f4000d6915683fef86 b5ac15ce9816489710c3698d89d6a7ef21b4d2fe commit - f997e889eb0517a24a28f2f4000d6915683fef86 commit + b5ac15ce9816489710c3698d89d6a7ef21b4d2fe blob - c1ca2fdf44d461cda6adb950263ed6a9775971ac blob + cc7cc1ee97127908122a804cd7aa3e75540389df --- lib/diff3.c +++ lib/diff3.c @@ -155,6 +155,8 @@ struct diff3_state { char *buf; + int no_eofnl; /* set if the merged file has no eof newline */ + BUF *diffbuf; }; @@ -417,6 +419,9 @@ out: err = got_error_from_errno("fclose"); } if (err == NULL && diffb) { + if (d3s->no_eofnl == 1 && + diffb->cb_buf[diffb->cb_len - 1] == '\n') + --diffb->cb_len; if (buf_write_fd(diffb, outfd) < 0) err = got_error_from_errno("buf_write_fd"); *overlapcnt = d3s->overlapcnt; @@ -1063,7 +1068,13 @@ edscript(int n, struct diff3_state *d3s) } if (!d3s->overlap[n]) { - err = diff_output(d3s->diffbuf, ".\n"); + size_t len; + + len = buf_len(d3s->diffbuf); + d3s->no_eofnl = len > 0 && + buf_getc(d3s->diffbuf, len - 1) != '\n'; + err = diff_output(d3s->diffbuf, "%s.\n", + d3s->no_eofnl == 1 ? "\n" : ""); if (err) goto done; } else { -- Mark Jamsek GPG: F2FF 13DE 6A06 C471 CA80 E6E2 2930 DC66 86EE CF68