Download raw body.
diff vs CRLF
On Mon, Jul 25, 2022 at 03:34:29PM +0200, Omar Polo wrote: > `got diff' has some issues when dealing with CRLF files. I stumbled on > this when looking at the history of > > https://github.com/fcorbelli/zpaqfranz > > for example: > > % got log -l1 -p -c 90eafd8 | head -n30 > ... > --- zpaqfranz.cpp > +++ zpaqfranz.cpp > @@ -1,87483 +1,45670 @@ > -///////// EXPERIMENTAL BUILD > \ No newline at end of file > +///////// EXPERIMENTAL BUILD > ... > +///////// https://sourceforge.net/projects/zpaqfranz/ > > -///////// The source is a mess > \ No newline at end of file > +///////// I had to reluctantly move parts of the comments ... > > the "\ Now newline at end of file" is spammed very frequently across the > diff. > > i tracked this down to diff_data_atomize_text_lines_{fd,mmap}: there's > an error in accounting for \r\n. it's easy to reproduce with a smaller > file: > > % printf 'test\r\n' > crlf > % got add crlf && got ci -m +crlf > % printf 'test 2\r\n' > crlf > % got di > ... > @@ -1,2 +1,2 @@ > -test > +test 2 > > % > > notice the empty line after 'test 2' and the wrong line offsets in the > hunk header, should be '@@ -1 +1 @@'. > > (not that it matters that much; `got patch', Larry' patch(1) and GNU' > patch don't work on CRLF files anyway, this is just aesthetic for the > reader) > > patch below fixes the accounting in `got di' so it doesn't produce these > strange results on CRLF-delimited files. Should apply in diff.git too. > > ok? > > P.S.: it's just me or the regress of diff.git is broken? i can't get it > to run cleanly after a fresh checkout. got regress is happy though. Yes, regress in diff.git is a bit broken. > P.P.S: we care enough about CRLF to add some regress? I don't see why not. It's always good to cover additional edge cases in tests. Your patch makes sense, ok stsp@ > diff /home/op/w/got > commit - 10a16316a03005a07c45b2bbf1b5644b64e846fb > path + /home/op/w/got > blob - 0531fabe30530664069034f1c3166fe9d18a6e54 > file + lib/diff_atomize_text.c > --- lib/diff_atomize_text.c > +++ lib/diff_atomize_text.c > @@ -92,7 +92,7 @@ diff_data_atomize_text_lines_fd(struct diff_data *d) > r = fread(buf, sizeof(char), sizeof(buf), d->root->f); > if (r == 0 && ferror(d->root->f)) > return errno; > - if (r == 1 && buf[0] == '\n' ) > + if (r > 0 && buf[0] == '\n') > line_end++; > } > > @@ -151,11 +151,9 @@ diff_data_atomize_text_lines_mmap(struct diff_data *d) > > /* When not at the end of data, the line ending char ('\r' or > * '\n') must follow */ > - if (line_end < end) > + if (line_end < end && *line_end == '\r') > line_end++; > - /* If that was an '\r', also pull in any following '\n' */ > - if (line_end < end - 1 && line_end[0] == '\r' && > - line_end[1] == '\n') > + if (line_end < end && *line_end == '\n') > line_end++; > > /* Record the found line as diff atom */ > >
diff vs CRLF