Download raw body.
diff vs CRLF
Omar Polo <op@omarpolo.com> wrote: > `got diff' has some issues when dealing with CRLF files. I stumbled on > this when looking at the history of > > https://github.com/fcorbelli/zpaqfranz > > for example: > > % got log -l1 -p -c 90eafd8 | head -n30 > ... > --- zpaqfranz.cpp > +++ zpaqfranz.cpp > @@ -1,87483 +1,45670 @@ > -///////// EXPERIMENTAL BUILD > \ No newline at end of file > +///////// EXPERIMENTAL BUILD > ... > +///////// https://sourceforge.net/projects/zpaqfranz/ > > -///////// The source is a mess > \ No newline at end of file > +///////// I had to reluctantly move parts of the comments ... > > the "\ Now newline at end of file" is spammed very frequently across the > diff. > > i tracked this down to diff_data_atomize_text_lines_{fd,mmap}: there's > an error in accounting for \r\n. it's easy to reproduce with a smaller > file: > > % printf 'test\r\n' > crlf > % got add crlf && got ci -m +crlf > % printf 'test 2\r\n' > crlf > % got di > ... > @@ -1,2 +1,2 @@ > -test > +test 2 > > % > > notice the empty line after 'test 2' and the wrong line offsets in the > hunk header, should be '@@ -1 +1 @@'. I failed to explain how this error becomes the "\ No newline" previously mentioned. The issue is that we fail to correctly compute the end of the line in CRLF files. The code is written to support them, but in reality it stops at the \r. later, diff_output_trailing_newline_msg sees that the character at the end of the line is not a line feed and assumes that the original file was lacking a final newline. > (not that it matters that much; `got patch', Larry' patch(1) and GNU' > patch don't work on CRLF files anyway, this is just aesthetic for the > reader) > > patch below fixes the accounting in `got di' so it doesn't produce these > strange results on CRLF-delimited files. Should apply in diff.git too. > > ok? > > P.S.: it's just me or the regress of diff.git is broken? i can't get it > to run cleanly after a fresh checkout. got regress is happy though. > > P.P.S: we care enough about CRLF to add some regress? > > diff /home/op/w/got > commit - 10a16316a03005a07c45b2bbf1b5644b64e846fb > path + /home/op/w/got > blob - 0531fabe30530664069034f1c3166fe9d18a6e54 > file + lib/diff_atomize_text.c > --- lib/diff_atomize_text.c > +++ lib/diff_atomize_text.c > @@ -92,7 +92,7 @@ diff_data_atomize_text_lines_fd(struct diff_data *d) > r = fread(buf, sizeof(char), sizeof(buf), d->root->f); > if (r == 0 && ferror(d->root->f)) > return errno; > - if (r == 1 && buf[0] == '\n' ) > + if (r > 0 && buf[0] == '\n') > line_end++; > } > > @@ -151,11 +151,9 @@ diff_data_atomize_text_lines_mmap(struct diff_data *d) > > /* When not at the end of data, the line ending char ('\r' or > * '\n') must follow */ > - if (line_end < end) > + if (line_end < end && *line_end == '\r') > line_end++; > - /* If that was an '\r', also pull in any following '\n' */ > - if (line_end < end - 1 && line_end[0] == '\r' && > - line_end[1] == '\n') > + if (line_end < end && *line_end == '\n') > line_end++; > > /* Record the found line as diff atom */
diff vs CRLF