tog: fix display of lines ending in \r\n
On Sun, Dec 13, 2020 at 01:35:03AM +0100, Christian Weisgerber wrote:
> Stefan Sperling:
>
> > I'd say we can go back to using a custom single-char replacement hack.
> > I.e. replace the control char with '?' or something like that, and
> > whitelist the ones known to work (e.g. Tab). That is still better
> > than what we have right now.
>
> It's part of a bigger problem that printability and width also vary
> by locale. E.g., let's say I have the byte sequence 0xc3 0xb6 in
> a commit message (U+00F6 in UTF-8). On FreeBSD, I get different
> display results if I run tog in LC_CTYPE=C.UTF-8 versus LC_CTYPE=C.
> The latter is misformatted, even in a non-UTF8 xterm (!?).

Not sure what you are seeing exactly, but it doesn't seem surprising
that xterm does weird things in the C locale. Recall what Ingo uncovered
here: https://undeadly.org/cgi?action=article&sid=20160308204011
"Printing sanitized UTF-8 to a US-ASCII terminal is *NOT* safe."
(Needless to say, Ingo's article contains _all_ the details ;)

I guess tog could run everything through iswprint() and hope that any
characters flagged as printable in locale data will also have correct
width information. Locale definitions can be buggy. Though I see that
in 2018 (r340491) FreeBSD switched to using Unicode tables as source
data for the UTF-8 locale, replacing the old buggy UTF-8 data, as OpenBSD
did in 2015. And FreeBSD's DefaultRuneLocale table defines 0x7e (~) as
the highest printable character for the C locale, as far as I can tell.
So results from iswprint() should be correct in either case.
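
To make the iswprint() idea concrete, here is a rough sketch of what such a
filter might look like. It is not taken from tog's source: the function name
sanitize_wline() and its interface are made up for illustration, and it
assumes the line has already been converted to wide characters. Anything
iswprint() rejects in the current locale is replaced with '?', except for
Tab, which is whitelisted as suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (not part of tog): copy src into dst, replacing
 * every wide character that iswprint() rejects in the current locale
 * with '?'. Tab is whitelisted and copied through unchanged.
 * At most dstlen - 1 characters are copied; dst is always terminated.
 * Returns the number of characters that were replaced.
 */
size_t
sanitize_wline(wchar_t *dst, size_t dstlen, const wchar_t *src)
{
	size_t i, nreplaced = 0;

	assert(dst != NULL && src != NULL && dstlen > 0);

	for (i = 0; i + 1 < dstlen && src[i] != L'\0'; i++) {
		wchar_t wc = src[i];

		if (wc == L'\t' || iswprint((wint_t)wc)) {
			dst[i] = wc;
		} else {
			dst[i] = L'?';
			nreplaced++;
		}
	}
	dst[i] = L'\0';

	return nreplaced;
}
```

With this, a trailing \r would be turned into '?' rather than being sent to
the terminal. Column width is a separate question: the caller would still
have to expand tabs and ask something like wcwidth() for the remaining
characters, and the result is only as good as the locale data behind it.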