Discussion:
UTF-8, bad for discussions?
Foster Schucker
2018-12-02 00:04:58 UTC
This may be just my problem, but, there is a series of posts that start
off in a code format that gets translated to hex characters. When it
shows up in the digest it's like one big block of text. So I miss out
on 1/3 of the posts and the nuances in them. Is there a way to fix
this? A Lua script I can run on my mail to fix this?

Thanks!
Jay Carlson
2018-12-02 00:07:24 UTC
Post by Foster Schucker
This may be just my problem, but, there is a series of posts that start
off in a code format that gets translated to hex characters. When it
shows up in the digest it's like one big block of text. So I miss out
on 1/3 of the posts and the nuances in them. Is there a way to fix
this? A Lua script I can run on my mail to fix this?
What do you use to read mail? (Follow-on question: Has it been updated in fifteen years? :-)
Philippe Verdy
2018-12-02 00:49:28 UTC
This looks like a problem with your email agent (or a third-party
service you chose to use, running outdated, non-conforming software
that blindly strips the encoding and treats the text as if it were
ASCII or an old 8-bit charset).

UTF-8 is now **THE** standard of the web. It has won the vast majority of
uses and should be supported everywhere, so much so that all new web
standards (or any revision of them) MUST support it (this policy has been
adopted by the IETF for all new RFCs and is published as a Best Current
Practice).

If you have old software that breaks on UTF-8 and still blindly retags it
as 7-bit ASCII or some ISO8859-* based charset (after silently dropping
the encoding that was explicitly declared in MIME), this software must be
updated, because it will break on what is now the vast majority of posts.
This should have been fixed decades ago: even before UTF-8 was
standardized, MIME already allowed clean identification of charsets and
supported 8-bit clean transport, using well-established transfer
encodings that should have been respected.

I think, for example, that you use some antique software, like old
versions of MS Outlook or Lotus Notes, or other old enterprise software
kept for private use but no longer maintained in any form. Reserve such
software for access to your own archives, not for following new content
posted on the web, and think about adding a proper conversion interface
to isolate the archived data and legacy software and make them
compatible. Don't use the legacy software to store any new data if the
conversion to the old format is lossy: the interface should work in one
direction only, from old to new, never in reverse. You may also choose to
convert your old archives to make them conforming (you are not required
to use UTF-8, just choose an encoding that is Unicode compliant).

Yes, this means some initial cost (for creating the adapter), but beyond
that there's really no cost. In fact you save a lot, in money or time, by
adopting new software instead of trying to patch, more or less correctly,
an antique software solution that will remain lossy in all cases.
Post by Foster Schucker
This may be just my problem, but, there is a series of posts that start
off in a code format that gets translated to hex characters. When it
shows up in the digest it's like one big block of text. So I miss out
on 1/3 of the posts and the nuances in them. Is there a way to fix
this? A Lua script I can run on my mail to fix this?
Thanks!
Rena
2018-12-02 02:41:17 UTC
Post by Foster Schucker
This may be just my problem, but, there is a series of posts that start
off in a code format that gets translated to hex characters. When it
shows up in the digest it's like one big block of text. So I miss out
on 1/3 of the posts and the nuances in them. Is there a way to fix
this? A Lua script I can run on my mail to fix this?
Thanks!
From what I've heard, this is an issue with HTML mail rather than UTF-8.
Andrew Gierth
2018-12-02 02:42:20 UTC
Foster> This may be just my problem, but, there is a series of posts
Foster> that start off in a code format that gets translated to hex
Foster> characters. When it shows up in the digest it's like one big
Foster> block of text. So I miss out on 1/3 of the posts and the
Foster> nuances in them. Is there a way to fix this? A Lua script I can
Foster> run on my mail to fix this?

I believe this is a long-standing bug in the list digest software - the
obvious fix is not to subscribe in digest mode.
--
Andrew.
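
On the original request for a Lua script: nobody in the thread pins down
what the "hex characters" actually are, so the following is only a sketch
under one assumption, namely that the digest is passing through raw
quoted-printable text (sequences like =C3=A9 and lines ending in a bare
"="). If the affected posts are base64 instead, this will not help. The
function name decode_qp is made up for illustration.

```lua
-- Minimal quoted-printable decoder (the encoding described in RFC 2045).
-- ASSUMPTION: the "hex characters" seen in the digest are =XX escapes.
-- decode_qp is a hypothetical helper name, not part of any library.
function decode_qp(text)
  -- Normalise CRLF line endings so the patterns below only see "\n".
  text = text:gsub("\r\n", "\n")
  -- Trailing spaces and tabs on an encoded line are transport padding.
  text = text:gsub("[ \t]+\n", "\n")
  -- A "=" at the very end of a line is a soft line break: join the lines.
  text = text:gsub("=\n", "")
  -- Turn each =XX escape back into the byte it stands for.
  text = text:gsub("=(%x%x)", function(hex)
    return string.char(tonumber(hex, 16))
  end)
  return text
end
```

To use it as a filter, one could append the line
io.write(decode_qp(io.read("*a"))) to the script and pipe a saved message
body through it. It looks only at the body: it ignores MIME headers and
does no charset conversion, so the output is whatever bytes the sender
encoded (UTF-8 for most current posts).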