Discussion:
Test Latin-1 in Google Groups.
Ruud Harmsen
2018-11-02 17:05:49 UTC
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252

Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç

Uppercase:
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç



àèìòùáéíóúýâêôîûäëïöüÿãõç
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-02 17:30:45 UTC
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Same problem here. The message arrives in my Usenet program intact, it
is in GG intact, as a raw message. But the normal GG display changes
it to Cyrillic.

Compare:
https://groups.google.com/d/msg/sci.lang/xx6jGN64dB0/yiYGYa2UAwAJ
(Cyrillic) with:
https://groups.google.com/forum/#!original/sci.lang/xx6jGN64dB0/yiYGYa2UAwAJ
--
Ruud Harmsen, http://rudhar.com
Christian Weisgerber
2018-11-02 17:40:44 UTC
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Diaeresis: ÄËÏÖÜ<9F>
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
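
For anyone who wants to reproduce this, here is a minimal Python sketch (not part of the original article; it relies only on Python's built-in latin-1 and cp1252 codecs) showing why capital Y with diaeresis cannot be sent as ISO 8859-1 while the lowercase one can:

```python
# Illustration only: compare how the two letters fare in ISO 8859-1 and CP1252.
import unicodedata

upper, lower = "\u0178", "\u00ff"   # capital and small y with diaeresis

# Lowercase y with diaeresis is part of ISO 8859-1 (byte 0xFF).
lower_latin1 = lower.encode("latin-1")

# Uppercase Y with diaeresis is not; the codec refuses to encode it.
try:
    upper.encode("latin-1")
    upper_in_latin1 = True
except UnicodeEncodeError:
    upper_in_latin1 = False

# Windows-1252 puts it at byte 0x9F.
upper_cp1252 = upper.encode("cp1252")

# Read back as ISO 8859-1, byte 0x9F is a C1 control character (category "Cc").
as_latin1 = upper_cp1252.decode("latin-1")
category = unicodedata.category(as_latin1)

print(lower_latin1, upper_in_latin1, upper_cp1252, category)
```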
--
Christian "naddy" Weisgerber ***@mips.inka.de
Ruud Harmsen
2018-11-02 18:53:49 UTC
Fri, 2 Nov 2018 17:40:44 -0000 (UTC): Christian Weisgerber
Post by Christian Weisgerber
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Diaeresis: ÄËÏÖÜ<9F>
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
Bingo, that must be it! Thanks.

So I am lying (and Agent 1.93 lets me!), I claim to post in 8859-1 but
I don't, because I also post outside that range. That confuses Google:
it doesn't know how to interpret that character. (So it switches to
Cyrillic, which is a weird thing to do, in my opinion; even if I
myself caused it.)

OK, so now I'll repeat the test, but with those funny dotted y's
(which nobody ever needs anyway) removed.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-02 19:15:30 UTC
Post by Ruud Harmsen
Fri, 2 Nov 2018 17:40:44 -0000 (UTC): Christian Weisgerber
Post by Christian Weisgerber
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: длпцья
Diaeresis: ДЛПЦЬ<9F>
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
But here, after it passed through Ruud, I see cyrillic again.
Post by Ruud Harmsen
Bingo, that must be it! Thanks.
So I am lying (and Agent 1.93 lets me!), I claim to post in 8859-1 but
I don't, because I also post outside that range. That confuses Google:
it doesn't know how to interpret that character. (So it switches to
Cyrillic, which is a weird thing to do, in my opinion; even if I
myself caused it.)
OK, so now I'll repeat the test, but with those funny dotted y's
(which nobody ever needs anyway) removed.
Nobody? You can't discuss French decadence without it! or Belgian violinists!
António Marques
2018-11-02 19:20:25 UTC
Post by Ruud Harmsen
Fri, 2 Nov 2018 17:40:44 -0000 (UTC): Christian Weisgerber
Post by Christian Weisgerber
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Diaeresis: ÄËÏÖÜ<9F>
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
Bingo, that must be it! Thanks.
So I am lying (and Agent 1.93 lets me!), I claim to post in 8859-1 but
I don't, because I also post outside that range. That confuses Google:
it doesn't know how to interpret that character. (So it switches to
Cyrillic, which is a weird thing to do, in my opinion; even if I
myself caused it.)
OK, so now I'll repeat the test, but with those funny dotted y's
(which nobody ever needs anyway) removed.
What your program is using is probably Windows-1252 or something like that
- it has the same code points as Latin-1 but has additional characters
where Latin-1 has a reserved area for control codes.
Do you have the option to send Windows-1252? Maybe then GG will accept it
for what it is.
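
The relationship between the two code pages can be checked mechanically. The following Python sketch is only an illustration using Python's bundled codec tables (which leave five CP1252 bytes unassigned); it lists every byte value whose meaning differs between ISO 8859-1 and Windows-1252:

```python
# Illustration only: where do ISO 8859-1 and Windows-1252 differ?
differing = []
for value in range(256):
    raw = bytes([value])
    latin1_char = raw.decode("latin-1")          # always succeeds
    try:
        cp1252_char = raw.decode("cp1252")
    except UnicodeDecodeError:
        cp1252_char = None                       # unassigned in Python's cp1252 table
    if cp1252_char != latin1_char:
        differing.append(value)

print(f"first differing byte: 0x{differing[0]:02X}")
print(f"last differing byte:  0x{differing[-1]:02X}")
print("0x9F in cp1252:", bytes([0x9F]).decode("cp1252"))
```

Every difference falls in the range 0x80..0x9F, the area ISO 8859-1 reserves for control codes; all other byte values decode identically.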
Ruud Harmsen
2018-11-02 20:23:21 UTC
Fri, 2 Nov 2018 19:20:25 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Fri, 2 Nov 2018 17:40:44 -0000 (UTC): Christian Weisgerber
Post by Christian Weisgerber
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Diaeresis: ÄËÏÖÜ<9F>
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
Bingo, that must be it! Thanks.
So I am lying (and Agent 1.93 lets me!), I claim to post in 8859-1 but
I don't, because I also post outside that range. That confuses Google:
it doesn't know how to interpret that character. (So it switches to
Cyrillic, which is a weird thing to do, in my opinion; even if I
myself caused it.)
OK, so now I'll repeat the test, but with those funny dotted y's
(which nobody ever needs anyway) removed.
What your program is using is probably Windows-1252 or something like that
- it has the same code points as Latin-1 but has additional characters
where Latin-1 has a reserved area for control codes.
Do you have the option to send Windows-1252? Maybe then GG will accept it
for what it is.
Yes, the Code Page is Windows-1252, but it says "Post Usenet as"
ISO-8859-1. Can't change that. Agent 1.93.

Anyway, if as an interface designer you are confronted with a message
claiming to be in ISO-8859-1, but actually it contains CP1252 (code
point 9F is an umlauted Y, in CP1252 but not 8859-1), what do you do?

1) Assume Windows Cp1252? (Makes sense, right?)

2) Assume Cyrillic, https://en.wikipedia.org/wiki/ISO/IEC_8859-5,
which ALSO has no defined character for 9F?
(Wouldn't that be, to put it mildly, SLIGHTLY silly?)

Well, GG does 2). What's more, it also does 2) when the posted message
IS fully in ISO-8859-1, as later tests reveal.
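
Option 1) can be sketched in a few lines of Python. This is not a description of what Google Groups does; decode_body is a hypothetical helper. The same substitution is what the WHATWG Encoding Standard prescribes for web browsers, where the label iso-8859-1 is treated as an alias of windows-1252.

```python
# Hypothetical helper illustrating option 1); not Google Groups' actual logic.
LATIN1_LABELS = {"iso-8859-1", "iso8859-1", "latin-1", "latin1"}

def decode_body(raw: bytes, declared: str) -> str:
    """Decode raw bytes, treating a Latin-1 label as Windows-1252."""
    label = declared.strip().lower()
    if label in LATIN1_LABELS:
        # CP1252 agrees with ISO 8859-1 outside bytes 0x80..0x9F, so a message
        # without bytes in that range decodes identically. errors="replace"
        # covers the five bytes that Python's cp1252 table leaves unassigned.
        return raw.decode("cp1252", errors="replace")
    return raw.decode(label, errors="replace")

mislabeled = b"Diaeresis: \xc4\xcb\xcf\xd6\xdc\x9f"
print(decode_body(mislabeled, "ISO-8859-1"))
```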

GG is broken, and never was whole.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-02 22:29:02 UTC
Post by Ruud Harmsen
Fri, 2 Nov 2018 19:20:25 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Fri, 2 Nov 2018 17:40:44 -0000 (UTC): Christian Weisgerber
Post by Christian Weisgerber
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Diaeresis: ÄËÏÖÜ<9F>
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
Bingo, that must be it! Thanks.
So I am lying (and Agent 1.93 lets me!), I claim to post in 8859-1 but
I don't, because I also post outside that range. That confuses Google:
it doesn't know how to interpret that character. (So it switches to
Cyrillic, which is a weird thing to do, in my opinion; even if I
myself caused it.)
OK, so now I'll repeat the test, but with those funny dotted y's
(which nobody ever needs anyway) removed.
What your program is using is probably Windows-1252 or something like that
- it has the same code points as Latin-1 but has additional characters
where Latin-1 has a reserved area for control codes.
Do you have the option to send Windows-1252? Maybe then GG will accept it
for what it is.
Yes, the Code Page is Windows-1252, but it says "Post Usenet as"
ISO-8859-1. Can't change that. Agent 1.93.
Anyway, if as an interface designer you are confronted with a message
claiming to be in ISO-8859-1, but actually it contains CP1252 (code
point 9F is an umlauted Y, in CP1252 but not 8859-1), what do you do?
1) Assume Windows Cp1252? (Makes sense, right?)
No. How does it know it’s 1252? There’s nothing special about 9F. A lot of
encodings use it for characters. Either it trusts the charset header, or
anything goes.
They’re only bytes.
Post by Ruud Harmsen
2) Assume Cyrillic, https://en.wikipedia.org/wiki/ISO/IEC_8859-5,
which ALSO has no defined character for 9F?
(Wouldn't that be, to put it mildly, SLIGHTLY silly?)
Well, GG does 2). What's more, it also does 2) when the posted message
IS fully in ISO-8859-1, as later tests reveal.
But do they? My single a-grave went through ok.

You don’t know that it’s assuming a specific Cyrillic encoding. If it’s not
able to trust the header, it’s reasonable to try to be useful to the
largest amount of people. Who tells you there was no Cyrillic encoding in
widespread use whose agents often misidentified as Latin-1?
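
One way to narrow down which Cyrillic encoding is involved is to decode the Latin-1 bytes of the test line under a few candidates and compare the result with the rendering quoted earlier in the thread (длпцья). A rough Python sketch follows; the candidate list is an arbitrary choice, and a match only shows consistency with that code page, not what GG does internally:

```python
# Illustration only: which Cyrillic codec turns the Latin-1 bytes of
# the lowercase diaeresis row into the Cyrillic text seen in the thread?
original = "\u00e4\u00eb\u00ef\u00f6\u00fc\u00ff"   # the row a-e-i-o-u-y with diaeresis
raw = original.encode("latin-1")                    # b'\xe4\xeb\xef\xf6\xfc\xff'
seen_in_thread = "длпцья"

candidates = ["cp1251", "koi8_r", "iso8859_5", "cp866"]
renderings = {name: raw.decode(name, errors="replace") for name in candidates}
matches = [name for name, text in renderings.items() if text == seen_in_thread]

for name, text in renderings.items():
    print(f"{name:10} {text}")
print("matches:", matches)
```

Of these four, only Windows-1251 reproduces the quoted rendering; ISO 8859-5 gives a different string.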
Ruud Harmsen
2018-11-03 06:01:29 UTC
Fri, 2 Nov 2018 22:29:02 -0000 (UTC): António Marques
Post by Ruud Harmsen
Anyway, if as an interface designer you are confronted with a message
claiming to be in ISO-8859-1, but actually it contains CP1252 (code
point 9F is an umlauted Y, in CP1252 but not 8859-1), what do you do?
1) Assume Windows Cp1252? (Makes sense, right?)
No. How does it know it’s 1252?
All characters within 8859-1, only some without, but those are within
1252. So guess what?
There’s nothing special about 9F.
Yes there is, it's in that (in)famous range that is defined for 1252
but not for 8859-1.
A lot of
encodings use it for characters. Either it trusts the charset header, or
anything goes.
They’re only bytes.
And people having sent them.
Post by Ruud Harmsen
2) Assume Cyrillic, https://en.wikipedia.org/wiki/ISO/IEC_8859-5,
which ALSO has no defined character for 9F?
(Wouldn't that be, to put it mildly, SLIGHTLY silly?)
Well, GG does 2). What's more, it also does 2) when the posted message
IS fully in ISO-8859-1, as later tests reveal.
But do they? My single a-grave went through ok.
Yes, that remains the riddle. My later conforming messages on the other
hand were also mangled.
You don’t know that it’s assuming a specific Cyrillic encoding. If it’s not
able to trust the header, it’s reasonable to try to be useful to the
largest amount of people. Who tells you there was no Cyrillic encoding in
widespread use whose agents often misidentified as Latin-1?
http://czyborra.com/charsets/cyrillic.html
"Windows-1251" which is not to be mistaken as a 13th century precursor
of today's Windows95®"

Nice.
António Marques
2018-11-03 13:08:44 UTC
Post by Ruud Harmsen
Fri, 2 Nov 2018 22:29:02 -0000 (UTC): António Marques
Post by Ruud Harmsen
Anyway, if as an interface designer you are confronted with a message
claiming to be in ISO-8859-1, but actually it contains CP1252 (code
point 9F is an umlauted Y, in CP1252 but not 8859-1), what do you do?
1) Assume Windows Cp1252? (Makes sense, right?)
No. How does it know it’s 1252?
All characters within 8859-1, only some without, but those are within
1252. So guess what?
They are as well within a number of other charsets.

What you wanted was ‘treat Latin-1 as if it were w1252’. While that would
solve this case, it would wreak havoc with a lot of others.
Post by Ruud Harmsen
There’s nothing special about 9F.
Yes there is, it's in that (in)famous range that is defined for 1252
but not for 8859-1.
It’s also defined for most other charsets. The various ISO-Latin are pretty
much alone in leaving those precious 32 bytes to control codes nobody uses.
Post by Ruud Harmsen
A lot of
encodings use it for characters. Either it trusts the charset header, or
anything goes.
They’re only bytes.
And people having sent them.
Post by Ruud Harmsen
2) Assume Cyrillic, https://en.wikipedia.org/wiki/ISO/IEC_8859-5,
which ALSO has no defined character for 9F?
(Wouldn't that be, to put it mildly, SLIGHTLY silly?)
Well, GG does 2). What's more, it also does 2) when the posted message
IS fully in ISO-8859-1, as later tests reveal.
But do they? My single a-grave went through ok.
Yes, that remains the riddle. My later conforming messages on the other
hand were also mangled.
You don’t know that it’s assuming a specific Cyrillic encoding. If it’s not
able to trust the header, it’s reasonable to try to be useful to the
largest amount of people. Who tells you there was no Cyrillic encoding in
widespread use whose agents often misidentified as Latin-1?
http://czyborra.com/charsets/cyrillic.html
"Windows-1251" which is not to be mistaken as a 13th century precursor
of today's Windows95®"
‘Today’s’? Are you telling us Windows 95’s development and architecture
don’t date from the 1200s? (The century marked by intelligent decisions,
such as having a ‘Christian’ army attack the centre of Eastern
Christianity.)

It’s probably doing heuristics. It’s maybe the only way to display most of
the old content correctly, short of having an option to choose display
encoding for each message. That it breaks with compliant messages - if it
really does - is unfortunate, but then nobody is supposed to be using 8-bit
single byte encodings these days. That’s almost dirty.
Christian Weisgerber
2018-11-03 16:30:11 UTC
Post by António Marques
Post by Ruud Harmsen
Yes there is, it's in that (in)famous range that is defined for 1252
but not for 8859-1.
It’s also defined for most other charsets. The various ISO-Latin are pretty
much alone in leaving those precious 32 bytes to control codes nobody uses.
You have to consider the time and context when these standards were
created. ISO 8859-1, from the mid-1980s, is a slightly modified
version of the DEC Multinational Character Set that had been
introduced in 1983 with the highly influential DEC VT220 terminal.
The VT220 and its successors could be configured to use 8-bit control
characters (0x80..0x9F). For instance, 0x9B could be used instead
of the sequence 0x1B 0x5B. This optimized the transmission from
host to terminal over the slow EIA-232 serial connection (typically
9600 bit/s). Clearly, 8-bit control codes were the future.

The last time I ran into 8-bit control codes was on the Remote
Management Console of an AlphaServer 800 (introduced in 1997), which
was hard-coded to produce such terminal control output. It was at
that point that I noticed that 8-bit control codes are fundamentally
incompatible with UTF-8.
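
The clash can be shown with a short Python sketch (an illustration only, using nothing beyond the built-in utf-8 codec): the 8-bit CSI byte 0x9B is not valid UTF-8 on its own, yet the same byte value occurs inside the encoding of ordinary letters.

```python
# Illustration only: the 8-bit CSI byte 0x9B versus the 7-bit sequence ESC [.
CSI_8BIT = b"\x9b"
CSI_7BIT = b"\x1b\x5b"          # ESC followed by "["

# On its own, 0x9B is not valid UTF-8: it is a continuation byte.
try:
    CSI_8BIT.decode("utf-8")
    lone_byte_valid = True
except UnicodeDecodeError:
    lone_byte_valid = False

# The control character U+009B needs two bytes in UTF-8.
csi_utf8 = "\u009b".encode("utf-8")

# Ordinary letters can contain 0x9B as a trailing byte, e.g. U+011B (e with caron).
e_caron = "\u011b".encode("utf-8")
contains_csi_byte = CSI_8BIT in e_caron

print(lone_byte_valid, csi_utf8, e_caron, contains_csi_byte)
```

A terminal that acts on raw 0x9B bytes would therefore misread part of a UTF-8 letter as the start of a control sequence.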
--
Christian "naddy" Weisgerber ***@mips.inka.de
António Marques
2018-11-12 15:36:59 UTC
Post by Christian Weisgerber
Post by António Marques
Post by Ruud Harmsen
Yes there is, it's in that (in)famous range that is defined for 1252
but not for 8859-1.
It’s also defined for most other charsets. The various ISO-Latin are pretty
much alone in leaving those precious 32 bytes to control codes nobody uses.
You have to consider the time and context when these standards were
created. ISO 8859-1, from the mid-1980s, is a slightly modified
version of the DEC Multinational Character Set that had been
introduced in 1983 with the highly influential DEC VT220 terminal.
The VT220 and its successors could be configured to use 8-bit control
characters (0x80..0x9F). For instance, 0x9B could be used instead
of the sequence 0x1B 0x5B. This optimized the transmission from
host to terminal over the slow EIA-232 serial connection (typically
9600 bit/s). Clearly, 8-bit control codes were the future.
The last time I ran into 8-bit control codes was on the Remote
Management Console of an AlphaServer 800 (introduced in 1997), which
was hard-coded to produce such terminal control output. It was at
that point that I noticed that 8-bit control codes are fundamentally
incompatible with UTF-8.
I had no idea the ISO charsets were so old. I thought they dated from the
late 90s.
Christian Weisgerber
2018-11-12 17:48:26 UTC
Post by António Marques
I had no idea the ISO charsets were so old. I thought they dated from the
late 90s.
ISO 8859-1 was already the default on the Commodore Amiga.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Athel Cornish-Bowden
2018-11-17 18:50:57 UTC
On 2018-11-12 17:48:26 +0000, Christian Weisgerber said:


[ … ]


This has nothing to do with this thread (so I'm not quoting anything),
but I have just visited your web pages and see that you are based in
Ludwigshafen, so you may be able to clear up a mystery that has been
with me for many years. In the 1960s I had a road atlas of western
Europe produced by, I think, Kümmerly-Frey. Places the editors
considered to be interesting had their names underlined in green, but
it was obvious that the criteria used to decide which places qualified
varied wildly from country to country. The Netherlands, for
example, didn't have any interesting places, apart, perhaps, from
Amsterdam. West Germany, on the other hand, had so many interesting
places that I had trouble finding any city of any size that was not
interesting: the only one I found was Ludwigshafen. I've been wondering
ever since what made Ludwigshafen uniquely uninteresting. I don't think
I've ever set foot in Ludwigshafen, though I've seen it from across the
river in Mannheim.
--
athel
Ruud Harmsen
2018-11-03 23:09:01 UTC
Sat, 3 Nov 2018 13:08:44 -0000 (UTC): António Marques
Post by António Marques
It’s probably doing heuristics. It’s maybe the only way to display most of
the old content correctly, short of having an option to choose display
encoding for each message. That it breaks with compliant messages - if it
really does - is unfortunate, but then nobody is supposed to be using 8-bit
single byte encodings these days. That’s almost dirty.
Nonsense. GG is buggy and does stupid things, period. Admit it.
Ruud Harmsen
2018-11-03 23:06:47 UTC
Sat, 3 Nov 2018 13:08:44 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
http://czyborra.com/charsets/cyrillic.html
"Windows-1251" which is not to be mistaken as a 13th century precursor
of today's Windows95®"
‘Today’s’? Are you telling us Windows 95’s development and architecture
don’t date from the 1200s?
You missed the quotation marks.
_I_ am not telling you that, Roman Czyborra did, some 20 years ago.
António Marques
2018-11-03 23:43:48 UTC
Post by Ruud Harmsen
Sat, 3 Nov 2018 13:08:44 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
http://czyborra.com/charsets/cyrillic.html
"Windows-1251" which is not to be mistaken as a 13th century precursor
of today's Windows95®"
‘Today’s’? Are you telling us Windows 95’s development and architecture
don’t date from the 1200s?
You missed the quotation marks.
_I_ am not telling you that, Roman Czyborra did, some 20 years ago.
So what?
Ruud Harmsen
2018-11-04 06:49:13 UTC
Sat, 3 Nov 2018 23:43:48 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Sat, 3 Nov 2018 13:08:44 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
http://czyborra.com/charsets/cyrillic.html
"Windows-1251" which is not to be mistaken as a 13th century precursor
of today's Windows95®"
‘Today’s’? Are you telling us Windows 95’s development and architecture
don’t date from the 1200s?
You missed the quotation marks.
_I_ am not telling you that, Roman Czyborra did, some 20 years ago.
So what?
So that. It's funny. Or I find it funny.
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-03 23:08:10 UTC
Sat, 3 Nov 2018 13:08:44 -0000 (UTC): António Marques
Post by António Marques
It’s probably doing heuristics. It’s maybe the only way to display most of
the old content correctly, short of having an option to choose display
encoding for each message.
It's doing heuristics also for correctly marked messages that are
completely in the encoding in the header. So yes, GG is non-compliant
and buggy.
António Marques
2018-11-03 23:20:48 UTC
Post by Ruud Harmsen
Sat, 3 Nov 2018 13:08:44 -0000 (UTC): António Marques
Post by António Marques
It’s probably doing heuristics. It’s maybe the only way to display most of
the old content correctly, short of having an option to choose display
encoding for each message.
It's doing heuristics also for correctly marked messages that are
completely in the encoding in the header. So yes, GG is non-compliant
and buggy.
How can you _know_ the message is correctly marked? How can you know the
message is ‘in’ the encoding of the header? The encoding is what declares
the meaning of the bytes. If you don’t trust the header, there’s nothing
else you can resort to other than heuristics.
And old clients were known for saying one thing and doing another. Yours,
apparently, is one of those.
Ruud Harmsen
2018-11-04 06:59:55 UTC
Sat, 3 Nov 2018 23:20:48 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
It's doing heuristics also for correctly marked messages that are
completely in the encoding in the header. So yes, GG is non-compliant
and buggy.
How can you _know_ the message is correctly marked? How can you know the
message is ‘in’ the encoding of the header?
I know what I posted in which test messages. I can see the header by
pressing H in Agent or by looking at the "Original message" in GG. The
header, sent by Agent, says the message is in ISO-8859-1. And it is:
everything in that message is valid in ISO-8859-1.
Post by António Marques
The encoding is what declares
the meaning of the bytes. If you don’t trust the header, there’s nothing
else you can resort to other than heuristics.
There is no reason for GG not to trust that header. The header says
ISO-8859-1 and all the characters in the message are meaningful in
that encoding. So what GG should do is display it as what the
message says it is.
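
Whether a body really stays inside the printable part of ISO 8859-1 can be tested without any guessing. A minimal Python sketch, in which offending_bytes is a hypothetical helper and the set of tolerated whitespace controls is an assumption:

```python
# Hypothetical check: does a body use only bytes that ISO 8859-1 defines
# as printable characters (plus common whitespace controls)?
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}   # tab, line feed, carriage return

def offending_bytes(raw: bytes) -> list:
    """Return the byte values that are control codes in ISO 8859-1."""
    bad = []
    for value in raw:
        is_c0 = value < 0x20 and value not in ALLOWED_CONTROLS
        is_del_or_c1 = 0x7F <= value <= 0x9F
        if is_c0 or is_del_or_c1:
            bad.append(value)
    return bad

conforming = "Accent aigu: \u00e1\u00e9\u00ed\u00f3\u00fa\u00fd\r\n".encode("latin-1")
mislabeled = b"Diaeresis: \xc4\xcb\xcf\xd6\xdc\x9f\r\n"

print(offending_bytes(conforming))
print([hex(b) for b in offending_bytes(mislabeled)])
```

An empty result means the charset header can be taken at face value; a non-empty one means the message contains bytes that ISO 8859-1 does not define as printable.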

There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
Post by António Marques
And old clients were known for saying one thing and doing another. Yours,
apparently, is one of those.
No. Not in those several test messages that did not contain Y umlaut.
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-04 07:42:03 UTC
Post by Ruud Harmsen
There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
We could also have analysed this technical phenomenon in concerted
action in a peaceful and friendly manner. But no, _every_ occasion is
taken, in several places in Usenet, to create conflicts where none need
to exist.
--
Ruud Harmsen, http://rudhar.com
Rein
2018-11-11 14:43:02 UTC
Post by Ruud Harmsen
Post by Ruud Harmsen
There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
We could also have analysed this technical phenomenon in concerted
action in a peaceful and friendly manner. But no, _every_ occasion is
taken, in several places in Usenet, to create conflicts where none need
to exist.
Kijk die krokodillentranen. Ik zie niemand ontmenselijkt worden,
hypocriet.
[Dutch: "Look at those crocodile tears. I don't see anyone being
dehumanized, hypocrite."]
--
<
Ruud Harmsen
2018-11-11 22:04:10 UTC
Post by Rein
Post by Ruud Harmsen
Post by Ruud Harmsen
There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
We could also have analysed this technical phenomenon in concerted
action in a peaceful and friendly manner. But no, _every_ occasion is
taken, in several places in Usenet, to create conflicts where none need
to exist.
Kijk die krokodillentranen. Ik zie niemand ontmenselijkt worden,
hypocriet.
[Dutch: "Look at those crocodile tears. I don't see anyone being
dehumanized, hypocrite."]
You are writing in Dutch in an English language news group. Please
stop that behaviour, it is offensive.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-12 14:46:32 UTC
Post by Ruud Harmsen
(...)
You are writing in Dutch in an English language news group.
Is it tho? And being, should it be? I don’t think so.

On a separate note, I’ve read somewhere that your blessed Eudora was once
widely used by Russians, sending KOI8 or whatever identified as ISO-8859-1.
That would explain the use of heuristics by GG rather than trusting the
headers - which is preferable, making a lot of Russians able to
communicate, or letting one guy say ‘áéíóú’?

Have you tried sending valid ISO-8859-1 from a compliant agent such as TB?
Maybe the heuristics only kicks in for old clients.
Ruud Harmsen
2018-11-13 06:09:56 UTC
Mon, 12 Nov 2018 14:46:32 -0000 (UTC): António Marques
Post by António Marques
On a separate note, I’ve read somewhere that your blessed Eudora was once
widely used by Russians, sending KOI8 or whatever identified as ISO-8859-1.
That would explain the use of heuristics by GG rather than trusting the
headers - which is preferable, making a lot of Russians able to
communicate, or letting one guy say ‘áéíóú’?
I don't use Eudora for Usenet and it cannot be used for that. It is an
e-mail program.
Post by António Marques
Have you tried sending valid ISO-8859-1 from a compliant agent such as TB?
Free Agent is also compliant.
Post by António Marques
Maybe the heuristics only kicks in for old clients.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-13 14:04:51 UTC
Post by Ruud Harmsen
Mon, 12 Nov 2018 14:46:32 -0000 (UTC): António Marques
Post by António Marques
On a separate note, I’ve read somewhere that your blessed Eudora was once
widely used by Russians, sending KOI8 or whatever identified as ISO-8859-1.
That would explain the use of heuristics by GG rather than trusting the
headers - which is preferable, making a lot of Russians able to
communicate, or letting one guy say ‘áéíóú’?
I don't use Eudora for Usenet and it cannot be used for that. It is an
e-mail program.
Eudora, Forte, whatever. It’s all the same unsupported buggy stuff from a
bygone era of ill repute. The internets say your FA used to send KOI8
identified as 8859-1.
Post by Ruud Harmsen
Post by António Marques
Have you tried sending valid ISO-8859-1 from a compliant agent such as TB?
Free Agent is also compliant.
God forbid you should test 8859-1 with TB and it turned out OK in GG.
Ruud Harmsen
2018-11-13 18:15:15 UTC
Tue, 13 Nov 2018 14:04:51 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Mon, 12 Nov 2018 14:46:32 -0000 (UTC): António Marques
Post by António Marques
On a separate note, I’ve read somewhere that your blessed Eudora was once
widely used by Russians, sending KOI8 or whatever identified as ISO-8859-1.
That would explain the use of heuristics by GG rather than trusting the
headers - which is preferable, making a lot of Russians able to
communicate, or letting one guy say ‘áéíóú’?
I don't use Eudora for Usenet and it cannot be used for that. It is an
e-mail program.
Eudora, Forte, whatever. It’s all the same unsupported buggy stuff from a
bygone era of ill repute. The internets say your FA used to send KOI8
identified as 8859-1.
Post by Ruud Harmsen
Post by António Marques
Have you tried sending valid ISO-8859-1 from a compliant agent such as TB?
Free Agent is also compliant.
God forbid you should test 8859-1 with TB and it turned out OK in GG.
1) You can test it yourself. Did you?

2) It's too much of a nuisance, I don't have TB installed on this
computer.

3) FreeAgent 1.93 can also speak, and always post in UTF8. I didn't
know it. However, that doesn't mean I can post the true names of the late
Mr. Kashoggi, because the screen interface still only supports
Windows1252. And that is quite enough.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-13 18:49:13 UTC
Post by Ruud Harmsen
Tue, 13 Nov 2018 14:04:51 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Mon, 12 Nov 2018 14:46:32 -0000 (UTC): António Marques
Post by António Marques
On a separate note, I’ve read somewhere that your blessed Eudora was once
widely used by Russians, sending KOI8 or whatever identified as ISO-8859-1.
That would explain the use of heuristics by GG rather than trusting the
headers - which is preferable, making a lot of Russians able to
communicate, or letting one guy say ‘áéíóú’?
I don't use Eudora for Usenet and it cannot be used for that. It is an
e-mail program.
Eudora, Forte, whatever. It’s all the same unsupported buggy stuff from a
bygone era of ill repute. The internets say your FA used to send KOI8
identified as 8859-1.
Post by Ruud Harmsen
Post by António Marques
Have you tried sending valid ISO-8859-1 from a compliant agent such as TB?
Free Agent is also compliant.
God forbid you should test 8859-1 with TB and it turned out OK in GG.
1) You can test it yourself. Did you?
No, I don't have an unsupported, limited newsreader that would prompt me to
look for alternatives. Maybe later in the week when I have my laptop with
me I can check it out.
Post by Ruud Harmsen
2) It's too much of a nuisance, I don't have TB installed on this
computer.
So you're also on the 'oh the horror that is installing software' boat? It
gets better and betterer, as Franz would say.
Post by Ruud Harmsen
3) FreeAgent 1.93 can also speak, and always post in UTF8. I didn't
know it. However, that doesn't mean I can post the true names of the late
Mr. Kashoggi, because the screen interface still only supports
Windows1252.
What the UI supports should have no bearing on what encoding you send the
messages with.

How were all those people sending Russian with it if it can't do Cyrillic?
Ruud Harmsen
2018-11-13 20:44:52 UTC
Tue, 13 Nov 2018 18:49:13 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
2) It's too much of a nuisance, I don't have TB installed on this
computer.
So you're also on the 'oh the horror that is installing software' boat? It
gets better and betterer, as Franz would say.
I'm currently installing software all the time. Using pkg on FreeBSD
11.2.
Post by António Marques
Post by Ruud Harmsen
3) FreeAgent 1.93 can also speak, and always post in UTF8. I didn't
know it. However, that doesn't mean I can post the true names of the late
Mr. Kashoggi, because the screen interface still only supports
Windows1252.
What the UI supports should have no bearing on what encoding you send the
messages with.
That's what I'm saying.
Post by António Marques
How were all those people sending Russian with it if it can't do Cyrillic?
Gnomes in your fantasy?

http://webcenter.ru/~kazarn/rus/agent_w2k.htm

Ah, you're right, Agent does have settings for Russian! But not at the
same time, and I never used that.
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-13 20:48:33 UTC
Tue, 13 Nov 2018 18:49:13 -0000 (UTC): António Marques
Post by António Marques
How were all those people sending Russian with it if it can't do Cyrillic?
Ok, so now I have Agent set to use Cyrillic in KOI8!!! How does it
look?

АИМСЗ
ЮХЛРЫ
БЙНТШ
ЦУ
ъ
▒ ▓
Ъ
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-13 20:52:33 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 18:49:13 -0000 (UTC): Ant?nio Marques
Post by António Marques
How were all those people sending Russian with it if it can't do Cyrillic?
Ok, so now I have Agent set to use Cyrillic in KOI8!!! How does it
look?
АИМСЗ
ЮХЛРЫ
БЙНТШ
ЦУ
ъ
▒ ▓
Ъ
ISO8859-1, as I receive it.
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-13 20:55:54 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 18:49:13 -0000 (UTC): Ant?nio Marques
Post by António Marques
How were all those people sending Russian with it if it can't do Cyrillic?
Ok, so now I have Agent set to use Cyrillic in KOI8!!! How does it
look?
АИМСЗ
ЮХЛРЫ
БЙНТШ
ЦУ
ъ
▒ ▓
Ъ
--
Ruud Harmsen, http://rudhar.com
As Cyrillic, in Google Groups. But note that the original message now contains:
Content-Type: text/plain; charset=KOI8-R
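The Cyrillic lines quoted above are exactly what the bytes of the accented Latin-1 test letters look like once they are read as KOI8-R. A minimal Python sketch of that effect, assuming only the standard `latin-1` and `koi8_r` codecs (the variable names are illustrative):

```python
# The accented test letters, as a Latin-1 client would put them on the wire.
latin1_text = "àèìòù"
wire_bytes = latin1_text.encode("latin-1")   # b'\xe0\xe8\xec\xf2\xf9'

# Same bytes, two declared charsets, two different readings.
as_latin1 = wire_bytes.decode("latin-1")
as_koi8r = wire_bytes.decode("koi8_r")

print(as_latin1)  # àèìòù
print(as_koi8r)   # ЮХЛРЫ
```

The bytes never change; only the charset label decides which of the two readings a compliant reader shows.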
Ruud Harmsen
2018-11-13 20:58:13 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 18:49:13 -0000 (UTC): Ant?nio Marques
Post by António Marques
How were all those people sending Russian with it if it can't do Cyrillic?
Ok, so now I have Agent set to use Cyrillic in KOI8!!! How does it
look?
And Hebrew:
ביםףת
אטלעש
גךמפ
Ruud Harmsen
2018-11-13 21:03:39 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 18:49:13 -0000 (UTC): Ant?nio Marques
Post by António Marques
How were all those people sending Russian with it if it can't do Cyrillic?
Ok, so now I have Agent set to use Cyrillic in KOI8!!! How does it
look?
ביםףת
אטלעש
גךמפ
But sent as:
Content-Type: text/plain; charset=utf-8
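The Hebrew lines behave the same way: they are the Latin-1 test bytes read as ISO-8859-8. A small Python sketch, assuming the standard `iso8859_8` codec; it also shows that the same Hebrew text takes two bytes per letter once it is sent as UTF-8:

```python
wire_bytes = "áéíóú".encode("latin-1")        # b'\xe1\xe9\xed\xf3\xfa'

# Read the same five bytes as ISO-8859-8 (Hebrew).
as_hebrew = wire_bytes.decode("iso8859_8")
print([hex(ord(ch)) for ch in as_hebrew])     # code points in the Hebrew block

# Re-encoded as UTF-8, each Hebrew letter becomes a two-byte sequence.
utf8_bytes = as_hebrew.encode("utf-8")
print(len(wire_bytes), len(utf8_bytes))       # 5 10
```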
Ruud Harmsen
2018-11-11 22:01:45 UTC
Permalink
Post by Rein
Post by Ruud Harmsen
Post by Ruud Harmsen
There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
We could also have analysed this technical phenomenon in concerted
action in a peaceful and friendly manner. But no, _every_ occasion is
taken, in several places in Usenet, to create conflicts where none need
to exist.
[In Dutch:] Look at those crocodile tears. I don't see anyone being
dehumanized, hypocrite.
[In Dutch:] Do you ever read along in sci.lang? I do, and have for decades.

If not, then you should keep your mouth shut; you don't know what you're
talking about.

The criticism I voice there applies just as much to 30 years of
nl.taal, by the way. And to you personally. All you do there is try to
bend everything into personal conflicts, whatever it is.

And once more, Address: if you don't want to be suspected of being a
bot, you shouldn't behave like a bot. Simple.
--
Ruud Harmsen, http://rudhar.com
Rein
2018-11-12 11:57:15 UTC
Permalink
Post by Ruud Harmsen
Post by Rein
Post by Ruud Harmsen
Post by Ruud Harmsen
There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
We could also have analysed this technical phenomenon in concerted
action in a peaceful and friendly manner. But no, _every_ occasion is
taken, in several places in Usenet, to create conflicts where none need
to exist.
[In Dutch:] Look at those crocodile tears. I don't see anyone being
dehumanized, hypocrite.
[In Dutch:] Do you ever read along in sci.lang? I do, and have for decades.
If not, then you should keep your mouth shut; you don't know what you're
talking about.
The criticism I voice there applies just as much to 30 years of
nl.taal, by the way. And to you personally. All you do there is try to
bend everything into personal conflicts, whatever it is.
And once more, Address: if you don't want to be suspected of being a
bot, you shouldn't behave like a bot. Simple.
"You are writing in Dutch in an English language news group.
Please stop that behaviour, it is offensive." (RH)
--
<
Peter T. Daniels
2018-11-04 14:37:00 UTC
Permalink
Post by Ruud Harmsen
Sat, 3 Nov 2018 23:20:48 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
It's doing heuristics also for correctly marked message that are
completely in the encoding in the header. So yes, GG is non-compliant
and buggy.
How can you _know_ the message is correctly marked? How can you know the
message is ‘in’ the encoding of the header?
I know what I posted in which test messages. I can see the header by
pressing H in Agent or by looking at the "Original message" in GG.
Everything in that message is valid in ISO-8859-1.
Post by António Marques
The encoding is what declares
the meaning of the bytes. If you don’t trust the header, there’s nothing
else you can resort to other than heuristics.
There is no reason for GG not to trust that header. The header says
ISO-8859-1 and all the characters in the message are meaningful in
that encoding. So what GG should do is display it as what the
message says it is.
There is NO reason whatsoever to display parts of it as Russian or
Hebrew. Doing that anyway is simply smartass and buggy behaviour. It
is arrogantly applying Artificial Intelligence where none is needed,
because there is an unambiguous and correct specification of how the
text is encoded.
Post by António Marques
And old clients were known for saying one thing and doing another. Yours,
apparently, is one of those.
No. Not in those several test messages that did not contain Y umlaut.
Why is it _only_ messages from Ruud Harmsen in which that particular fault
occurs? In occasional other messages, GG replaces characters with question
marks, or rarely it interprets pairs of characters as Chinese characters,
and very rarely rectangles or diamond-question-marks.

Never before have I seen characters turn into Cyrillic or Hebrew.
Ruud Harmsen
2018-11-04 15:05:14 UTC
Permalink
Sun, 4 Nov 2018 06:37:00 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
Post by António Marques
And old clients were known for saying one thing and doing another. Yours,
apparently, is one of those.
No. Not in those several test messages that did not contain Y umlaut.
Why is it _only_ messages from Ruud Harmsen in which that particular fault
occurs?
1) I don't know. I fed back the error to Google, perhaps they'll
investigate the error and respond.

2) Until now nobody posted a string of accented letters that does not
look like a word in any language, in ISO-8859-1 and marked as such.
Post by Peter T. Daniels
In occasional other messages, GG replaces characters with question
marks, or rarely it interprets pairs of characters as Chinese characters,
and very rarely rectangles or diamond-question-marks.
Never before have I seen characters turn into Cyrillic or Hebrew.
Neither did I. Apparently the introduction of the GG bug is recent.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-12 15:35:04 UTC
Permalink
Post by Ruud Harmsen
The header says
ISO-8859-1 and all the characters in the message are meaningful in
that encoding
Once more: all the 256 bytes are meaningful in all the 8-bit encodings (bar
some irrelevant old one). That’s not something that can be generally used
to validate them. An encoding is a declaration and not, per se, subject to
validation. It’s what it is. Even in the ISO charsets, the control codes
aren’t any less valid. They’re just not printable characters, or characters
at all, if you prefer.
Now, whether the bytes translate to a meaningful human message when using a
given charset, is another matter, and it’s a matter for AI, not classical
algorithms.

As to why not trust the header, I’ve explained in the other thread.
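The point about validation can be checked mechanically. In the Python sketch below (standard codecs only) every possible byte value decodes without error under both ISO-8859-1 and KOI8-R, so a successful decode says nothing about whether the declared charset is the right one. UTF-8 is the contrast case: a run of high bytes taken from a Latin-1 message is usually not valid UTF-8 and gets rejected.

```python
all_bytes = bytes(range(256))

# Both single-byte charsets accept every byte value.
for charset in ("latin-1", "koi8_r"):
    decoded = all_bytes.decode(charset)
    print(charset, len(decoded))   # 256 characters each time

# UTF-8 has internal structure, so it can reject input.
latin1_bytes = "àèì".encode("latin-1")   # b'\xe0\xe8\xec'
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)                  # False
```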
O. Udeman
2018-11-12 16:04:35 UTC
Permalink
Post by António Marques
Post by Ruud Harmsen
The header says
ISO-8859-1 and all the characters in the message are meaningful in
that encoding
Once more: all the 256 bytes are meaningful in all the 8-bit encodings (bar
some irrelevant old one). That’s not something that can be generally used
to validate them. An encoding is a declaration and not, per se, subject to
validation. It’s what it is. Even in the ISO charsets, the control codes
aren’t any less valid. They’re just not printable characters, or characters
at all, if you prefer.
Now, whether the bytes translate to a meaningful human message when using a
given charset, is another matter, and it’s a matter for AI, not classical
algorithms.
As to why not trust the header, I’ve explained in the other thread.
Bla, bla, bla.
Ruud Harmsen
2018-11-13 06:15:55 UTC
Permalink
Mon, 12 Nov 2018 15:35:04 -0000 (UTC): António Marques
Post by António Marques
Now, whether the bytes translate to a meaningful human message when using a
given charset, is another matter, and it’s a matter for AI, not classical
algorithms.
Interpreting announced ISO-8859-1 as ISO-8859-1 is a classical
algorithm.

Anyway, the point is, the Russian and Hebrew was displayed by GG, not
sent by me, something of which I have been falsely accused at length.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-13 14:04:52 UTC
Permalink
Post by Ruud Harmsen
Mon, 12 Nov 2018 15:35:04 -0000 (UTC): António Marques
Post by António Marques
Now, whether the bytes translate to a meaningful human message when using a
given charset, is another matter, and it’s a matter for AI, not classical
algorithms.
Interpreting announced ISO-8859-1 as ISO-8859-1 is a classical
algorithm.
That’s not an algorithm and it won’t work reliably given the sheer amount
of content that was produced by buggy clients such as yours back in the
day.
Post by Ruud Harmsen
Anyway, the point is, the Russian and Hebrew was displayed by GG, not
sent by me, something of which I have been falsely accused at length.
Not by me. I’ve only accused you of doing the unethical thing of using
buggy, unsupported software to communicate with others.
Ruud Harmsen
2018-11-13 18:18:40 UTC
Permalink
Tue, 13 Nov 2018 14:04:52 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Mon, 12 Nov 2018 15:35:04 -0000 (UTC): António Marques
Post by António Marques
Now, whether the bytes translate to a meaningful human message when using a
given charset, is another matter, and it’s a matter for AI, not classical
algorithms.
Interpreting announced ISO-8859-1 as ISO-8859-1 is a classical
algorithm.
That’s not an algorithm and it won’t work reliably given the sheer amount
of content that was produced by buggy clients such as yours back in the
day.
Agent 1.93 isn't buggy, except that it sometimes sends Windows 1252
with headers saying ISO-8859-1. A very minor offence.

GG however is buggy, and user unfriendly, and bad at searching text
(which was GG's strong point par excellence?!?!?)

And your smartphone does smart curly quotes without you even knowing
about it.
Post by António Marques
Post by Ruud Harmsen
Anyway, the point is, the Russian and Hebrew was displayed by GG, not
sent by me, something of which I have been falsely accused at length.
Not by me. I’ve only accused you of doing the unethical thing of using
buggy, unsupported software to communicate with others.
Unjustified. Software need not be supported, standards are. Agent 1.93
is compliant.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-13 18:49:14 UTC
Permalink
Post by Ruud Harmsen
GG however is buggy, and user unfriendly, and bad at searching text
(which was GG's strong point par excellence?!?!?)
There comes a point where the canned complaints about some piece of
software say more about the complainer than about the complainee. I'm
reminded of a number of poor software developers who would blame the faults
in their code on 'IE6'. Now, IE6 had many faults, but it was very rarely
the guilty party in their case.


But both IE6 and your attitude hark back to a time before Apple swept the
scene - a time where you could say program X was this or that. For quite
some time now it has been the case that software actually changes. One day
program X has some issues, the next day a new version has come out that has
solved most of them and looks like a distant relative, but works more or
less the same in regard to what it did right. Not all can pull it - Google
is still garbage at designing software, but Microsoft on the other hand has
done an impressive job. (Not all is good: see what they did to 'text
boundaries', for instance, it's a showcase of stupidity.)
Post by Ruud Harmsen
And your smartphone does smart curly quotes without you even knowing
about it.
I still feel embarrassed about that.
Ruud Harmsen
2018-11-13 20:24:36 UTC
Permalink
Tue, 13 Nov 2018 18:49:14 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
And your smartphone does smart curly quotes without you even knowing
about it.
I still feel embarrassed about that.
Why? Such things just happen. No problem at all.
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-13 20:24:05 UTC
Permalink
Tue, 13 Nov 2018 18:49:14 -0000 (UTC): António Marques
Post by António Marques
but Microsoft on the other hand has
done an impressive job.
You mean in replacing Word 6/97 (which worked fine) with that chaotic
and incomprehensible ribbon interface? And by replacing Windows 7 with
version 8, which was a failed attempt to make a computer work like a
smartphone, forgetting that it just isn't?
Post by António Marques
(Not all is good: see what they did to 'text
boundaries', for instance, it's a showcase of stupidity.)
No idea what you mean by that.
--
Ruud Harmsen, http://rudhar.com
Ruud Harmsen
2018-11-04 07:44:10 UTC
Permalink
Sat, 3 Nov 2018 23:20:48 -0000 (UTC): António Marques
Post by António Marques
And old clients were known for saying one thing and doing another. Yours,
apparently, is one of those.
There has been a habit, or so I hear, of sending Russian _without_ any
indication of the encoding. But not of indicating Latin-1 and then
really sending Russian.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-02 19:13:23 UTC
Permalink
Post by Christian Weisgerber
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Looking at the raw article in my INN server's newsspool...
Post by Ruud Harmsen
Diaeresis: äëïöüÿ
Diaeresis: ÄËÏÖÜ<9F>
In Ruud's first message above I again see Cyrillic characters. But in
the two lines above, I see letters with dieresis. Is some sort of
control code interfering?
Post by Christian Weisgerber
There was an illegal character there, byte 0x9F, which is not a
printable character in ISO 8859-1. Also, capital Y with diaeresis
is not part of the 8859-1 set. (Lowercase y with diaeresis is.)
I see the capital Y as <9F> (lessthan nine EFF greaterthan)
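A short Python sketch of the 0x9F case, using only the standard `cp1252` and `latin-1` codecs: the byte is capital Y with diaeresis in Windows-1252 but a C1 control character in ISO 8859-1, and the capital letter cannot be encoded in ISO 8859-1 at all.

```python
import unicodedata

byte = b"\x9f"

as_cp1252 = byte.decode("cp1252")
as_latin1 = byte.decode("latin-1")

print(unicodedata.name(as_cp1252))       # LATIN CAPITAL LETTER Y WITH DIAERESIS
print(unicodedata.category(as_latin1))   # Cc (a control character)

# The capital letter has no slot in ISO 8859-1; the lowercase one does.
try:
    "Ÿ".encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)                 # False
print("ÿ".encode("latin-1"))     # b'\xff'
```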
peteolcott
2018-11-12 14:58:08 UTC
Permalink
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
O. Udeman
2018-11-12 15:30:01 UTC
Permalink
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
[In Dutch:] Get lost, you bunch of idiots.
Ruud Harmsen
2018-11-13 06:17:21 UTC
Permalink
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
The problem was not in the sender's computer, not in the reader's
computer or smartphone, but in Google Groups, as has been amply
investigated and proven.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-13 14:30:47 UTC
Permalink
Post by Ruud Harmsen
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
The problem was not in the sender's computer, not in the reader's
computer or smartphone, but in Google Groups, as has been amply
investigated and proven.
Stop with your "phantasy," as you call it. I saw several lines of
perfectly clear Cyrillic letters in the initial message, and when
the complainer copied the lines, they were the question-marks that he saw.

Incidentally, all the accented letters above, including the meaningless
string of them, are fine, so you seem to have fixed whatever was wrong
with your system.
António Marques
2018-11-13 14:42:52 UTC
Permalink
Post by Peter T. Daniels
Post by Ruud Harmsen
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
The problem was not in the sender's computer, not in the reader's
computer or smartphone, but in Google Groups, as has been amply
investigated and proven.
Stop with your "phantasy," as you call it. I saw several lines of
perfectly clear Cyrillic letters in the initial message, and when
the complainer copied the lines, they were the question-marks that he saw.
Incidentally, all the accented letters above, including the meaningless
string of them, are fine, so you seem to have fixed whatever was wrong
with your system.
What was wrong with it was insisting on using technology that was already
old when Yeltsin was president.

It's been established that GG uses heuristics to determine a message's
charset rather than trusting the declared value (unless the message
declares UTF-8, the current standard). It may or may not use a whitelist or
blacklist of sender software when doing that.

That much is not Ruud's fault, but neither can it be called a bug in GG.
It's unexpected behaviour, which prima facie is antisocial, but may be
justified on further analysis.
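For comparison, the "trust the declared value" behaviour is simple to sketch. The Python below uses the standard `email` package; the function name `decode_as_declared` and the US-ASCII default for a missing charset are assumptions made for illustration, and nothing here describes how GG is actually implemented.

```python
import email


def decode_as_declared(raw_article: bytes) -> str:
    """Decode an article body using only the charset named in its header."""
    msg = email.message_from_bytes(raw_article)
    # Hypothetical policy: fall back to US-ASCII when no charset is declared.
    charset = msg.get_content_charset() or "us-ascii"
    body = msg.get_payload(decode=True)   # raw body bytes, transfer encoding undone
    return body.decode(charset, errors="replace")


# A tiny single-part article with an 8-bit Latin-1 body.
raw = (
    b"Content-Type: text/plain; charset=ISO-8859-1\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    + "àèìòù".encode("latin-1")
    + b"\r\n"
)
print(decode_as_declared(raw).strip())
```

A heuristic reader would add a guessing step between reading the header and decoding; this sketch deliberately has none, so a mislabelled message (for instance Windows-1252 sent as ISO-8859-1) is shown exactly as labelled.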
Ruud Harmsen
2018-11-13 18:41:39 UTC
Permalink
Tue, 13 Nov 2018 14:42:52 -0000 (UTC): António Marques
Post by António Marques
Post by Peter T. Daniels
Post by Ruud Harmsen
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
The problem was not in the sender's computer, not in the reader's
computer or smartphone, but in Google Groups, as has been amply
investigated and proven.
Stop with your "phantasy," as you call it. I saw several lines of
perfectly clear Cyrillic letters in the initial message, and when
the complainer copied the lines, they were the question-marks that he saw.
Incidentally, all the accented letters above, including the meaningless
string of them, are fine, so you seem to have fixed whatever was wrong
with your system.
What was wrong with it was insisting on using technology that was already
old when Yeltsin was president.
Again:
1) Agent is and was compliant, except that it occasionally sent
Windows 1252 mislabelled as ISO-8859-1. A minor offence.

2) GG displayed its buggy behaviour also on test messages in which
Agent labelled ISO-8859-1 and also sent only that.
Post by António Marques
It's been established that GG uses heuristics to determine a message's
charset rather than trusting the declared value (unless the message
declares UTF-8, the current standard). It may or may not use a whitelist or
blacklist of sender software when doing that.
That much is not Ruud's fault, but neither can it be called a bug in GG.
In my opinion it's a bug. OK, or a failed heuristic.
Post by António Marques
It's unexpected behaviour, which prima facie is antisocial, but may be
justified on further analysis.
Who or what exactly are you calling antisocial?
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-13 19:08:38 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 14:42:52 -0000 (UTC): António Marques
Post by António Marques
It's been established that GG uses heuristics to determine a message's
charset rather than trusting the declared value (unless the message
declares UTF-8, the current standard). It may or may not use a whitelist or
blacklist of sender software when doing that.
That much is not Ruud's fault, but neither can it be called a bug in GG.
In my opinion it's a bug.
You can't call something complex that was purposely designed a bug, unless
it's so malicious that it deserves it (such as a number of Chrome's
'features'). In this case there just seems to be a desire to render as many
messages as possible correctly. They _could_ have a toggle - heuristics vs
what the message says. But the people who decided there would be no toggle
are not the same ones that toiled away to provide the heuristics. Did you
stop to think that they have feelings?
Post by Ruud Harmsen
OK, or a failed heuristic.
AI can never get 100% of things right.
Ruud Harmsen
2018-11-13 18:38:21 UTC
Permalink
Tue, 13 Nov 2018 06:30:47 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
The problem was not in the sender's computer, not in the reader's
computer or smartphone, but in Google Groups, as has been amply
investigated and proven.
Stop with your "phantasy," as you call it. I saw several lines of
perfectly clear Cyrillic letters in the initial message, and when
the complainer copied the lines, they were the question-marks that he saw.
Sigh. You're technically incompetent, there is no longer any doubt
now. I gave you all the pointers but you are unable to look at them
and judge their meaning.

That's not so bad, not everybody needs to have the programming and
debugging experience that I have. But then please refrain from posting
unfounded accusations.
Post by Peter T. Daniels
Incidentally, all the accented letters above, including the meaningless
string of them, are fine, so you seem to have fixed whatever was wrong
with your system.
As announced, I now have Agent post in UTF-8, but you missed that too
or you didn't understand what it means.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-13 20:05:06 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 06:30:47 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
Google Groups does almost full UTF-8 Unicode depending on the
fonts stored on the user's local machine. Even old smart phones
should be able to do Latin-1.
The problem was not in the sender's computer, not in the reader's
computer or smartphone, but in Google Groups, as has been amply
investigated and proven.
Stop with your "phantasy," as you call it. I saw several lines of
perfectly clear Cyrillic letters in the initial message, and when
the complainer copied the lines, they were the question-marks that he saw.
Sigh. You're technically incompetent, there is no longer any doubt
now. I gave you all the pointers but you are unable to look at them
and judge their meaning.
That's not so bad, not everybody needs to have the programming and
debugging experience that I have. But then please refrain from posting
unfounded accusations.
Post by Peter T. Daniels
Incidentally, all the accented letters above, including the meaningless
string of them, are fine, so you seem to have fixed whatever was wrong
with your system.
As announced, I now have Agent post in UTF-8, but you missed that too
or you didn't understand what it means.
Or I don't see (your) messages in the exact order (you) posted (them).
Ruud Harmsen
2018-11-13 20:50:46 UTC
Permalink
Tue, 13 Nov 2018 12:05:06 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
As announced, I now have Agent post in UTF-8, but you missed that too
or you didn't understand what it means.
Or I don't see (your) messages in the exact order (you) posted (them).
That's because GG is so messy.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-13 21:44:25 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 12:05:06 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
As announced, I now have Agent post in UTF-8, but you missed that too
or you didn't understand what it means.
Or I don't see (your) messages in the exact order (you) posted (them).
That's because GG is so messy.
?????????????????????????????????????????????????????????????????????????

I can look at any thread whenever I want, but it is my choice to look at
them in the order they're presented to me, which means the one that was
last added to longest ago is at the bottom of the "unread" list, so that's
where I start, and I go upward through the threads. If someone else posted
to a thread after you, then that thread is higher in the queue than it
would have been if they hadn't.
Ruud Harmsen
2018-11-13 22:00:58 UTC
Permalink
Tue, 13 Nov 2018 13:44:25 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
Tue, 13 Nov 2018 12:05:06 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
As announced, I now have Agent post in UTF-8, but you missed that too
or you didn't understand what it means.
Or I don't see (your) messages in the exact order (you) posted (them).
That's because GG is so messy.
?????????????????????????????????????????????????????????????????????????
I can look at any thread whenever I want,
GG doesn't show threads. Not for me, anyway. All messages are in a
long list without any structure.
Post by Peter T. Daniels
but it is my choice to look at
them in the order they're presented to me, which means the one that was
last added to longest ago is at the bottom of the "unread" list, so that's
where I start, and I go upward through the threads. If someone else posted
to a thread after you, then that thread is higher in the queue than it
would have been if they hadn't.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-14 04:13:46 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 13:44:25 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
Tue, 13 Nov 2018 12:05:06 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
As announced, I now have Agent post in UTF-8, but you missed that too
or you didn't understand what it means.
Or I don't see (your) messages in the exact order (you) posted (them).
That's because GG is so messy.
?????????????????????????????????????????????????????????????????????????
I can look at any thread whenever I want,
GG doesn't show threads. Not for me, anyway. All messages are in a
long list without any structure.
You're crazy.
Post by Ruud Harmsen
Post by Peter T. Daniels
but it is my choice to look at
them in the order they're presented to me, which means the one that was
last added to longest ago is at the bottom of the "unread" list, so that's
where I start, and I go upward through the threads. If someone else posted
to a thread after you, then that thread is higher in the queue than it
would have been if they hadn't.
Ruud Harmsen
2018-11-14 08:33:26 UTC
Permalink
Tue, 13 Nov 2018 20:13:46 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
GG doesn't show threads. Not for me, anyway. All messages are in a
long list without any structure.
You're crazy.
It's what GG offers as a default, I never attempted to customize it.

A long list of messages, most of them "loading" because it is so slow,
no indication of which of those messages contains the text I was
looking for, and if it finally does show that one, the text appears as
a quote in it, with no trace of the original message by the original
poster. A total mess. Unusable.

Years ago, they had a much better one. But it had to go. They did the
same to Google Maps.
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-14 13:06:53 UTC
Permalink
Post by Ruud Harmsen
Tue, 13 Nov 2018 20:13:46 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
GG doesn't show threads. Not for me, anyway. All messages are in a
long list without any structure.
You're crazy.
It's what GG offers as a default, I never attempted to customize it.
Then Dutchia is different from America.

No matter which of the three "views" I choose within GG -- "chronological,"
"tree," or "paged" -- "paged" is the one I use, because to display the
other two takes an easily perceptible longer time -- the messages are only
presented _within each thread_.
Post by Ruud Harmsen
A long list of messages, most of them "loading" because it is so slow,
no indication of which of those messages contains the text I was
looking for, and if it finally does show that one, the text appears as
a quote in it, with no trace of the original message by the original
poster. A total mess. Unusable.
Once again, your problem is unique. Your computer seems to be very sick indeed.
Post by Ruud Harmsen
Years ago, they had a much better one. But it had to go. They did the
same to Google Maps.
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
Ruud Harmsen
2018-11-14 15:50:11 UTC
Permalink
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
Post by Ruud Harmsen
It's what GG offers as a default, I never attempted to customize it.
Then Dutchia is different from America.
No matter which of the three "views" I choose within GG -- "chronological,"
"tree," or "paged" -- "paged" is the one I use, because to display the
other two takes an easily perceptible longer time -- the messages are only
presented _within each thread_.
I see no choice between views. All that is in the menus are irrelevant
things that I don't need and don't understand.
Post by Peter T. Daniels
Post by Ruud Harmsen
A long list of messages, most of them "loading" because it is so slow,
no indication of which of those messages contains the text I was
looking for, and if it finally does show that one, the text appears as
a quote in it, with no trace of the original message by the original
poster. A total mess. Unusable.
Once again, your problem is unique. Your computer seems to be very sick indeed.
It has been like this, on many computers.
Post by Peter T. Daniels
Post by Ruud Harmsen
Years ago, they had a much better one. But it had to go. They did the
same to Google Maps.
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.

Perhaps you have GG Professional.
--
Ruud Harmsen, http://rudhar.com
António Marques
2018-11-14 21:32:08 UTC
Permalink
Post by Ruud Harmsen
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.
Perhaps you have GG Professional.
I suppose he uses GG while logged in to google, which I assume results in a
UI with a number of other options.
Peter T. Daniels
2018-11-15 02:41:46 UTC
Permalink
Post by António Marques
Post by Ruud Harmsen
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.
Perhaps you have GG Professional.
I suppose he uses GG while logged in to google, which I assume results in a
UI with a number of other options.
I don't know what "logging in to Google" would be. I simply type "g" in the
search box at the top of the Edge screen, and it fills in groups.google.com
/forum..., which takes me to the page where I can choose one of the five
groups I'm "subscribed to."
Ruud Harmsen
2018-11-15 07:37:52 UTC
Permalink
Wed, 14 Nov 2018 18:41:46 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
I don't know what "logging in to Google" would be.
Do you have a white P in a purple circle in the upper right of the
screen, where I have an R? Then you are logged on or in. Unless it's a
G for grammatin.
Post by Peter T. Daniels
I simply type "g" in the
search box at the top of the Edge screen, and it fills in groups.google.com
/forum..., which takes me to the page where I can choose one of the five
groups I'm "subscribed to."
--
Ruud Harmsen, http://rudhar.com
Peter T. Daniels
2018-11-15 12:28:52 UTC
Permalink
Post by Ruud Harmsen
Wed, 14 Nov 2018 18:41:46 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
I don't know what "logging in to Google" would be.
Do you have a white P in a purple circle in the upper right of the
screen, where I have an R? Then you are logged on or in. Unless it's a
G for grammatin.
Maybe that's where I have the Windows silhouette, the same graphic as when
I turn on the computer because I haven't provided a photo to go there. So
no, I am not "logged in to Google."

The number of assumptions you have made in the last day or so, based on
your own archaic and idiosyncratic usages, is breathtaking.
Post by Ruud Harmsen
Post by Peter T. Daniels
I simply type "g" in the
search box at the top of the Edge screen, and it fills in groups.google.com
/forum..., which takes me to the page where I can choose one of the five
groups I'm "subscribed to."
António Marques
2018-11-15 13:50:47 UTC
Permalink
Post by Peter T. Daniels
Post by António Marques
Post by Ruud Harmsen
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.
Perhaps you have GG Professional.
I suppose he uses GG while logged in to google, which I assume results in a
UI with a number of other options.
I don't know what "logging in to Google" would be.
You don't have a google account? How then does google allow you to send
messages using GG? How does it know it's you sending them, as opposed to
whoever chose to use your name?
Either you have 'logged in' to google using your Edge at some point, or
someone did it for you.
Post by Peter T. Daniels
I simply type "g" in the
search box at the top of the Edge screen, and it fills in groups.google.com
/forum..., which takes me to the page where I can choose one of the five
groups I'm "subscribed to."
...and how does it know you're 'subscribed to' groups unless it knows it's
you? When you use someone else's computer, it won't show those groups.
Because you're not logged in there.

You may not know it, but whenever you open a page from Google, it knows
it's you who's opening it (or, at least, someone using the computer/browser
you've identified yourself in). That's how it knows what content to show
you, viz subscribed groups. That's harmless enough, but it gets much worse
- due to the widespread use of 'google analytics' and the '+1 G' button in
third party sites, Google knows most of what you access even if it's not
Google-related. Facebook is similar in that regard. And if they know, so do
all the shady entities they sell your data to.

Even if not 'logged in', they still can and do track a lot of what you do,
but in this day and age when everyone gleefully exposes their data, there
are diminishing returns in that.
Peter T. Daniels
2018-11-15 14:10:15 UTC
Permalink
Post by António Marques
Post by Peter T. Daniels
Post by António Marques
Post by Ruud Harmsen
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.
Perhaps you have GG Professional.
I suppose he uses GG while logged in to google, which I assume results in a
UI with a number of other options.
I don't know what "logging in to Google" would be.
You don't have a google account? How then does google allow you to send
messages using GG? How does it know it's you sending them, as opposed to
whoever chose to use your name?
Either you have 'logged in' to google using your Edge at some point, or
someone did it for you.
I told you how I get to Google Groups. It's simply a url. It knows who I am,
because it takes me to my introductory page listing the five newsgroups I
look at.
Post by António Marques
Post by Peter T. Daniels
I simply type "g" in the
search box at the top of the Edge screen, and it fills in groups.google.com
/forum..., which takes me to the page where I can choose one of the five
groups I'm "subscribed to."
...and how does it know you're 'subscribed to' groups unless it knows it's
you? When you use someone else's computer, it won't show those groups.
Because you're not logged in there.
I "log in" to Windows every time I (re)start the computer. I don't "log in"
to Google or its Groups.
Post by António Marques
You may not know it, but whenever you open a page from Google, it knows
it's you who's opening it (or, at least, someone using the computer/browser
you've identified yourself in). That's how it knows what content to show
you, viz subscribed groups. That's harmless enough, but it gets much worse
- due to the widespread use of 'google analytics' and the '+1 G' button in
third party sites, Google knows most of what you access even if it's not
Google-related. Facebook is similar in that regard. And if they know, so do
all the shady entities they sell your data to.
I have never clicked a "+1 G" button. I do not use gmail or Facebook or
Twitter. I briefly had Chrome when something went wrong with Internet
Explorer and a Geek at BestBuy's Geek Squad changed it (and didn't charge
me $29.95 for the visit). I shortly switched back to IE. That was, obviously,
a while ago.
Post by António Marques
Even if not 'logged in', they still can and do track a lot of what you do,
but in this day and age when everyone gleefully exposes their data, there
are diminishing returns in that.
António Marques
2018-11-15 19:25:50 UTC
Permalink
Post by Peter T. Daniels
Post by António Marques
Post by Peter T. Daniels
Post by António Marques
Post by Ruud Harmsen
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.
Perhaps you have GG Professional.
I suppose he uses GG while logged in to google, which I assume results in a
UI with a number of other options.
I don't know what "logging in to Google" would be.
You don't have a google account? How then does google allow you to send
messages using GG? How does it know it's you sending them, as opposed to
whoever chose to use your name?
Either you have 'logged in' to google using your Edge at some point, or
someone did it for you.
I told you how I get to Google Groups. It's simply a url. It knows who I am,
because it takes me to my introductory page listing the five newsgroups I
look at.
That's not _how_ or _why_ it knows who you are. It's how you know it knows.
Post by Peter T. Daniels
Post by António Marques
Post by Peter T. Daniels
I simply type "g" in the
search box at the top of the Edge screen, and it fills in groups.google.com
/forum..., which takes me to the page where I can choose one of the five
groups I'm "subscribed to."
...and how does it know you're 'subscribed to' groups unless it knows it's
you? When you use someone else's computer, it won't show those groups.
Because you're not logged in there.
I "log in" to Windows every time I (re)start the computer. I don't "log in"
to Google or its Groups.
You did, at some point, or someone did on your behalf.
Post by Peter T. Daniels
Post by António Marques
You may not know it, but whenever you open a page from Google, it knows
it's you who's opening it (or, at least, someone using the computer/browser
you've identified yourself in). That's how it knows what content to show
you, viz subscribed groups. That's harmless enough, but it gets much worse
- due to the widespread use of 'google analytics' and the '+1 G' button in
third party sites, Google knows most of what you access even if it's not
Google-related. Facebook is similar in that regard. And if they know, so do
all the shady entities they sell your data to.
I have never clicked a "+1 G" button.
You don't have to. The very fact that that button is displayed means that
the page you're viewing is integrated with google (every other news site
is). By that alone, google is notified of every page you read there.
Post by Peter T. Daniels
I do not use gmail
Even so, google only lets you 'subscribe to' groups if you're logged in.
That requires a google account, even if you don't use its mailing
capabilities.
Post by Peter T. Daniels
or Facebook or
Twitter.
That's wise.
Ruud Harmsen
2018-11-15 07:34:49 UTC
Permalink
Wed, 14 Nov 2018 21:32:08 -0000 (UTC): António Marques
Post by António Marques
Post by Ruud Harmsen
Wed, 14 Nov 2018 05:06:53 -0800 (PST): "Peter T. Daniels"
Post by Peter T. Daniels
"Years ago," GG offered only "tree view," in a panel at the left side, and
would only display 10 messages at a time, and there was no way to change
that. In "page view," the messages are chronological, 25 at a time, and
unless the thread approaches its maximum of 1000 messages, going from one
"page" to the previous or next one takes no appreciable time. (It can slow
when the number of messages passes 600 or so; after 1000, each reply to a
message in the extant 1000 starts a new thread with the same title.)
There are no different views, for me.
Perhaps you have GG Professional.
I suppose he uses GG while logged in to google,
So am I.
Post by António Marques
which I assume results in a
UI with a number of other options.
Not that I know or can find.
--
Ruud Harmsen, http://rudhar.com
peteolcott
2018-11-14 01:56:34 UTC
Permalink
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
I don't think that you really have to test this.
Browsers such as Chrome can read newsgroups in Google Groups and
display any character that the local user's font can represent.

If you really want to "test" character sets on newsgroups you should
examine the character sets of desktop and smart phone operating systems.
Ruud Harmsen
2018-11-14 08:06:10 UTC
Permalink
Post by peteolcott
Post by Ruud Harmsen
Testing accented letters, encoded in Latin-1, ISO8859-1, Windows
1252/CP1252
Accent grave: àèìòù
Accent aigu: áéíóúý
Accent circonflex: âêôîû
Diaeresis: äëïöüÿ
Tilde: ãõ
Cedilla: ç
Accent grave: ÀÈÌÒÙ
Accent aigu: ÁÉÍÓÚÝ
Accent circonflex: ÂÊÔÎÛ
Diaeresis: ÄËÏÖÜŸ
Tilde: ÃÕ
Cedilla: Ç
àèìòùáéíóúýâêôîûäëïöüÿãõç
I don't think that you really have to test this.
Browsers such as Chrome can read newsgroups in Google Groups and
display any character that the local user's font can represent.
That wasn't the object of the test.
Post by peteolcott
If you really want to "test" character sets on newsgroups you should
examine the character sets of desktop and smart phone operating systems.
No. That is not the point.
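
The test is about which charset the reader assumes when decoding the bytes, not about fonts. A minimal Python sketch of that, assuming (this is only a guess, nothing in the thread confirms it) that the Cyrillic display comes from reading the Latin-1 bytes as Windows-1251:

```python
# Lowercase test string from the original post.
text = "àèìòùáéíóúýâêôîûäëïöüÿãõç"

# In Latin-1 every one of these characters is a single byte (0xE0..0xFF).
raw = text.encode("latin-1")

# Decoding with the charset the sender used gives the text back unchanged.
as_latin1 = raw.decode("latin-1")

# Hypothetical wrong guess by the reader: the same bytes read as Windows-1251.
as_cyrillic = raw.decode("cp1251")

print(as_latin1)
print(as_cyrillic)

# Capital Y with diaeresis (U+0178) is byte 0x9F in Windows-1252,
# but it has no code point in ISO 8859-1 at all.
y_cp1252 = "Ÿ".encode("cp1252")
print(y_cp1252)
try:
    "Ÿ".encode("latin-1")
    y_in_latin1 = True
except UnicodeEncodeError:
    y_in_latin1 = False
print("Ÿ encodable in ISO 8859-1:", y_in_latin1)
```

With the wrong charset the bytes are all still valid, so nothing fails; the text simply comes out as Cyrillic letters, whatever fonts the reader has.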
--
Ruud Harmsen, http://rudhar.com