Discussion:
EBCDIC (was: Json table characters)
Paul Gilmartin
2018-08-07 20:16:04 UTC
Don't we celebrate diversity.
In cuisine, il va sans dire. In character sets, not so much. With the advent of Unicode and UTF-8, I wish those other code pages would go away. Or at least that every OS tagged character files with the code page and did the translations.
And that Classic data sets could be so tagged. Also JES spool files. There is
a CCSID keyword on DD statements, but severely restricted.

ISPF Edit has a surprising (cf. Samuel Johnson's dog) facility for recognizing
tags and dealing with many code pages, even UTF-8, *provided* their
characters are supported by the terminal's character set. I wish that TSO/ISPF
supported UTF-8 terminal emulators. If there were UTF-8 terminal emulators.

What's the history of IBM-1047? Why does it seem to be controversial?
Does it have the same set of printable glyphs as IBM-037 or IBM-500?
What need impelled it?

-- gil

----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to ***@listserv.ua.edu with the message: INFO IBM-MAIN
Jesse 1 Robinson
2018-08-07 21:56:54 UTC
Can't wait till Friday. For years I've used the analogy of a dog walking on its hind legs. Never knew the origin. When I saw this reference to Samuel Johnson, I got a hunch to look it up. Voila! Thanks!

J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-543-6132 Office ⇐=== NEW
***@sce.com

David Crayford
2018-08-08 02:20:18 UTC
Post by Paul Gilmartin
What's the history of IBM-1047? Why does it seem to be controversial?
Does it have the same set of printable glyphs as IBM-037 or IBM-500?
What need impelled it?
Good question! Do you know the answer? And don't get me started on the
line-feed/newline x'15' fiasco!

I use IBM-1047 for the projects I work on because I work in z/OS UNIX
but other teams use IBM-037 because that was the default in their
terminal emulators many years ago.
We've got C/C++ code with different square brackets depending on the
project, which is incredibly annoying when the source is in a PDS and
so can't be tagged. IDEs like RD/z (or whatever it's called now) do a
good job solving code page hell.
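
The bracket mismatch described above can be sketched in Python. This is an illustration only, not anything taken from the projects mentioned. Python's standard library ships `cp037` and `cp500` codecs but apparently no IBM-1047 codec, so the IBM-1047 bracket bytes (X'AD' and X'BD') are hard-coded here as an assumption, and `view_as` is a hypothetical helper name.

```python
# '[' and ']' as stored in an IBM-1047 source file.
# Assumption: hard-coded, because Python's stdlib has no cp1047 codec.
BRACKETS_IN_1047 = bytes([0xAD, 0xBD])

# The same two characters as encoded by Python's cp037 codec.
BRACKETS_IN_037 = "[]".encode("cp037")


def view_as(data: bytes, codec: str) -> str:
    """Return what a viewer configured for `codec` would show for `data`."""
    return data.decode(codec)
```

Calling `view_as(BRACKETS_IN_1047, "cp037")` returns two characters that are not brackets, which is roughly what an emulator set to IBM-037 would display for IBM-1047 source.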

Charles Mills
2018-08-08 02:34:21 UTC
Isn't there a pragma tag codepage?



Charles
Sent from a mobile; please excuse the brevity.
David Crayford
2018-08-08 03:13:53 UTC
Post by Charles Mills
Isn't there a pragma tag codepage?
Yes, and we use it, but it doesn't help if you're using the ISPF editor
to view/edit source code written in a different code page to your
emulator setting. We're doing a lot
of JSON work right now and do everything in IBM-1047 before converting
to UTF-8.

I personally prefer to do everything in the UNIX file system where file
tagging solves the problem but most of our older products use PDS data
sets for source.
Charles Mills
2018-08-08 03:20:34 UTC
In GSKCMS from a PDS:

#if defined(__COMPILER_VER__)
#pragma filetag("IBM-1047")
#pragma nomargins nosequence
#endif

Ditto GSKSSL and GSKTYPES.

Charles


Seymour J Metz
2018-08-08 13:36:22 UTC
Yeah, and on my PC there is a similar issue with ¬ (Logical Not); is it AA or AC? That's a major issue if you're coding PL/I or REXX.
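
A small Python sketch of that ambiguity, assuming AA and AC are the byte values of ¬ in the PC OEM code page 437 and in ISO 8859-1 / Windows-1252 respectively (the post does not say which code pages are meant). The helper names are hypothetical.

```python
NOT_SIGN = "\u00ac"  # the logical not character


def code_point(codec: str) -> int:
    """Return the single byte that encodes the not sign in `codec`."""
    return NOT_SIGN.encode(codec)[0]


def looks_like(byte: int, codec: str) -> str:
    """Return the character that `byte` displays as under `codec`."""
    return bytes([byte]).decode(codec)
```

Under these assumptions, `code_point("cp437")` is 0xAA and `code_point("latin-1")` is 0xAC, so a byte typed under one code page shows up as a different glyph under the other.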

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

David Crayford
2018-08-08 13:43:27 UTC
For REXX I have used the / operator for years now which is portable. The
logical not character has not aged well.
Charles Mills
2018-08-08 13:53:05 UTC
I got into the habit of <> for not equal in Rexx for that reason. Looking at Cowlishaw now I see that there is no strict variant of <>.

You mean \ right?

Charles


David Crayford
2018-08-08 15:53:51 UTC
Yes, of course \ thanks for the correction. Lua uses ~ which is also a
code page issue. But it also has a "not" keyword which I like. C++ also
has "not" but it's not commonly used.
Charles Mills
2018-08-08 16:55:15 UTC
Thank you -- I did not know about not. I see here https://en.cppreference.com/w/cpp/language/operator_alternative that there is a whole family of these including not_eq.

I have encountered IBM files that used the C++ trigraphs: ??< for { and so forth. What an unreadable mess! (IMHO, obviously)

Charles


Thomas David Rivers
2018-08-10 20:34:01 UTC
Post by Charles Mills
Thank you -- I did not know about not. I see here https://en.cppreference.com/w/cpp/language/operator_alternative that there is a whole family of these including not_eq.
I have encountered IBM files that used the C++ trigraphs: ??< for { and so forth. What an unreadable mess! (IMHO, obviously)
Charles
C++17 removes support for Trigraphs in the language... but, I'm sure
many compilers will continue to support them as an extension.

- Dave R. -
--
***@dignus.com Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com
Tony Harminc
2018-08-08 03:38:17 UTC
On 7 August 2018 at 16:15, Paul Gilmartin
Post by Paul Gilmartin
What's the history of IBM-1047?
It was IBM's answer to the SHARE ASCII/EBCDIC Character Set [ÆCS] Task
Force report "ASCII and EBCDIC Character Set and Code Issues in
Systems Application Architecture". Sometimes considered to be IBM's
riposte to the NIH (that's Not Invented Here - not the [US] National
Institutes of Health) "Codepage 37 version 2" proposed in that paper.
Post by Paul Gilmartin
Why does it seem to be controversial?
I don't know - is it?
Post by Paul Gilmartin
Does it have the same set of printable glyphs as IBM-037 or IBM-500?
Yes, exactly. Those CPs, and quite a few more, such as 285, 273, etc.
encode what IBM calls Character Set (CS) 697. This CS (or Character
Repertoire in ISO terminology) is often called Latin-1, though Latin-1
is also used to mean the ASCII-based encoding of CS 697, which is
IBM's CP 819.
Post by Paul Gilmartin
What need impelled it?
It's worth getting a copy of the SHARE ÆCS report to see what the
state of character encoding and standardization was like in 1989.

Tony H.

Seymour J Metz
2018-08-08 14:19:08 UTC
Samuel Johnson was a misogynist.

As for ISPF, the solution, IMHO, is for IBM to open source the WSA and for someone to port it to, e.g., Linux. Getting ISPF to handle UTF-8 in the session would be gravy.


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

Paul Gilmartin
2018-08-08 17:03:40 UTC
Post by Charles Mills
I got into the habit of <> for not equal in Rexx for that reason. Looking at Cowlishaw now I see that there is no strict variant of <>.
Indeed. Sometimes, contemptuously, I've used ( 1 - ( X == Y ) ). ( I hadn't yet
learned "\==".)
Post by Seymour J Metz
As for ISPF, the solution, IMHO, is for IBM to open source the WSA and for someone to port it to, e.g., Linux. Getting ISPF to handle UTF-8 in the session would be gravy.
I wish something similar to WSA existed as a z-based X11 client: write
once, access almost anywhere. NFS has solved file sharing requirements
nicely for me.
Post by Tony Harminc
Post by Paul Gilmartin
Does it have the same set of printable glyphs as IBM-037 or IBM-500?
Yes, exactly. Those CPs, and quite a few more, such as 285, 273, etc.
encode what IBM calls Character Set (CS) 697. This CS (or Character
Repertoire in ISO terminology) is often called Latin-1, though Latin-1
is also used to mean the ASCII-based encoding of CS 697, which is
IBM's CP 819.
Post by Paul Gilmartin
What need impelled it?
It's worth getting a copy of the SHARE ÆCS report to see what the
state of character encoding and standardization was like in 1989.
Is it available?

Just curious. It's hard for me to imagine that the problem of
a surfeit of code pages could be mitigated by adding one more.

-- gil

Tony Harminc
2018-08-09 19:14:32 UTC
On 8 August 2018 at 13:03, Paul Gilmartin
Post by Paul Gilmartin
Post by Tony Harminc
It's worth getting a copy of the SHARE ÆCS report to see what the
state of character encoding and standardization was like in 1989.
Is it available?
I thought I had seen it on Bitsavers, but though there are references
to it I don't see it there, or indeed anywhere else. Well I have an
original paper copy, so time to get scanning...

Tony H.

Paul Gilmartin
2018-08-08 17:23:31 UTC
Post by Charles Mills
Thank you -- I did not know about not. I see here https://en.cppreference.com/w/cpp/language/operator_alternative that there is a whole family of these including not_eq.
I have encountered IBM files that used the C++ trigraphs: ??< for { and so forth. What an unreadable mess! (IMHO, obviously)
I understand that some compilers have a #pragma that disables trigraphs.
Alas, "#" may not be entirely portable and may need to be coded as a
trigraph.

("???????" has been used as a TBD by some programmers.)

I know some ASCII-partisan programmers who blame the whole trigraph
mess on EBCDIC and its Babel of code pages.

Bill Waite, who has his own conventions for portable code, starts
each input file with a line consisting of the entire character set,
which is used as a translate table afterward.
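
One way such a convention could work, sketched in Python. This is a guess at the idea, not Bill Waite's actual scheme: the agreed alphabet (`CANONICAL`, the 95 printable ASCII characters), the single-byte-codec restriction, and both function names are assumptions made for the illustration.

```python
# Alphabet both sides agree on in advance: the 95 printable ASCII characters.
CANONICAL = "".join(chr(c) for c in range(0x20, 0x7F))


def add_header(text: str, codec: str) -> bytes:
    """Prefix the encoded text with the whole alphabet in the same codec."""
    return CANONICAL.encode(codec) + text.encode(codec)


def decode_with_header(data: bytes) -> str:
    """Decode using only the header, without knowing the code page.

    Assumes a single-byte code page, so the header is exactly
    len(CANONICAL) bytes. Bytes not present in the header become U+FFFD.
    """
    n = len(CANONICAL)
    header, body = data[:n], data[n:]
    table = {byte: ch for byte, ch in zip(header, CANONICAL)}
    return "".join(table.get(b, "\ufffd") for b in body)
```

With this sketch, `decode_with_header(add_header("a[1] = !x;", "cp037"))` and the same call with `"cp500"` both give back the original text, even though the two files contain different bytes.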

Some processors have very good heuristics for detecting UTF-8,
which has enough redundancy to enable the detection. There's
some ambiguity in that ASCII is a proper subset of UTF-8.
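
A minimal sketch of that kind of detection in Python, relying only on strict decoding rather than on any particular product's heuristics. The function name `classify` and its three labels are hypothetical.

```python
def classify(data: bytes) -> str:
    """Label data as 'ascii', 'utf-8', or 'other' by trial decoding."""
    try:
        data.decode("ascii")
        # Pure ASCII is also valid UTF-8, hence the ambiguity.
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"
```

The redundancy shows up in the last branch: a stray high byte from ISO 8859-1 or EBCDIC text is very unlikely to form valid multi-byte UTF-8 sequences, so such input usually lands in "other".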

Likewise, HLASM can by inspection distinguish ASCII from EBCDIC
source. But only CP 819 and CP 037, respectively.

-- gil

Charles Mills
2018-08-08 18:08:43 UTC
Yes, ??= is the trigraph for #. Talk about hash!

Trigraphs go away in C++17.

Charles


David Crayford
2018-08-09 11:18:31 UTC
Post by Charles Mills
Yes, ??= is the trigraph for #. Talk about hash!
Trigraphs go away in C++17.
Michael Wong from IBM has already stated that IBM will still support
trigraphs in C++ as an extension. C hasn't deprecated them.
But seeing as #pragma filetag() is already an extension I see no reason
why anybody would use them. I have no idea why the health checker
header files in SYS1.SIEAHDR.H(HZS*) chose to use them. It's an act of
madness!
Paul Gilmartin
2018-08-09 14:22:53 UTC
Post by David Crayford
Post by Charles Mills
Yes, ??= is the trigraph for #. Talk about hash!
Trigraphs go away in C++17.
Michael Wong from IBM has already stated that IBM will still support trigraphs in C++ as an extension. C hasn't deprecated them.
But seeing as #pragma filetag() is already an extension I see no reason why anybody would use them. I have no idea why
the health checker header files in SYS1.SIEAHDR.H(HZS*) chose to use them. It's an act of madness!
About 40 years ago, IBM considered Pascal essential:
bitsavers.org/pdf/ibm/370/pascal/SH20-6168-1_VS_PASCAL_Dec81.pdf
Pascal/VS uses square brackets, '[' and ']', in the declaration of arrays.
Because these symbols are not directly available on many I/O devices,
the symbols '(.' and '.)' may be used as an equivalent to square brackets.

I believe the original VM TCP/IP stack (a user developed product) was
written in Pascal.

And IBM devised the dreadful XLATE macro:
https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.idar100/g1059.htm
IGC0010C XLATE (translate to and from ASCII (BSAM and QSAM))

I conjecture its primary objective was to map ASCII to code
points supported on existing printers, keyboards, and displays,
but not necessarily with a faithful visual representation.
IIRC, it mapped '[' and ']' to x'4A' and X'5A'. In CP 037 and
CP 1047 these are '¢' and '!'. In CP 500 they are '[' and ']'.
Does anyone recall the chronology? I imagine bitter internal
territorial fights over the precious code points.
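
The X'4A' and X'5A' claim can be checked for two of the three code pages with Python's standard codecs. This is a sketch under one stated limitation: Python does not appear to ship an IBM-1047 codec, so only `cp037` and `cp500` are exercised. `XLATE_BRACKETS` and `render` are hypothetical names.

```python
# The two code points that, per the recollection above, XLATE used for '[' and ']'.
XLATE_BRACKETS = bytes([0x4A, 0x5A])


def render(data: bytes, codec: str) -> str:
    """Return the characters `data` represents under `codec`."""
    return data.decode(codec)
```

`render(XLATE_BRACKETS, "cp037")` gives the cent sign followed by an exclamation mark, while `render(XLATE_BRACKETS, "cp500")` gives `[]`.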

Trigraphs:
https://stackoverflow.com/questions/27601706/c1z-why-not-remove-digraphs-along-with-trigraphs
"C++1z will remove trigraphs. IBM heavily opposed this (here and here)
so there seem to be arguments for both sides of removal/non removal."
citing:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2910.pdf
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4210.pdf

The arguments they raise are:
o Continued support of existing source code relying on trigraphs.
o Support of antique peripheral devices with limited vocabulary.

C/C++ do not use "??" as an operator in code. However, trigraphs may
occur in quoted strings and must be converted there to support, for
example, '??/n' and '??/"'. Supporting them there is not an
extension but a true incompatibility.
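
To make the string-literal point concrete, here is a Python sketch of trigraph replacement as an early pass over the source text. It models only the nine standard trigraph sequences and is not any compiler's implementation; `replace_trigraphs` is a hypothetical name.

```python
import re

# The nine trigraph sequences and their single-character replacements.
TRIGRAPHS = {
    "=": "#", "(": "[", "/": "\\", ")": "]", "'": "^",
    "<": "{", "!": "|", ">": "}", "-": "~",
}
_PATTERN = re.compile(r"\?\?([=(/)'<!>-])")


def replace_trigraphs(source: str) -> str:
    """Replace every trigraph, including those inside string literals."""
    return _PATTERN.sub(lambda m: TRIGRAPHS[m.group(1)], source)
```

Because the pass ignores quoting, the literal `"what??/n"` turns into `"what\n"`, and `"huh??/"` would end with an escaped quote. A run of plain question marks is left alone.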

-- gil

Paul Gilmartin
2018-08-09 19:42:58 UTC
Post by Tony Harminc
Post by Paul Gilmartin
Post by Tony Harminc
It's worth getting a copy of the SHARE ÆCS report to see what the
state of character encoding and standardization was like in 1989.
Is it available?
I thought I had seen it on Bitsavers, but though there are references
to it I don't see it there, or indeed anywhere else. Well I have an
original paper copy, so time to get scanning...
Thanks. Will it go to Bitsavers? They do an incredible job of some sort
of 2-layer PDFs which are simultaneously images (even with fingerprints)
and searchable text. I've found:
https://en.wikipedia.org/wiki/EBCDIC_037
https://en.wikipedia.org/wiki/EBCDIC_037-2
https://en.wikipedia.org/wiki/EBCDIC_1047
Differences in a handful of code points; not enough to bring world
peace or solve climate change.

On Linux, the script below compares the output of "dd conv={ebcdic|ibm}"
to pages 037, 500, and 1047. The best match seems to be "conv=ibm"
to IBM-1047.

I don't believe that the "dd" utility per se motivated a serious requirement,
but does "dd conv=ibm" reflect otherwise prevalent practice?
(And we still have the LF-NL irritant.)

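# #################################
#! /bin/sh -x

# S holds ASCII code points 32 through 127 as one string.
S=$( awk 'BEGIN {
    for ( I=32; I<128; I++ ) printf( "%c", I ) }'; )

# Convert S to EBCDIC with each dd table, then map the result back
# to ISO8859-1 with iconv, treating the bytes as code page "$1".
# Characters that survive the round trip are where the dd table
# and that code page agree.
around() {
    echo; echo; echo EBCDIC "$1"
    printf %s "$S" | dd conv=ebcdic |
        iconv -f "$1" -t ISO8859-1

    echo; echo IBM "$1"
    printf '%s\n' "$S"          # the untranslated reference line
    printf %s "$S" | dd conv=ibm |
        iconv -f "$1" -t ISO8859-1
}

around CSIBM037
around CSIBM500
around IBM-1047
echo
exit
# #################################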

Thanks again,
gil

Peter Relson
2018-08-10 13:30:12 UTC
Post by David Crayford
I have no idea why the health checker header files in
SYS1.SIEAHDR.H(HZS*) chose to use them.
Because no one was willing to take a stand that everyone has a system like
yours that can handle the braces, at the expense of someone who did not.

Nevertheless, it seems unlikely that there will be additional use of the
trigraphs going forward.

And possibly, if specifically asked, some existing files could be changed
to remove them in the future.

Peter Relson
z/OS Core Technology Design


David Crayford
2018-08-10 14:42:39 UTC
Post by Peter Relson
I have no idea why the health checker header files in
SYS1.SIEAHDR.H(HZS*) chose to use them.
Because no one was willing to take a stand that everyone has a system like
yours that can handle the braces, at the expense of someone who did not.
You make a good point Peter and I certainly don't mean any disrespect to
the original authors of those header files.
It's just IMO trigraphs are worse than using a codepage such as 1047
which is the ubiquitous C codepage on z/OS.
Thomas David Rivers
2018-08-10 20:38:26 UTC
Post by David Crayford
Post by Peter Relson
I have no idea why the health checker header files in
SYS1.SIEAHDR.H(HZS*) chose to use them.
Because no one was willing to take a stand that everyone has a system like
yours that can handle the braces, at the expense of someone who did not.
You make a good point Peter and I certainly don't mean any disrespect
to the original authors of those header files.
It's just IMO trigraphs are worse than using a codepage such as 1047
which is the ubiquitous C codepage on z/OS.
#pragma codepage presents its own set of problems.

Consider, for instance, that you move the source into a Git repository
on a remote (and ASCII) system. And, if you have a cross-compiler, you
might consider compiling on that remote system.

Now, your source claims to be in code-page 1047 - but it certainly isn't
likely to be so, since the move to the ASCII system rendered it in ASCII.

This means your cross-compiler has to quietly ignore the #pragma codepage
and hope that your transmission to the ASCII system did "the right thing."
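
A tool on the ASCII side could at least notice the stale tag. The Python sketch below assumes the tag is written as `#pragma filetag("...")`, the form shown earlier in the thread; the set of tags treated as EBCDIC and the name `stale_filetag` are hypothetical choices for this illustration, not part of any compiler.

```python
import re

# Hypothetical list of tags this sketch treats as EBCDIC code pages.
EBCDIC_TAGS = {"IBM-037", "IBM-500", "IBM-1047"}

# A bytes pattern in ASCII: it can only match if the pragma itself
# is stored in ASCII-compatible bytes.
FILETAG = re.compile(rb'#pragma\s+filetag\("([^"]+)"\)')


def stale_filetag(source: bytes):
    """Return the declared tag if ASCII source still claims an EBCDIC page."""
    match = FILETAG.search(source)
    if match is None:
        return None
    tag = match.group(1).decode("ascii", errors="replace")
    return tag if tag.upper() in EBCDIC_TAGS else None
```

If the pragma can be read as ASCII yet names an EBCDIC code page, the file was presumably translated in transit and the tag no longer describes its bytes. Source that is still in EBCDIC does not match the pattern at all, so the function returns `None` for it.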

Trigraphs are hideously ugly; but they did produce portable source files
that could be moved around and compiled.

As I mentioned in another post; C++17 removed support for Trigraphs
from their language definition.... so, they are generally going away...

- Dave R. -
--
***@dignus.com Work: (919) 676-0847
Get your mainframe programming tools at http://www.dignus.com
Paul Gilmartin
2018-08-10 22:05:42 UTC
Post by Thomas David Rivers
#pragma codepage presents its own set of problems.
Consider, for instance, that you move the source into a Git repository
on a remote (and ASCII) system. And, if you have a cross-compiler, you
might consider compiling on that remote system.
Now, your source claims to be in code-page 1047 - but it certainly isn't likely
to be so, since the move to the ASCII system rendered it in ASCII.
This means your cross-compiler has to quietly ignore the #pragma codepage
and hope that your transmission to the ASCII system did "the right thing."
Does FTP or another transport vehicle by default set SBDATACONN to the
tagged character set?

This is a general problem with self-defining data[1]. If someone sends me
email from an ASCII system with "Content-type: text/plain; charset=ISO8859-1",
it arrives in my VM inbox obviously translated to a variant (there are several)
of EBCDIC, with the MIME header still saying (however in EBCDIC) "ISO8859-1".
Shouldn't it be converted to reflect the translation?

If I "pax" an EBCDIC file tagged with separator=NL and request conversion
to ASCII, pax correctly tags the file with the target charset, converts the
NLs to LFs, but still leaves "separator=NL". SR. WAD.

Converting self-defining data should adjust the metadata.
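
A minimal Python sketch of that principle, using a plain dictionary with a hypothetical `charset` key as a stand-in for whatever carries the tag (a MIME header, a file tag, a pragma). The function name `transcode` is also an assumption.

```python
def transcode(headers: dict, body: bytes, target: str):
    """Re-encode body and return headers that describe the new encoding."""
    text = body.decode(headers["charset"])
    # Copy the metadata and update the label in the same step as the
    # conversion, so the two cannot drift apart.
    new_headers = dict(headers, charset=target)
    return new_headers, text.encode(target)
```

The point of returning the headers together with the bytes is that a caller cannot obtain converted data without also receiving the corrected label.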
Post by Thomas David Rivers
Trigraphs are hideously ugly; but they did produce portable source files
that could be moved around and compiled.
As I mentioned in another post; C++17 removed support for Trigraphs
from their language definition.... so, they are generally going away...
I hate EBCDIC!

[1] "What's self-defining data?"
"I dunno. Why don't you ask it!?"

-- gil
