Discussion:
LLVM, volatile and async VMS I/O and system calls
Simon Clubley
2021-09-22 12:28:08 UTC
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.

VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).

With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?

Just curious if there are any places in code running on VMS x86-64 that will
need to be cleaned up to do things in the correct way, where you would have
got away with doing them less correctly previously.
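
To make the concern concrete, here is a minimal portable C sketch of the
pattern in question, using a signal handler as a stand-in for a VMS AST.
The names (io_done, completion_ast, wait_for_io) are invented for
illustration and are not any VMS API:

```c
#include <signal.h>

/* Stand-in for a VMS IOSB completion word.  On VMS the kernel would
 * write it when a $QIO completes; here a signal handler plays the
 * role of the AST.  Without "volatile", an optimiser is entitled to
 * hoist the load out of the polling loop and spin forever. */
static volatile sig_atomic_t io_done = 0;

static void completion_ast(int sig)   /* hypothetical AST routine */
{
    (void)sig;
    io_done = 1;                      /* the async writer sets status */
}

int wait_for_io(void)
{
    signal(SIGUSR1, completion_ast);
    raise(SIGUSR1);                   /* simulate async completion */
    while (!io_done)                  /* must re-read io_done each pass */
        ;
    return (int)io_done;
}
```

This is exactly the class of loop where a missing volatile is harmless at
-O0 but can break once the optimiser starts caching loads in registers.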

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2021-09-22 13:06:26 UTC
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
To state the obvious.

Correct C, meaning C code whose behavior is defined by the C standard,
will work with any standards-compliant C compiler.

C code with implementation-specific or undefined behavior is rolling
the dice.

Maybe John Reagan has some ideas about what may break, but I cannot see
VSI systematically documenting how the Itanium to x86-64 migration will
impact C code with implementation-specific or undefined behavior.

Arne
John Reagan
2021-09-22 13:12:12 UTC
Post by Arne Vajhøj
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
To state the obvious.
Correct C aka C code with defined behavior by C standard will work
on any standard compliant C compiler.
C code with implementation specific or undefined behavior is
throwing a dice.
Maybe John Reagan has some ideas about what may break, but I cannot see
VSI systematic document how Itanium to x86-64 migration will impact
C code with implementation specific or undefined behavior.
Arne
Simon, you asked this exact same question before. You can search back for the guesses.

Moving architectures can always expose bugs regardless of platform, OS, or compiler.

GEM has different optimizations on Alpha vs Itanium. Did you have to add any volatiles in that transition?

Linux people, how many of you have code that works with -O0 and -O1 but not with -O3 or -Ofast?

The current cross-compilers are no-optimize so there is no real world experience for missing volatiles.
abrsvc
2021-09-22 14:00:52 UTC
Post by Arne Vajhøj
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
To state the obvious.
Correct C aka C code with defined behavior by C standard will work
on any standard compliant C compiler.
C code with implementation specific or undefined behavior is
throwing a dice.
Maybe John Reagan has some ideas about what may break, but I cannot see
VSI systematic document how Itanium to x86-64 migration will impact
C code with implementation specific or undefined behavior.
Arne
Simon, you asked this exact same question before. You can search back for the guesses.
Moving architectures can always expose bugs regardless of platform, OS, or compiler.
GEM has different optimizations on Alpha vs Itanium. Did you have to add any volatiles in that transition?
Linux people, how many people have code that works with -O0 and -O1 but not with -O3 or-Ofast?
The current cross-compilers are no-optimize so there is no real world experience for missing volatiles.
I would also ask why seemingly every question has a negative bent toward OpenVMS.
Why is what LLVM does "correct" while what OpenVMS does is potentially buggy?
Isn't it possible that a correct sequence on OpenVMS can reveal a bug in LLVM, where an implementation is not correct in all cases?
Bob Gezelter
2021-09-22 14:27:56 UTC
Post by abrsvc
Post by Arne Vajhøj
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
To state the obvious.
Correct C aka C code with defined behavior by C standard will work
on any standard compliant C compiler.
C code with implementation specific or undefined behavior is
throwing a dice.
Maybe John Reagan has some ideas about what may break, but I cannot see
VSI systematic document how Itanium to x86-64 migration will impact
C code with implementation specific or undefined behavior.
Arne
Simon, you asked this exact same question before. You can search back for the guesses.
Moving architectures can always expose bugs regardless of platform, OS, or compiler.
GEM has different optimizations on Alpha vs Itanium. Did you have to add any volatiles in that transition?
Linux people, how many people have code that works with -O0 and -O1 but not with -O3 or-Ofast?
The current cross-compilers are no-optimize so there is no real world experience for missing volatiles.
I would also ask why does seemingly every question have a negative bent toward OpenVMS?
Why is what LLVM does "correct' where potentially what OpenVMS does buggy?
Isn't it possible that a correct sequence on OpenVMS can reveal a bug in LLVM where an implementation is not correct in all cases?
abrsvc,

Absolutely. Many an optimizer has incorrectly optimized that which should not be optimized.

- Bob Gezelter, http://www.rlgsc.com
Dave Froble
2021-09-22 15:58:42 UTC
Post by abrsvc
I would also ask why does seemingly every question have a negative bent toward OpenVMS?
Why is what LLVM does "correct' where potentially what OpenVMS does buggy?
Isn't it possible that a correct sequence on OpenVMS can reveal a bug in LLVM where an implementation is not correct in all cases?
Now Dan, are you trying to ruin Simon's fun?
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-09-22 17:54:40 UTC
Post by abrsvc
I would also ask why does seemingly every question have a negative bent toward OpenVMS?
It doesn't. I've posted very positive comments about VMS clusters
and the cluster-wide DLM (for example) in the past. Even today, they
are still very strong features in VMS (ignoring the price tag. :-)).

However, experience in other operating systems causes me to see
missing features in VMS, at least some of which people might expect as
standard these days.

At a policy level, some negative statements I've made about the move
to time-limited production licences appear to be widely supported.
Post by abrsvc
Why is what LLVM does "correct' where potentially what OpenVMS does buggy?
Not buggy, just not as aggressive. LLVM is a much more aggressive optimiser
from what I can tell.

To give you an idea of what LLVM gets up to, this is the current list
of LLVM passes:

https://llvm.org/docs/Passes.html

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dave Froble
2021-09-22 15:56:51 UTC
Post by John Reagan
Simon, you asked this exact same question before. You can search back for the guesses.
Are you suggesting Simon is getting a bit senile, or is just stubborn?
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-09-22 17:41:21 UTC
Post by John Reagan
Simon, you asked this exact same question before. You can search back for the guesses.
I know. I was wondering if any data has come along since then.
Post by John Reagan
The current cross-compilers are no-optimize so there is no real world experience for missing volatiles.
I had forgotten that was still the case. It will be interesting to see
the performance improvements when you turn the optimiser on. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Bob Gezelter
2021-09-22 14:26:01 UTC
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
Simon
--
Walking destinations on a map are further away than they appear.
Simon,

Since the days of RSX-11M, I have been dealing with client bugs in this area. The best phrasing I have seen was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."

In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."

Hoff and I participated in a thread a ways back on a related topic. Out-of-order stores on Alpha require an explicit memory barrier to ensure that the IOSB, buffers, and other data are consistent when an AST is queued.

One violates that fundamental understanding at one's peril. (Yes, I have had clients try peeking at in-progress buffers, often with catastrophic results). There are absolutely no guarantees about the contents of a buffer while an I/O operation is queued or in process for the buffer.
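
A portable sketch of that rule, with a worker thread standing in for the
VMS driver (all names here are invented; this is not a VMS API): the
buffer is hands-off from the moment the request is queued until
completion is signalled.

```c
#include <pthread.h>
#include <string.h>

struct fake_io {
    char            buf[64];   /* the queued buffer: undefined until done */
    int             done;      /* stand-in for the IOSB completion code */
    pthread_mutex_t lock;
    pthread_cond_t  cond;
};

static void *io_worker(void *arg)
{
    struct fake_io *io = arg;
    strcpy(io->buf, "payload");          /* "device" writes the buffer... */
    pthread_mutex_lock(&io->lock);
    io->done = 1;                        /* ...then signals completion */
    pthread_cond_signal(&io->cond);
    pthread_mutex_unlock(&io->lock);
    return NULL;
}

/* Queue the "I/O", then touch buf only after done is set. */
int fake_qiow(struct fake_io *io)
{
    pthread_t t;
    io->done = 0;
    pthread_mutex_init(&io->lock, NULL);
    pthread_cond_init(&io->cond, NULL);
    pthread_create(&t, NULL, io_worker, io);
    pthread_mutex_lock(&io->lock);
    while (!io->done)                    /* buffer contents undefined here */
        pthread_cond_wait(&io->cond, &io->lock);
    pthread_mutex_unlock(&io->lock);
    pthread_join(t, NULL);
    return strcmp(io->buf, "payload") == 0;
}
```

Peeking at io->buf before the wait completes is the portable equivalent
of peeking at an in-progress $QIO buffer.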

- Bob Gezelter, http://www.rlgsc.com
Dave Froble
2021-09-22 16:03:45 UTC
Post by John Reagan
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
Simon
--
Walking destinations on a map are further away than they appear.
Simon,
Since the days of RSX-11M, I have been dealing with client bugs in this area.. The best phrasing I have seen in this area was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."
In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."
Hoff and I participated in a thread a ways back on a related topic. Out of order storing on Alpha requires an explicit flush of the pipeline to ensure that the IOSB, buffers, and other data is consistent when an AST is queued.
One violates that fundamental understanding at one's peril. (Yes, I have had clients try peeking at in-progress buffers, often with catastrophic results). There are absolutely no guarantees about the contents of a buffer while an I/O operation is queued or in process for the buffer.
- Bob Gezelter, http://www.rlgsc.com
Gotta agree with that. Once some action is started, the buffer ain't
yours until the action is completed.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-09-22 18:11:39 UTC
Post by John Reagan
Simon,
Since the days of RSX-11M, I have been dealing with client bugs in this area.. The best phrasing I have seen in this area was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."
In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."
That isn't the concern, Bob.

The concern is, given the highly asynchronous nature of VMS I/O and
of some VMS system calls in general, and given the more aggressive
LLVM optimiser, does the generated code always correctly re-read the
current contents of buffers and variables without having to mark those
buffers/variables as volatile ?

Or are there enough sequence points in VMS application code where these
buffers and variables are accessed that this may turn out not to be a
problem in most cases ?

In essence, the VMS system call and I/O system is behaving much more
like the kinds of things you see in embedded bare-metal programming
than in the normal synchronous model you see in the Unix world.

There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-22 19:09:52 UTC
Post by Simon Clubley
Post by John Reagan
Simon,
Since the days of RSX-11M, I have been dealing with client bugs in this area.. The best phrasing I have seen in this area was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."
In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."
That isn't the concern Bob.
The concern is, given the highly asynchronous nature of VMS I/O and
of some VMS system calls in general, and given the more aggressive
LLVM optimiser, does the generated code always correctly re-read the
current contents of buffers and variables without having to mark those
buffers/variables as volatile ?
Or are there enough sequence points in VMS application code where these
buffers and variables are accessed that this may turn out not to be a
problem in most cases ?
In essence, the VMS system call and I/O system is behaving much more
like the kinds of things you see in embedded bare-metal programming
than in the normal synchronous model you see in the Unix world.
There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)
Simon.
That sounds like bad code design to me and more an issue of critical
sections. For example, it's quite common to have an upper and lower I/O
half, with queues betwixt the two. The upper half is mainline code that
has access to and can update pointers, while the lower half at interrupt
level also has access to the queue and its pointers. At a trivial level,
interrupts are disabled during mainline access and, if the interrupt
handler always runs to completion, that provides the critical section
locks.
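
That upper/lower-half shape can be sketched as a minimal single-producer
ring queue (illustrative only; a real driver would pair this with
interrupt masking or atomics on the indices):

```c
#include <stddef.h>

/* The "lower half" (interrupt/AST) enqueues; the "upper half"
 * (mainline) dequeues.  One slot is sacrificed to tell full from
 * empty without extra state. */
#define QSIZE 8
struct ring { unsigned char data[QSIZE]; size_t head, tail; };

int ring_put(struct ring *q, unsigned char c)   /* lower half */
{
    size_t next = (q->head + 1) % QSIZE;
    if (next == q->tail) return 0;              /* full: overrun */
    q->data[q->head] = c;
    q->head = next;
    return 1;
}

int ring_get(struct ring *q, unsigned char *c)  /* upper half */
{
    if (q->tail == q->head) return 0;           /* empty */
    *c = q->data[q->tail];
    q->tail = (q->tail + 1) % QSIZE;
    return 1;
}
```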

What you seem to be suggesting is a race condition, where the state of
one section of code is unknown to the other, a sequence of parallel
states that somehow get out of sync, due to poor code design, sequence
points, whatever.


I'm sure the designers of VMS would be well aware of such issues,
steeped in computer science as they were, and it is an area which is
fundamental to most system design...

Chris
Simon Clubley
2021-09-22 20:16:19 UTC
Post by chris
That sounds like bad code design to me and more an issue of critical
sections. For example, it's quite common to have an upper and lower io
half, with queues betwixt the two. Upper half being mainline code that
has access to and can update pointers, while low half at interrupt
level also has access to the queue and it's pointers. At trivial level,
interrupts are disabled during mainline access and if the interrupt
handler always runs to completion, that provides the critical section
locks.
It's nothing like that, Chris.

At the level of talking to the kernel, all I/O on VMS is asynchronous
and it is actually a nice design. There is no such thing as synchronous
I/O at system call level on VMS.

When you queue an I/O in VMS, you can pass either an event flag number or
an AST completion routine to the sys$qio() call which then queues the
I/O for processing and then immediately returns to the application.

To put that another way, the sys$qio() I/O call is purely asynchronous.
Any decision to wait for the I/O to complete is made in the application
(for example, via the sys$qiow() call) and not in the kernel.

You can stall by making a second system call to wait until the event
flag is set, or you can use sys$qiow() which is a helper routine to
do that for you, but you are not forced to and that is the critical
point.

You can queue the I/O and then just carry on doing something else
in your application while the I/O completes and then you are notified
in one of several ways.

That means the kernel can write _directly_ into your process space by
setting status variables and writing directly into your queued buffer
while the application is busy doing something else completely different.

You do not have to stall in a system call to actually receive the
buffer from the kernel - VMS writes it directly into your address space.

It is _exactly_ the same as embedded bare-metal programming where the
hardware can write directly into memory-mapped registers and buffers
in your program while you are busy doing something else.
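
On the Unix side, POSIX aio is probably the closest analogue to this
model: queue the read, keep computing, and the kernel writes straight
into your buffer, with the aiocb playing roughly the role of the IOSB.
A minimal sketch (the helper name qio_style_read is invented):

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int qio_style_read(const char *path, char *buf, size_t len)
{
    struct aiocb cb;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;                 /* kernel writes here directly */
    cb.aio_nbytes = len;
    if (aio_read(&cb) != 0) { close(fd); return -1; }

    /* ...the application is free to do other work here... */

    while (aio_error(&cb) == EINPROGRESS)   /* poll the "IOSB" */
        ;
    int n = (int)aio_return(&cb);
    close(fd);
    return n;                                /* bytes delivered */
}
```

As with $QIO, touching buf between aio_read() and the completion check
reads undefined contents.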
Post by chris
What you seem to be suggesting is a race condition, where the state of
one section of code is unknown to the other, a sequence of parallel
states that somehow get out of sync, due to poor code design, sequence
points, whatever.
It is actually a very clean mechanism and there are no such things
as race conditions when using it properly.
Post by chris
I'm sure the designers of vms wpuld be well aware of such issues,
steeped in computer science as they were, and an area which is
fundamental to most system design...
They are, which is why the DEC-controlled compilers emitted code
that worked just fine with VMS without the application having to
use volatile.

However, LLVM is now the compiler toolkit in use and it could
potentially make quite valid (and different) assumptions about
if it needs to re-read a variable that it doesn't know has changed.

After all, if the application takes full advantage of this
asynchronous I/O model, there has been no direct call by the code
to actually receive the buffer and I/O completion status variables
when VMS decides to update them after the I/O has completed.

I am hoping however that there are enough sequence points in the
code, even in the VMS asynchronous I/O model for this not to be
a problem in practice although it is a potential problem.

Now do you see the potential problem ?

BTW, this also applies to some system calls in general as a number of
them are asynchronous as well - it's not just the I/O in VMS which
is asynchronous.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-23 14:10:48 UTC
Post by Simon Clubley
Post by chris
That sounds like bad code design to me and more an issue of critical
sections. For example, it's quite common to have an upper and lower io
half, with queues betwixt the two. Upper half being mainline code that
has access to and can update pointers, while low half at interrupt
level also has access to the queue and it's pointers. At trivial level,
interrupts are disabled during mainline access and if the interrupt
handler always runs to completion, that provides the critical section
locks.
It's nothing like that Chris.
At the level of talking to the kernel, all I/O on VMS is asynchronous
and it is actually a nice design. There is no such thing as synchronous
I/O at system call level on VMS.
When you queue an I/O in VMS, you can pass either an event flag number or
an AST completion routine to the sys$qio() call which then queues the
I/O for processing and then immediately returns to the application.
To put that another way, the sys$qio() I/O call is purely asynchronous.
Any decisions to wait for for I/O to complete are made in the application,
(for example via the sys$qiow() call) and not in the kernel.
You can stall by making a second system call to wait until the event
flag is set, or you can use sys$qiow() which is a helper routine to
do that for you, but you are not forced to and that is the critical
point.
You can queue the I/O and then just carry on doing something else
in your application while the I/O completes and then you are notified
in one of several ways.
That means the kernel can write _directly_ into your process space by
setting status variables and writing directly into your queued buffer
while the application is busy doing something else completely different.
You do not have to stall in a system call to actually receive the
buffer from the kernel - VMS writes it directly into your address space.
It is _exactly_ the same as embedded bare-metal programming where the
hardware can write directly into memory-mapped registers and buffers
in your program while you are busy doing something else.
Post by chris
What you seem to be suggesting is a race condition, where the state of
one section of code is unknown to the other, a sequence of parallel
states that somehow get out of sync, due to poor code design, sequence
points, whatever.
It is actually a very clean mechanism and there are no such things
as race conditions when using it properly.
So what is the issue here? Keywords like volatile would not normally
ever be used at app level, being reserved for low-level kernel and
driver code where it touches real hardware registers, or perhaps
memory locations reserved for a specific purpose. The 4th edition of
the Harbison & Steele C reference manual has 2 or 3 pages devoted to the
volatile keyword and might be worth looking at; Section 4.4.5. The
whole point of a black box kernel is to isolate the internal workings
from an application. Thinking layers, of course.

If systems and code are designed and written properly, then they should
compile to usable code irrespective of compiler, including
optimisation level, so long as the compiler is standards compliant.
Anything else is a system design issue...

Chris
Post by Simon Clubley
Post by chris
I'm sure the designers of vms wpuld be well aware of such issues,
steeped in computer science as they were, and an area which is
fundamental to most system design...
They are, which is why the DEC-controlled compilers emitted code
that worked just fine with VMS without the application having to
use volatile.
However, LLVM is now the compiler toolkit in use and it could
potentially make quite valid (and different) assumptions about
if it needs to re-read a variable that it doesn't know has changed.
After all, if the application takes full advantage of this
asynchronous I/O model, there has been no direct call by the code
to actually receive the buffer and I/O completion status variables
when VMS decides to update them after the I/O has completed.
I am hoping however that there are enough sequence points in the
code, even in the VMS asynchronous I/O model for this not to be
a problem in practice although it is a potential problem.
Now do you see the potential problem ?
BTW, this also applies to some system calls in general as a number of
them are asynchronous as well - it's not just the I/O in VMS which
is asynchronous.
Simon.
Dave Froble
2021-09-23 15:25:29 UTC
Post by chris
So what is the issue here ?. Keywords like volatile would not normally
ever be used at app level, being reserved for low level kernel and
driver code where it touches real hardware registers. or perhaps
memory locations reserved for a specific purpose. The 4th edition of
Harbison & Steele C reference manual has 2 or 3 pages devoted to the
volatile keyword and might be worth looking at; Section 4.4.5. The
whole point of a black box kernel is to isolate the internal workings
from an application. Thinking layers, of course.
If systems and code are designed and written properly, then it should
compile to usable code irrespective of compiler, including
optimisation level, so long as the compiler is standards compliant.
Anything else is s system design issue...
If your code was executed as you intended, then there should not be any
issue, as you mention.

But what if your code is NOT executed as you intended? An optimizer
just might figure that it doesn't need to execute some instructions,
that they are redundant. They do that. However, if they are not
redundant, then results may not be as expected.

So, one might consider "volatile" (or whatever else is used) as an edict
to "don't optimize".
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
chris
2021-09-23 16:22:10 UTC
Post by Dave Froble
Post by chris
So what is the issue here ?. Keywords like volatile would not normally
ever be used at app level, being reserved for low level kernel and
driver code where it touches real hardware registers. or perhaps
memory locations reserved for a specific purpose. The 4th edition of
Harbison & Steele C reference manual has 2 or 3 pages devoted to the
volatile keyword and might be worth looking at; Section 4.4.5. The
whole point of a black box kernel is to isolate the internal workings
from an application. Thinking layers, of course.
If systems and code are designed and written properly, then it should
compile to usable code irrespective of compiler, including
optimisation level, so long as the compiler is standards compliant.
Anything else is s system design issue...
If your code was executed as you intended, then there should not be any
issue, as you mention.
But what if your code is NOT executed as you intended. An optimizer just
might figure that it doesn't need to execute some instructions, that
they are redundant. They do that. However, if they are not redundant,
then results may not be as expected.
So, one might consider "volatile" (or whatever else is used) as an edict
to "don't optimize".
Not really, as no code in a high level language should be written so
as to depend on the sequence of instructions generated. If you want
that defined more tightly, then you should be using assembler, or
have intimate knowledge of compiler translations and output.

One of the functions of a high level language is to provide an
abstraction layer between applications and the underlying machine.
Of course, that doesn't apply to systems or kernel programming but
such work requires a much deeper understanding of comp sci algorithmics
than simple application programming...

Chris
Simon Clubley
2021-09-23 17:58:08 UTC
Post by chris
Not really, as no code in a high level language should be written so
as to depend on the sequence of instructions generated. If you want
that defined more tightly, then you should be using assembler, or
have intimate knowledge of compiler translations and output.
One of the functions of a high level language is to provide an
abstraction layer between applications and the underlying machine.
Of course, that doesn't apply to systems or kernel programming but
such work requires a much deeper understanding of comp sci algorithmics
than simple application programming...
Chris
Chris,

Are you familiar with true asynchronous I/O in normal applications
where the operating system can write directly into your address space
without the program having to wait around in a system call to actually
receive that data and while the program is actually busy doing something
else at the same time?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Lawrence D’Oliveiro
2021-09-24 01:29:32 UTC
Permalink
Post by Simon Clubley
Are you familiar with true asynchronous I/O in normal applications
where the operating system can write directly into your address space
without the program having to wait around in a system call to actually
receive that data and while the program is actually busy doing something
else at the same time?
We normally do that with threading or shared memory these days.
Stephen Hoffman
2021-09-24 03:22:58 UTC
Permalink
Post by Lawrence D’Oliveiro
Post by Simon Clubley
Are you familiar with true asynchronous I/O in normal applications
where the operating system can write directly into your address space
without the program having to wait around in a system call to actually
receive that data and while the program is actually busy doing
something else at the same time?
We normally do that with threading or shared memory these days.
At the level being discussed ($io_perform, $qio) all I/O is async by
default. The app queues the I/O request, and goes off to do...
whatever.

OpenVMS apps can use the sync call format—which is a sync call wrapping
around the underlying async call—and can then use ASTs or KP threads
for multi-threading, or multiple processes.

The I/O eventually completes, or eventually fails, loads the I/O status
block, and then triggers an event flag (often EFN$C_ENF "do not care")
and the AST async notification. The IOSB is always set before the EF or
the AST.

ASTs are in some ways a predecessor to a closure, and lack compiler
support and syntactic sugar such as the block syntax found in clang, or
the lambda syntax found in C++.

The DEC-traditional languages on OpenVMS sometimes have threading support.

(What I'm referring to as traditional: BASIC, Fortran, FORTRAN, Pascal,
COBOL, BLISS, Macro32, etc. I'm here ignoring Java, Python, Lua, and
whatnot, fine languages that those are.)

Older C (VAX C) had built-in parallelism support (which had some
issues), and newer C has pthreads POSIX threading support.

On OpenVMS, pthreads are built on KP Threads.

Language-based async/await is not something that was common in years
past, and the traditional OpenVMS compilers don't have support for
that, nor for newer standards where those syntax features have been
added.

Unix started out on a different path for I/O with largely sync calls
for I/O, and developed async support later (epoll, kqueue, select, aio,
pthreads, GCD, etc) to wrap around that.

select is a mess on OpenVMS, so we won't discuss that. aio and
GCD/libdispatch don't exist on OpenVMS. etc.

Here, the usual OpenVMS app pattern would be an AST-based app, or maybe
a threading app using pthreads or KP threads. The description I'd
posted earlier in the COBOL thread is also somewhat GCD-ish, given its
use of queues.

With ASTs, the app is either active in the mainline, or exactly one AST
is active. Threads are somewhat more complex, and threads can and do
operate entirely in parallel across multiple processors.

Both ASTs and threads require careful consideration of shared storage,
which ties back to Simon's threads on compiler code optimization, as
well as knowing the processor memory model.

Alpha in particular has a very aggressive memory model, as compared to
pretty much any other architecture available. And the COBOL thread
involves Alpha.

It'd be possible to do all this in memory with a section and queues
too, but that then means adding notifications (signals, DLM lock
doorbells, $sigprc, etc) and eventually security and pretty soon most
of the overhead of mailboxes or sockets.

Rolling your own communications interface is absolutely possible and
was once fairly common. I've built and worked with more than a few
communications APIs commonly using sections. Yes, pun fully intended.

For most cases with newer app development work or overhauls on OpenVMS,
I'd tend to use sockets and not mailboxes (from over in the COBOL
thread), but that's local preference. Sockets can let me move
constituent apps further apart, should the app or server load
increase. As has happened with apps I've worked on, the alternative
tends to be mailboxes and sockets, which is more code and more
complexity. And some have included section-based and driver-based
comms. All that means more code, and more "fun" routing and logging and
troubleshooting.

Creating an app that's basically one big ball of self-requeuing ASTs
with a main that hibernates and wakes works pretty well for
low-to-moderate-scale OpenVMS apps, too.
--
Pure Personal Opinion | HoffmanLabs LLC
Lawrence D’Oliveiro
2021-09-25 06:01:22 UTC
Permalink
Post by Stephen Hoffman
At the level being discussed ($io_perform, $qio) all I/O is async by
default. The app queues the I/O request, and goes off to do...
whatever.
Coming from VMS, where I/O and process scheduling were inherently decoupled, I find the Unix way a step backwards in some ways. Linux has its “aio” framework, but that seems to be specifically for block devices, for use it seems by some DBMS implementors who don’t like to work through conventional filesystems.
Post by Stephen Hoffman
Language-based async/await is not something that was common in years
past ...
It’s just a revival of the old coroutine concept from decades past. Kind of. There is this terminology of “stackful” versus “stackless” coroutines, where the original kind was “stackful”. Async/await are described as “stackless” because they don’t need to switch entire stacks between tasks, since preemption can only occur at limited points. Perhaps more accurately described as “stack-light”, but there you go.
Post by Stephen Hoffman
select is a mess on OpenVMS, so we won't discuss that.
No “poll” or “epoll” ... ? “select” is considered a bit old-fashioned these days...
Post by Stephen Hoffman
Creating an app that's basically one big ball of self-requeuing ASTs
with a main that hibernates and wakes works pretty well for
low-to-moderate-scale OpenVMS apps, too.
I did that once, back in my MSc days. I also wrote my own threading package on top of ASTs, and tried reimplementing the app on top of that. Performance dropped by half.
Simon Clubley
2021-09-24 12:09:40 UTC
Permalink
Still doesn't explain why a volatile keyword might be needed at
application level, though I guess there might be a few edge cases...
I'm surprised you are having a hard time seeing it Chris.

Hardware stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may be required
for some programs to tell the compiler to generate code to re-read
it again.

VMS stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may now be
required for some programs to tell the compiler to generate code
to re-read it again.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-24 14:18:48 UTC
Permalink
Post by Simon Clubley
Still doesn't explain why a volatile keyword might be needed at
application level, though I guess there might be a few edge cases...
I'm surprised you are having a hard time seeing it Chris.
Hardware stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may be required
for some programs to tell the compiler to generate code to re-read
it again.
Sorry, but that's incorrect. You are confusing compile time actions
with runtime situations. Present C compilers can have no
knowledge of future dynamic runtime situations where, for example, a
shared buffer may be updated asynchronously by separate processes
and at different times. However, most operating systems have features
to manage such situations to ensure things like mutual exclusion and
deadlock prevention. OS books are full of algorithms for that sort
of thing, as it's so fundamental to OS design.
Post by Simon Clubley
VMS stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may now be
required for some programs to tell the compiler to generate code
to re-read it again.
Simon.
Sorry, but wrong again and it does nothing of the sort. All the
volatile keyword does is to tell the compiler to disable
optimisation across sequence points, i.e. not unroll loops, not
delete whole sections of apparently redundant code etc, but to
generate code as per the source defines. No added code is or can
be generated to take account of future run time situations.

Check out the Harbison & Steele book for 2 or 3 pages on the
volatile keyword...

Chris
Simon Clubley
2021-09-24 18:15:42 UTC
Permalink
On 2021-09-24, chris <chris-***@tridac.net> wrote:

Chris, you are even more stubborn than Arne. :-) (Sorry Arne :-))
Post by chris
Post by Simon Clubley
Still doesn't explain why a volatiie keyword might be needed at
application level, though I guess there might be a few edge cases...
I'm surprised you are having a hard time seeing it Chris.
Hardware stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may be required
for some programs to tell the compiler to generate code to re-read
it again.
Sorry, but that's incorrect. You are confusing compile time actions
with runtime situations. Present C compilers can have no
knowledge of future dynamic runtime situations where, for example, a
shared buffer may be updated asynchronously by separate processes
and at different times. However, most operating systems have features
to manage such situations to ensure things like mutual exclusion and
deadlock prevention. OS books are full of algorithms for that sort
of thing, as it's so fundamental to OS design.
No I am not. All I have said all along is that volatile inserts code
into the generated code to _always_ re-read the variable before doing
anything with it.

Some programs may not need that but only if they have not touched the
variable since the program started running (so the initial read is what
would be done anyway).

I have _never_ said that compilers have any knowledge of dynamic runtime
situations. Volatile guarantees an unconditional read before working
with data and that is all it does and that's how unknown situations are
handled.

BTW, what does the hardware (or the operating system) dropping data you
requested into your process memory space while you are busy doing something
else have to do with mutual exclusion or deadlocks ?

If you have not done such things, you might want to try writing programs
to use sys$qio() in full async mode or try the Linux AIO stuff and you
may then see what my potential concerns are with the upcoming compiler
changes.
Post by chris
Post by Simon Clubley
VMS stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may now be
required for some programs to tell the compiler to generate code
to re-read it again.
Simon.
Sorry, but wrong again and it does nothing of the sort. All the
volatile keyword does is to tell the compiler to disable
optimisation across sequence points, i.e. not unroll loops, not
delete whole sections of apparently redundant code etc, but to
generate code as per the source defines. No added code is or can
be generated to take account of future run time situations.
Telling the compiler to generate code to re-read it again is _exactly_
what volatile does.

And telling the compiler to add code to force a re-read of a variable
_is_ the way you take account of unknown future run time situations.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dave Froble
2021-09-24 21:12:46 UTC
Permalink
Post by Simon Clubley
Chris, you are even more stubborn than Arne. :-) (Sorry Arne :-))
Maybe even as stubborn as Simon ???
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
chris
2021-09-24 21:59:15 UTC
Permalink
Post by Simon Clubley
Chris, you are even more stubborn than Arne. :-) (Sorry Arne :-))
Post by chris
Post by Simon Clubley
Still doesn't explain why a volatile keyword might be needed at
application level, though I guess there might be a few edge cases...
I'm surprised you are having a hard time seeing it Chris.
Hardware stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may be required
for some programs to tell the compiler to generate code to re-read
it again.
Sorry, but that's incorrect. You are confusing compile time actions
with runtime situations. Present C compilers can have no
knowledge of future dynamic runtime situations where, for example, a
shared buffer may be updated asynchronously by separate processes
and at different times. However, most operating systems have features
to manage such situations to ensure things like mutual exclusion and
deadlock prevention. OS books are full of algorithms for that sort
of thing, as it's so fundamental to OS design.
No I am not. All I have said all along is that volatile inserts code
into the generated code to _always_ re-read the variable before doing
anything with it.
Sorry, no it doesn't :-). All it's saying is that the section of code
should not be subject to any optimisations. Not the same thing at all.
Doesn't add code, just doesn't take any away, nor modify it,
but translates as written.

Of course in the real world, out of order execution on modern micros
can be a can of worms in itself, if the code depends on a specific
sequence of instruction execution, which of course, it never should.

Chris
Simon Clubley
2021-09-25 03:25:04 UTC
Permalink
Post by chris
Post by Simon Clubley
No I am not. All I have said all along is that volatile inserts code
into the generated code to _always_ re-read the variable before doing
anything with it.
Sorry, no it doesn't :-). All it's saying is that the section of code
should not be subject to any optimisations. Not the same thing at all.
Doesn't add code, just doesn't take any away, nor modify it,
but translates as written.
I may have phrased that a little loosely, but the end result is exactly
the same - the generated object code has unconditional reads in it in
places it might not have done had the optimiser been allowed to go to
work on the variable.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-25 13:00:09 UTC
Permalink
Post by Simon Clubley
Post by chris
Post by Simon Clubley
No I am not. All I have said all along is that volatile inserts code
into the generated code to _always_ re-read the variable before doing
anything with it.
Sorry, no it doesn't :-). All it's saying is that the section of code
should not be subject to any optimisations. Not the same thing at all.
Doesn't add code, just doesn't take any away, nor modify it,
but translates as written.
I may have phrased that a little loosely, but the end result is exactly
the same - the generated object code has unconditional reads in it in
places it might not have done had the optimiser been allowed to go to
work on the variable.
Simon.
We are probably in agreement, just different interpretations of
the same thing ?. One thing I always do when confronted with a new
compiler or tool chain is to look at the assembler source output
to make sure it's doing what I expect it to. Don't bother once
i'm happy with the compiler, but it does help to get to know what
the compiler is doing under various conditions. It's also useful
if you are trying to optimise performance. For example, trying to
decide which loop construct to use for / next, or do / while. Quite
important in the old 8 bit days, but modern micros are so good
now, it's less of an issue. Quite often a single line of asm
per C statement, but you can fine tune the programming style to
get the best results from the compiler....

Chris
Simon Clubley
2021-09-25 18:46:02 UTC
Permalink
Post by chris
We are probably in agreement, just different interpretations of
the same thing ?. One thing I always do when confronted with a new
compiler or tool chain is to look at the assembler source output
to make sure it's doing what I expect it to. Don't bother once
i'm happy with the compiler, but it does help to get to know what
the compiler is doing under various conditions. It's also useful
if you are trying to optimise performance. For example, trying to
decide which loop construct to use for / next, or do / while. Quite
important in the old 8 bit days, but modern micros are so good
now, it's less of an issue. Quite often a single line of asm
per C statement, but you can fine tune the programming style to
get the best results from the compiler....
:-)

Looking at the generated code has proved interesting at times. :-)

The following Ada Issue is a direct result of me looking at someone's
problem on comp.lang.ada a number of years ago which was caused by
the code the Ada compiler had generated:

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0128-1.txt?rev=1.15&raw=N

The following AI is also directly related to this:

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt?rev=1.27&raw=N

[I am _not_ a member of the ARG or anything like that. I am just a
normal programmer who includes Ada in the list of languages I know.]

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-26 17:10:07 UTC
Permalink
Post by Simon Clubley
Post by chris
We are probably in agreement, just different interpretations of
the same thing ?. One thing I always do when confronted with a new
compiler or tool chain is to look at the assembler source output
to make sure it's doing what I expect it to. Don't bother once
i'm happy with the compiler, but it does help to get to know what
the compiler is doing under various conditions. It's also useful
if you are trying to optimise performance. For example, trying to
decide which loop construct to use for / next, or do / while. Quite
important in the old 8 bit days, but modern micros are so good
now, it's less of an issue. Quite often a single line of asm
per C statement, but you can fine tune the programming style to
get the best results from the compiler....
:-)
Looking at the generated code has proved interesting at times. :-)
The following Ada Issue is a direct result of me looking at someone's
problem on comp.lang.ada a number of years ago which was caused by
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0128-1.txt?rev=1.15&raw=N
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt?rev=1.27&raw=N
[I am _not_ a member of the ARG or anything like that. I am just a
normal programmer who includes Ada in the list of languages I know.]
Simon.
Interesting examples, but just how much of that was due to system
complexity ?.

I bought a couple of books on ada some years ago, but have never
worked with it. One of the problems I have with some tools and
languages is the lack of transparency through the whole chain.
For some of the projects worked on here and for me to be able to
be fully confident about the end result, I need to have complete
transparency from source file right down to the hardware. Anything
that might hide details or tools that prevent that would be unusable.

Some things are just too clever for their own good, and don't care
how thoroughly they have been tested :-)...

Chris
Simon Clubley
2021-09-27 19:03:13 UTC
Permalink
Post by chris
Post by Simon Clubley
Looking at the generated code has proved interesting at times. :-)
The following Ada Issue is a direct result of me looking at someone's
problem on comp.lang.ada a number of years ago which was caused by
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0128-1.txt?rev=1.15&raw=N
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt?rev=1.27&raw=N
[I am _not_ a member of the ARG or anything like that. I am just a
normal programmer who includes Ada in the list of languages I know.]
Interesting examples, but just how much of that was due to system
complexity ?.
Not directly in this case IIRC. Ada is about modelling the problem
using a very strict type system and a very strong type system that
allows a problem to be modelled in detail.

However, in this case IIRC, the optimiser simply did something
unexpected when a bitfield within a record instead of the 32-bit
record itself (as would be the case in C) was updated even though
the 32-bit record was marked as Atomic.

As you can see, the suggested changes allow Ada to still be Ada but
also make sure the correct sized memory read and write occurs.

To be honest however, I'm surprised the issue wasn't discovered prior
to the problem report showing up in comp.lang.ada.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-27 21:23:07 UTC
Permalink
Post by Simon Clubley
Post by chris
Post by Simon Clubley
Looking at the generated code has proved interesting at times. :-)
The following Ada Issue is a direct result of me looking at someone's
problem on comp.lang.ada a number of years ago which was caused by
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0128-1.txt?rev=1.15&raw=N
http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt?rev=1.27&raw=N
[I am _not_ a member of the ARG or anything like that. I am just a
normal programmer who includes Ada in the list of languages I know.]
Interesting examples, but just how much of that was due to system
complexity ?.
Not directly in this case IIRC. Ada is about modelling the problem
using a very strict type system and a very strong type system that
allows a problem to be modelled in detail.
However, in this case IIRC, the optimiser simply did something
unexpected when a bitfield within a record instead of the 32-bit
record itself (as would be the case in C) was updated even though
the 32-bit record was marked as Atomic.
As you can see, the suggested changes allow Ada to still be Ada but
also make sure the correct sized memory read and write occurs.
To be honest however, I'm surprised the issue wasn't discovered prior
to the problem report showing up in comp.lang.ada.
Simon.
Sometimes bugs are discovered decades later, but usually something to
do with a corner case that is infrequently used. They say all software
has bugs, even the most proven and best maintained :-).

Program C in a fairly strict subset here. Never use bitfields at all,
which are not portable across architectures. Much better to use masks
or short lookup tables for bit level access...


Chris
Simon Clubley
2021-09-23 17:51:41 UTC
Permalink
Post by chris
So what is the issue here ?. Keywords like volatile would not normally
ever be used at app level, being reserved for low level kernel and
driver code where it touches real hardware registers. or perhaps
memory locations reserved for a specific purpose.
There are a number of instances where it is quite valid (and expected)
to use a volatile attribute in normal applications, especially when
asynchronous I/O is involved.

The use of volatile is not restricted to those who know the resistor
colour code chart off by heart. :-)

For example, Linux uses it (correctly) in its own asynchronous I/O interface:

https://man7.org/linux/man-pages/man7/aio.7.html
Post by chris
If systems and code are designed and written properly, then it should
compile to usable code irrespective of compiler, including
optimisation level, so long as the compiler is standards compliant.
Anything else is a system design issue...
In this case, the standards compliant approach would have been to require
the volatile attribute from day 1 on those variables/fields/buffers which
are filled in _after_ the system call has returned control to the program.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-24 14:28:23 UTC
Permalink
If you look at that use of volatile, it's dealing with sig_atomic_t,
which I would guess to be an interface to a test and set instruction,
which is designed to be indivisible and non-interruptible. That is,
the whole instruction always executes to completion.
More like driver level code, not application, where such
functionality would normally be encapsulated into a system call.
Volatile is also set (quite correctly) on the buffer itself.
Simon.
Why ?. While the compiler will typically pad out structures to align
each element to the machine wordsize, the use of volatile to define
that buffer looks redundant, since no optimisation would apply to
that structure definition anyway.

Use structure overlays for machine register access all the time here
and would never use the volatile keyword for any of it. Machine
register structure pointers, yes, volatile is appropriate and
necessary there...

Chris
Simon Clubley
2021-09-24 18:36:40 UTC
Permalink
Post by chris
If you look at that use of volatile, it's dealing with sig_atomic_t,
which I would guess to be an interface to a test and set instruction,
which is designed to be indivisible and non-interruptible. That is,
the whole instruction always executes to completion.
More like driver level code, not application, where such
functionality would normally be encapsulated into a system call.
Volatile is also set (quite correctly) on the buffer itself.
Why ?. While the compiler will typically pad out structures to align
each element to the machine wordsize, the use of volatile to define
that buffer looks redundant, since no optimisation would apply to
that structure definition anyway.
Chris, what makes you think you know better than the people who
wrote that header ?

It's not the structure, but the data written into the buffer by
Linux behind the scenes that the volatile attribute is designed
to address.

The whole point of the AIO interface is that the data is written to
the buffer in your process by Linux while your program is busy doing
something else. In this case, it's behaving in exactly the same way
that sys$qio() is behaving.

You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-24 21:48:46 UTC
Permalink
Post by Simon Clubley
Post by chris
If you look at that use of volatile, it's dealing with sig_atomic_t,
which I would guess to be an interface to a test and set instruction,
which is designed to be indivisible and non-interruptible. That is,
the whole instruction always executes to completion.
More like driver level code, not application, where such
functionality would normally be encapsulated into a system call.
Volatile is also set (quite correctly) on the buffer itself.
Why ?. While the compiler will typically pad out structures to align
each element to the machine wordsize, the use of volatile to define
that buffer looks redundant, since no optimisation would apply to
that structure definition anyway.
Chris, what makes you think you know better than the people who
wrote that header ?
While I don't stubbornly claim to be always right, have spent over
three decades programming real time embedded on a variety of RTOS
platforms.

They put a volatile tag onto a void pointer, which can be
cast to pointer to any type, but still doesn't need the volatile
tag.
Post by Simon Clubley
It's not the structure, but the data written into the buffer by
Linux behind the scenes that the volatile attribute is designed
to address.
The whole point of the AIO interface is that the data is written to
the buffer in your process by Linux while your program is busy doing
something else. In this case, it's behaving in exactly the same way
that sys$qio() is behaving.
Shared memory regions are quite common in everyday code, asynchronously
updated or not, as the DMA example I outlined earlier. Perhaps you are
not explaining what you mean very well ?.
Post by Simon Clubley
You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.
All i'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...

Chris
Dave Froble
2021-09-24 23:01:55 UTC
Permalink
Post by chris
or do you think you know better ?...
Chris
Come on Chris, this is Simon you're arguing with. Did you really need
to ask that?

:-)
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Dave Froble
2021-09-25 02:15:01 UTC
Permalink
Post by Dave Froble
Post by chris
or do you think you know better ?...
Chris
Come on Chris, this is Simon you're arguing with. Did you really need
to ask that?
:-)
ROTFLMFAO!
I thought you'd enjoy that.

I was laughing so hard, I almost could not type it.

:-)

Humor aside, I think I'm agreeing with Simon on this topic, even if I
know just about nothing about optimizers.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-09-25 03:42:36 UTC
Permalink
Post by chris
Post by Simon Clubley
You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.
All i'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...
How do you otherwise _guarantee_ that the Linux application program
is seeing the latest data that the Linux kernel might have written
into the buffer behind the scenes since the program last looked at
the buffer ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-25 10:48:47 UTC
Permalink
Post by Simon Clubley
Post by chris
Post by Simon Clubley
You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.
All i'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...
How do you otherwise _guarantee_ that the Linux application program
is seeing the latest data that the Linux kernel might have written
into the buffer behind the scenes since the program last looked at
the buffer ?
Simon.
Most kernels have system calls to deal with that sort of thing, to
create and manage locks on shared resources and to ensure mutual
exclusion. The key thing is that that is a high level thing, whereas
things like volatile are a compile time mechanism. If you like,
the low level support foundation for high level lock mechanisms.

Others may have a better explanation of all this...

Chris
Simon Clubley
2021-09-25 18:29:24 UTC
Permalink
Post by chris
Post by Simon Clubley
Post by chris
Post by Simon Clubley
You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.
All i'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...
How do you otherwise _guarantee_ that the Linux application program
is seeing the latest data that the Linux kernel might have written
into the buffer behind the scenes since the program last looked at
the buffer ?
Most kernels have system calls to deal with that sort of thing, to
create and manage locks on shared resources and to ensure mutual
exclusion. The key thing is that that is a high level thing, whereas
things like volatile are a compile time mechanism. If you like,
the low level support foundation for high level lock mechanisms.
Others may have a better explanation of all this...
No explanation needed as I do understand those things.

However, we were talking instead about why the AIO implementation
on Linux uses the volatile attribute on its transfer buffer.

Perhaps if you play with the Linux AIO interface and especially with
the sys$qio() system call in full async mode, you might understand
why I am saying the things I am.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-09-26 16:54:11 UTC
Permalink
Post by Simon Clubley
Post by chris
Post by Simon Clubley
Post by chris
Post by Simon Clubley
You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.
All i'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...
How do you otherwise _guarantee_ that the Linux application program
is seeing the latest data that the Linux kernel might have written
into the buffer behind the scenes since the program last looked at
the buffer ?
Most kernels have system calls to deal with that sort of thing, to
create and manage locks on shared resources and to ensure mutual
exclusion. The key thing is that that is a high level thing, whereas
things like volatile are a compile time mechanism. If you like,
the low level support foundation for high level lock mechanisms.
Others may have a better explanation of all this...
No explanation needed as I do understand those things.
However, we were talking instead about why the AIO implementation
on Linux uses the volatile attribute on its transfer buffer.
Yes, and I'm suggesting that it's redundant. Structures or their
contents are not modified in any way by the compiler, other than
possibly padding out element spacing to the natural machine
word size. So again, why is that buffer pointer declared with
the volatile keyword ?.
Post by Simon Clubley
Perhaps if you play with the Linux AIO interface and especially with
the sys$qio() system call in full async mode, you might understand
why I am saying the things I am.
It's still not clear to me, so in the spirit of teamwork, why
don't you explain that use of volatile, in depth, so we can
all understand it ?.
Post by Simon Clubley
The aio facility provides system calls for asynchronous I/O.
Asynchronous I/O operations are not completed synchronously
by the calling thread. Instead, the calling thread invokes
one system call to request an asynchronous I/O operation.
The status of a completed request is retrieved later via a
separate system call.
Key point there is: not completed synchronously by the calling
thread...

Chris
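The aio flow quoted above (submit, do other work, poll, fetch status) can be sketched in C. A sketch only: the helper name aio_read_file, the file path, and the polling interval are invented for illustration; on older glibc you may need to link with -lrt.

```c
/* Minimal POSIX AIO round trip: queue an asynchronous read, do other
 * work, then poll for completion.  Error handling is abbreviated. */
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read 'len' bytes from the start of 'path' asynchronously into 'buf'.
 * Returns the number of bytes transferred, or -1 on error. */
ssize_t aio_read_file(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;     /* note: the field is 'volatile void *aio_buf' */
    cb.aio_nbytes = len;
    cb.aio_offset = 0;

    if (aio_read(&cb) != 0) {        /* request is queued; returns at once */
        close(fd);
        return -1;
    }

    /* The calling thread is free to do other work here.  Poll until the
     * operation is no longer in progress -- the "not completed
     * synchronously by the calling thread" part of the man page. */
    while (aio_error(&cb) == EINPROGRESS)
        usleep(1000);

    ssize_t got = aio_return(&cb);   /* fetch the final status exactly once */
    close(fd);
    return got;
}
```

Between aio_read() and the aio_error() loop, the kernel may write into buf at any time behind the program's back, which is the situation the rest of the thread is arguing about.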
Simon Clubley
2021-09-27 18:49:31 UTC
Permalink
Post by chris
Post by Simon Clubley
Post by chris
Post by Simon Clubley
Post by chris
Post by Simon Clubley
You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.
All i'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...
How do you otherwise _guarantee_ that the Linux application program
is seeing the latest data that the Linux kernel might have written
into the buffer behind the scenes since the program last looked at
the buffer ?
Most kernels have system calls to deal with that sort of thing, to
create and manage locks on shared resources and to ensure mutual
exclusion. The key thing is that that is a high level thing, whereas
things like volatile are a compile time mechanism. If you like,
the low level support foundation for high level lock mechanisms.
Others may have a better explanation of all this...
No explanation needed as I do understand those things.
However, we were talking instead about why the AIO implementation
on Linux uses the volatile attribute on its transfer buffer.
Yes, and I'm suggesting that it's redundant. Structures or their
contents are not modified in any way by the compiler, other than
possibly padding out element spacing to the natural machine
word size. So again, why is that buffer pointer declared with
the volatile keyword ?.
Because the buffer behind the pointer gets directly written to by the
operating system outside the normal flow of execution while the program
is busy doing something else.

At that point, any cached contents of that buffer become invalid
and need to be re-read. Volatile guarantees that will happen and _if_
the code walks through a sequence point before reading the modified
buffer that should cause it to happen as well (ignoring possible
optimiser bugs :-)).

However, volatile, which has no downsides other than a few extra
cycles, will _guarantee_ that will happen without you having to worry
about if you have gone through a sequence point or if the optimiser
has done something unexpected.

This is no different from hardware writing directly into your memory
space from outside the normal flow of execution of your code.

For the record, I always mark any buffers and data structures used in
_truly_ asynchronous operations in applications as volatile these days
to avoid any unexpected issues. I consider it an example of robust
programming that avoids a whole range of potential problems.

I'm also more worried about sys$qio() in async mode and the other VMS
async system calls than in AIO as I used AIO as a second example of
asynchronous I/O. The volatile attribute on the AIO buffer is correct,
but given the variety of options in VMS, there's much more potential
there for "funny stuff" :-) to happen in VMS.
Post by chris
Post by Simon Clubley
Perhaps if you play with the Linux AIO interface and especially with
the sys$qio() system call in full async mode, you might understand
why I am saying the things I am.
It's still not clear to me, so in the spirit of teamwork, why
don't you explain that use of volatile, in depth, so we can
all understand it ?.
I've explained above. It's not about the structure the pointer is a
part of. It's about the buffer behind the pointer.
Post by chris
Post by Simon Clubley
The aio facility provides system calls for asynchronous I/O.
Asynchronous I/O operations are not completed synchronously
by the calling thread. Instead, the calling thread invokes
one system call to request an asynchronous I/O operation.
The status of a completed request is retrieved later via a
separate system call.
Key point there is: not completed synchronously by the calling
thread...
I've just checked some FreeBSD headers available online and it
turns out that AIO on FreeBSD also marks the transfer buffer
as volatile.

That's two different operating systems so whoever designed this
interface was clearly worried about the possibility that compiler
optimisation could get in the way of this working correctly in
some cases.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
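Simon's point about the optimiser deleting a "redundant" read can be sketched with a completion flag that is written outside the normal flow of execution. A sketch only: a signal handler stands in here for a VMS AST or I/O completion, and the names are illustrative.

```c
/* Why an asynchronously-written location needs 'volatile'.  Because
 * 'done' is volatile, the compiler must re-read it on every iteration
 * of the wait loop; without the qualifier, an optimiser may legally
 * cache the first load and spin forever on the stale value. */
#include <assert.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t done = 0;   /* written asynchronously */

static void on_alarm(int sig)
{
    (void)sig;
    done = 1;                            /* the "async writer" */
}

int wait_for_completion(void)
{
    signal(SIGALRM, on_alarm);
    alarm(1);                  /* completion arrives ~1 second later */

    while (!done)
        ;                      /* each iteration re-reads 'done' from memory */

    return done;
}
```

Compiled at -O2 with a plain `int done`, the loop above is exactly the kind of code where the read gets hoisted out and never repeated; the volatile qualifier is what keeps the re-read in.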
chris
2021-09-27 21:39:07 UTC
Permalink
Post by Simon Clubley
Because the buffer behind the pointer gets directly written to by the
operating system outside the normal flow of execution while the program
is busy doing something else.
Yes, it does, but the compiler has no knowledge of that at all, as it's
a run-time thing, not compile time. Unless the compiler has a crystal
ball :-). It can have no idea of what underlying code might modify the
buffer.
Post by Simon Clubley
At that point, any cached contents of that buffer become invalid
and need to be re-read. Volatile guarantees that will happen and _if_
the code walks through a sequence point before reading the modified
buffer that should cause it to happen as well (ignoring possible
optimiser bugs :-)).
No, volatile does no such thing. It's merely a hint to the compiler to
apply no optimisation to it.
Post by Simon Clubley
However, volatile, which has no downsides other than a few extra
cycles, will _guarantee_ that will happen without you having to worry
about if you have gone through a sequence point or if the optimiser
has done something unexpected.
This is no different from hardware writing directly into your memory
space from outside the normal flow of execution of your code.
Quite common in interrupt handlers, but again, the compiler can't
predict the future as to when it will be modified, or by whom,
so how can it add code to mitigate what it can't understand ?.
Post by Simon Clubley
For the record, I always mark any buffers and data structures used in
_truly_ asynchronous operations in applications as volatile these days
to avoid any unexpected issues. I consider it an example of robust
programming that avoids a whole range of potential problems.
I'm also more worried about sys$qio() in async mode and the other VMS
async system calls than in AIO as I used AIO as a second example of
asynchronous I/O. The volatile attribute on the AIO buffer is correct,
but given the variety of options in VMS, there's much more potential
there for "funny stuff" :-) to happen in VMS.
Post by Simon Clubley
Perhaps if you play with the Linux AIO interface and especially with
the sys$qio() system call in full async mode, you might understand
why I am saying the things I am.
That's how this started, right, VMS sys$qio having bugs internally ?.
Post by Simon Clubley
I've just checked some FreeBSD headers available online and it
turns out that AIO on FreeBSD also marks the transfer buffer
as volatile.
Well, there is a lot of cross-fertilisation between the two OSes. The
only reason I can think of for the pointer to be declared with
the volatile keyword is that if the buffer is declared volatile, then
the pointer also needs to be, to avoid whinges about incompatible types.
Can see no reason for it otherwise...

Chris
Dave Froble
2021-09-27 22:57:16 UTC
Permalink
Post by chris
Yes, it does, but the compiler has no knowledge of that at all, as it's
a run-time thing, not compile time. Unless the compiler has a crystal
ball :-). It can have no idea of what underlying code might modify the
buffer.
Obviously true.

This whole discussion isn't, or at least should not be, about compilers.
Perhaps interpreters, but let's not add more confusion.
Post by chris
No, volatile does no such thing. It's merely a hint to the compiler to
apply no optimisation to it.
Correct.
Post by chris
Quite common in interrupt handlers, but again, the compiler can't
predict the future as to when it will be modified, or by whom,
so how can it add code to mitigate what it can't understand ?.
I don't think that it does.
Post by chris
That's how this started, right, VMS sys$qio having bugs internally ?.
Not that I recall. The original issue was optimization on x86, if I
remember correctly. No guarantee of that.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
chris
2021-09-27 23:30:04 UTC
Permalink
Post by Dave Froble
Post by chris
Yes, it does, but the compiler has no knowledge of that at all, as it's
a run-time thing, not compile time. Unless the compiler has a crystal
ball :-). It can have no idea of what underlying code might modify the
buffer.
Obviously true.
This whole discussion isn't, or at least should not be, about compilers.
Perhaps interpreters, but let's not add more confusion.
Post by chris
No, volatile does no such thing. It's merely a hint to the compiler to
apply no optimisation to it.
Correct.
Post by chris
Quite common in interrupt handlers, but again, the compiler can't
predict the future as to when it will be modified, or by whom,
so how can it add code to mitigate what it can't understand ?.
I don't think that it does.
Post by chris
That's how this started, right, VMS sys$qio having bugs internally ?.
Not that I recall. The original issue was optimization on x86, if I
remember correctly. No guarantee of that.
Losing marbles :-). Anyway, if the code is written right, it shouldn't
matter what optimisation, if any, is applied. That's before we even get
into such niceties as out-of-order execution, reordering, etc., all
of which might need to be disabled for any code where strict execution
order is required.

Thankfully, never had to deal with that, but all kinds of possible
googlies with modern processors. Perhaps that's what Simon was hinting
at ?...

Chris
Simon Clubley
2021-09-28 18:32:15 UTC
Permalink
Post by chris
Post by Simon Clubley
Because the buffer behind the pointer gets directly written to by the
operating system outside the normal flow of execution while the program
is busy doing something else.
Yes, it does, but the compiler has no knowledge of that at all, as it's
a run-time thing, not compile time. Unless the compiler has a crystal
ball :-). It can have no idea of what underlying code might modify the
buffer.
We have been through this several times Chris.

The whole point of volatile is that the compiler doesn't need to know.
When the variable _is_ accessed again in the code, the read is left in
instead of possibly being deleted by the optimiser.
Post by chris
Post by Simon Clubley
At that point, any cached contents of that buffer become invalid
and need to be re-read. Volatile guarantees that will happen and _if_
the code walks through a sequence point before reading the modified
buffer that should cause it to happen as well (ignoring possible
optimiser bugs :-)).
No, volatile does no such thing. It's merely a hint to the compiler to
apply no optimisation to it.
And how is the hint implemented ? Answer: by not optimising away what
the optimiser would otherwise think may be a redundant read.

So yes, it's exactly how it works.
Post by chris
Post by Simon Clubley
However, volatile, which has no downsides other than a few extra
cycles, will _guarantee_ that will happen without you having to worry
about if you have gone through a sequence point or if the optimiser
has done something unexpected.
This is no different from hardware writing directly into your memory
space from outside the normal flow of execution of your code.
Quite common in interrupt handlers, but again, the compiler can't
predict the future as to when it will be modified, or by whom,
so how can it add code to mitigate what it can't understand ?.
The whole point of volatile is that it doesn't need to predict the
future. The variable is always re-read before the contents are used
for something and as I clarified a few messages back, it's not about
adding code, but instead about not removing code the optimiser would
otherwise think is redundant.
Post by chris
That's how this started, right, VMS sys$qio having bugs internally ?.
No. The concern was that people might not be correctly marking buffers
and data structures as volatile when using system services in full async
mode and that might catch them with a compiler with a different optimiser
(ie: LLVM).
Post by chris
Post by Simon Clubley
I've just checked some FreeBSD headers available online and it
turns out that AIO on FreeBSD also marks the transfer buffer
as volatile.
Well, there is a lot of cross-fertilisation between the two OSes. The
only reason I can think of for the pointer to be declared with
the volatile keyword is that if the buffer is declared volatile, then
the pointer also needs to be, to avoid whinges about incompatible types.
Can see no reason for it otherwise...
It's not a volatile pointer Chris, it's a normal pointer to
a volatile buffer. The only thing that's volatile is the buffer
behind the pointer (which is how it should be). The pointer itself
is not volatile.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-10-02 13:33:37 UTC
Permalink
Post by Simon Clubley
The whole point of volatile is that the compiler doesn't need to know.
When the variable _is_ accessed again in the code, the read is left in
instead of possibly being deleted by the optimiser.
A buffer would never be deleted by an optimiser, but if you think so,
please explain.
Post by Simon Clubley
The whole point of volatile is that it doesn't need to predict the
future. The variable is always re-read before the contents are used
for something and as I clarified a few messages back, it's not about
adding code, but instead about not removing code the optimiser would
otherwise think is redundant.
So tell me, what exactly "rereads" the variable, code added by the
compiler, or what ?
Post by Simon Clubley
No. The concern was that people might not be correctly marking buffers
and data structures as volatile when using system services in full async
mode and that might catch them with a compiler with a different optimiser
(ie: LLVM).
Perhaps, but you can give no good reason why that might be the case,
or technical argument to support such a theory.
Post by Simon Clubley
Post by chris
Post by Simon Clubley
I've just checked some FreeBSD headers available online and it
turns out that AIO on FreeBSD also marks the transfer buffer
as volatile.
Well, there is a lot of cross-fertilisation between the two OSes. The
only reason I can think of for the pointer to be declared with
the volatile keyword is that if the buffer is declared volatile, then
the pointer also needs to be, to avoid whinges about incompatible types.
Can see no reason for it otherwise...
It's not a volatile pointer Chris, it's a normal pointer to
a volatile buffer. The only thing that's volatile is the buffer
behind the pointer (which is how it should be). The pointer itself
is not volatile.
No, but the point I was making is that the pointer is declared volatile
to avoid compiler whinges about incompatible types, if you
had read what I said. If something is declared volatile, then the
compiler will take note, whether it needs to or not, but it will have
no effect on the resulting code.

See a lot of use of volatile where it's not needed, because so many
do not understand its use. Does no harm, but shows lack of
understanding...

Chris
Simon Clubley
2021-10-02 16:37:27 UTC
Permalink
Post by chris
Post by Simon Clubley
The whole point of volatile is that the compiler doesn't need to know.
When the variable _is_ accessed again in the code, the read is left in
instead of possibly being deleted by the optimiser.
A buffer would never be deleted by an optimiser, but if you think so,
please explain.
Until now Chris, I wasn't sure if you were trolling or were yet another
hardware type who didn't understand software as well as they liked to
think they did, so I gave you the benefit of the doubt.

Given the blatant way you have now replied to something very different
from what I said and given the way you have ignored elsewhere what
I said, it is now clear you are trolling.

I will however, give you this last response.

You have asked above about buffers being deleted, which is something
that has not been discussed until you raised it just now. That's a
perfect example of why I now think you are trolling as your reply
was about something that was never said.

As you should be aware, if you understand this as well as you claim,
at the generated-code level a read can be either a standalone read or it
can be part of a read/modify/write sequence.

In either case, all the optimiser does is decide if it can delete the
read part of that because it thinks it has the value already cached
and hence a memory read is not required.
Post by chris
Post by Simon Clubley
The whole point of volatile is that it doesn't need to predict the
future. The variable is always re-read before the contents are used
for something and as I clarified a few messages back, it's not about
adding code, but instead about not removing code the optimiser would
otherwise think is redundant.
So tell me, what exactly "rereads" the variable, code added by the
compiler, or what ?
The read as part of the read/modify/write sequence or the standalone
read which in either case was not deleted by the optimiser because the
optimiser was told not to delete it.
Post by chris
Post by Simon Clubley
No. The concern was that people might not be correctly marking buffers
and data structures as volatile when using system services in full async
mode and that might catch them with a compiler with a different optimiser
(ie: LLVM).
Perhaps, but you can give no good reason why that might be the case,
or technical argument to support such a theory.
Well, actually I did a bit of digging and it turns out the VMS programming
concepts manual, even as far back as VMS 7.3-1, recommends the volatile
attribute in some circumstances when calling some system services.

Here's an example in Pascal:

http://odl.sysworks.biz/disk$axpdocdec021/opsys/vmsos731/vmsos731/5841/5841pro_055.html

What I am worried about is just an extension of that.
Post by chris
Post by Simon Clubley
Post by chris
Post by Simon Clubley
I've just checked some FreeBSD headers available online and it
turns out that AIO on FreeBSD also marks the transfer buffer
as volatile.
Well, there is a lot of cross-fertilisation between the two OSes. The
only reason I can think of for the pointer to be declared with
the volatile keyword is that if the buffer is declared volatile, then
the pointer also needs to be, to avoid whinges about incompatible types.
Can see no reason for it otherwise...
It's not a volatile pointer Chris, it's a normal pointer to
a volatile buffer. The only thing that's volatile is the buffer
behind the pointer (which is how it should be). The pointer itself
is not volatile.
No, but the point I was making is that the pointer is declared volatile
to avoid compiler whinges about incompatible types, if you
had read what I said. If something is declared volatile, then the
compiler will take note, whether it needs to or not, but it will have
no effect on the resulting code.
The pointer itself isn't volatile. It's the buffer behind that pointer
which is volatile. In C, those are two different things. Here's some
material I've just found which explains the difference:

https://stackoverflow.com/questions/9935190/why-is-a-point-to-volatile-pointer-like-volatile-int-p-useful
Post by chris
See a lot of use of volatile where it's not needed, because so many
do not understand its use. Does no harm, but shows lack of
understanding...
Chris
Or it could be considered an example of robust programming so that
someone doesn't have to worry that they have got the need for it
wrong. It's no different from doing boundary checks in your code
manually at runtime in languages that don't do automatic boundary
checks even though you are "sure" :-) that you wrote the code
correctly to begin with.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
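The pointer-to-volatile versus volatile-pointer distinction from the Stack Overflow link above fits in a few declarations. A sketch only; the names buffer, p, q and read_third are illustrative.

```c
/* Pointer to volatile data vs. volatile pointer.  The AIO aio_buf
 * member is the first kind: the *buffer contents* are volatile, the
 * pointer itself is ordinary. */
#include <assert.h>
#include <stddef.h>

static volatile int buffer[4] = {1, 2, 3, 4};  /* contents may change
                                                  asynchronously */

volatile int *p = buffer;  /* pointer to volatile int: every read
                              through p must go to memory */
int * volatile q = NULL;   /* volatile pointer to plain int: only the
                              pointer value itself is always re-read */

int read_third(void)
{
    return p[2];           /* this load cannot be cached away by the
                              optimiser, because *p is volatile */
}
```

Reading the declarations right to left ("p is a pointer to volatile int", "q is a volatile pointer to int") is the usual way to keep the two straight.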
Dave Froble
2021-10-02 19:14:00 UTC
Permalink
Post by Simon Clubley
Post by chris
Post by Simon Clubley
The whole point of volatile is that the compiler doesn't need to know.
When the variable _is_ accessed again in the code, the read is left in
instead of possibly being deleted by the optimiser.
A buffer would never be deleted by an optimiser, but if you think so,
please explain.
Until now Chris, I wasn't sure if you were trolling or were yet another
hardware type who didn't understand software as well as they liked to
think they did, so I gave you the benefit of the doubt.
Given the blatant way you have now replied to something very different
from what I said and given the way you have ignored elsewhere what
I said, it is now clear you are trolling.
I will however, give you this last response.
You have asked above about buffers being deleted, which is something
that has not been discussed until you raised it just now. That's a
perfect example of why I now think you are trolling as your reply
was about something that was never said.
As you should be aware, if you understand this as well as you claim,
at the generated-code level a read can be either a standalone read or it
can be part of a read/modify/write sequence.
In either case, all the optimiser does is decide if it can delete the
read part of that because it thinks it has the value already cached
and hence a memory read is not required.
Post by chris
Post by Simon Clubley
The whole point of volatile is that it doesn't need to predict the
future. The variable is always re-read before the contents are used
for something and as I clarified a few messages back, it's not about
adding code, but instead about not removing code the optimiser would
otherwise think is redundant.
So tell me, what exactly "rereads" the variable, code added by the
compiler, or what ?
The read as part of the read/modify/write sequence or the standalone
read which in either case was not deleted by the optimiser because the
optimiser was told not to delete it.
Post by chris
Post by Simon Clubley
No. The concern was that people might not be correctly marking buffers
and data structures as volatile when using system services in full async
mode and that might catch them with a compiler with a different optimiser
(ie: LLVM).
Perhaps, but you can give no good reason why that might be the case,
or technical argument to support such a theory.
Well, actually I did a bit of digging and it turns out the VMS programming
concepts manual, even as far back as VMS 7.3-1, recommends the volatile
attribute in some circumstances when calling some system services.
http://odl.sysworks.biz/disk$axpdocdec021/opsys/vmsos731/vmsos731/5841/5841pro_055.html
What I am worried about is just an extension of that.
Post by chris
Post by Simon Clubley
Post by chris
Post by Simon Clubley
I've just checked some FreeBSD headers available online and it
turns out that AIO on FreeBSD also marks the transfer buffer
as volatile.
Well, there is a lot of cross-fertilisation between the two OSes. The
only reason I can think of for the pointer to be declared with
the volatile keyword is that if the buffer is declared volatile, then
the pointer also needs to be, to avoid whinges about incompatible types.
Can see no reason for it otherwise...
It's not a volatile pointer Chris, it's a normal pointer to
a volatile buffer. The only thing that's volatile is the buffer
behind the pointer (which is how it should be). The pointer itself
is not volatile.
No, but the point I was making is that the pointer is declared volatile
to avoid compiler whinges about incompatible types, if you
had read what I said. If something is declared volatile, then the
compiler will take note, whether it needs to or not, but it will have
no effect on the resulting code.
The pointer itself isn't volatile. It's the buffer behind that pointer
which is volatile. In C, those are two different things. Here's some
https://stackoverflow.com/questions/9935190/why-is-a-point-to-volatile-pointer-like-volatile-int-p-useful
Post by chris
See a lot of use of volatile where it's not needed, because so many
do not understand its use. Does no harm, but shows lack of
understanding...
Chris
Or it could be considered an example of robust programming so that
someone doesn't have to worry that they have got the need for it
wrong. It's no different from doing boundary checks in your code
manually at runtime in languages that don't do automatic boundary
checks even though you are "sure" :-) that you wrote the code
correctly to begin with.
Simon.
Ok, poll time. Is there anyone else that took as long as Simon to
realize he was being trolled?

:-)
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-10-02 20:28:23 UTC
Permalink
Post by Dave Froble
Ok, poll time. Is there anyone else that took as long as Simon to
realize he was being trolled?
:-)
:-)

In Simon's defence, he would like to point out that he strongly
suspected that a good couple of rounds ago, and if Chris was not
a hardware type, he would have probably called it as a troll at
that point.

Unfortunately, some hardware types have an over-inflated opinion
of their level of knowledge of software issues, so it was quite
possible Chris really didn't understand the points being made.

My problem is that I always seem to want to help educate people... :-)

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
chris
2021-10-02 23:38:12 UTC
Permalink
Post by Simon Clubley
Post by Dave Froble
Ok, poll time. Is there anyone else that took as long as Simon to
realize he was being trolled?
:-)
:-)
In Simon's defence, he would like to point out that he strongly
suspected that a good couple of rounds ago, and if Chris was not
a hardware type, he would have probably called it as a troll at
that point.
No, not really trolling; I just saw your original post as arm waving
in respect of some mysterious process, as yet undiscovered, that
could cause havoc to VMS builds. As I said upthread, VMS has
been rewritten twice already, so I'm sure such issues would have been
noticed and dealt with.
Post by Simon Clubley
Unfortunately, some hardware types have an over-inflated opinion
of their level of knowledge of software issues, so it was quite
possible Chris really didn't understand the points being made.
Yes, and some people can be arrogant and dismissive. Present
company excluded, of course :-).
Post by Simon Clubley
My problem is that I always seem to want to help educate people... :-)
That's good, but a clear explanation of the detail always helps :-)...

Chris
Post by Simon Clubley
Simon.
Stephen Hoffman
2021-10-03 00:32:37 UTC
Permalink
No, not really trolling; I just saw your original post as arm waving in
respect of some mysterious process, as yet undiscovered, that could
cause havoc to VMS builds. As I said upthread, VMS has been rewritten
twice already, so I'm sure such issues would have been noticed and dealt
with.
OpenVMS has been ported, yes. OpenVMS has not been rewritten twice.
Each port further isolates the platform-independent from the
platform-specific too, while the vast majority of the existing source
code is unchanged.

Previous ports have shown weird bugs, those latent in the existing
codebase that was ported, those bugs introduced by the platform tools,
and those exposed by (for instance) updated processors of the same
architecture. q.v. SRM_CHECK.

Some enhancements and optimizations have exposed latent bugs in the
platform and in apps, too. The async I/O completion work from (as it
was then known) VAX/VMS 5.0 field test exposed myriad latent errors,
and was backed out for the V5.0 release.

Hyrum's Law: "With a sufficient number of users of an API, it does not
matter what you promise in the contract: all observable behaviors of
your system will be depended on by somebody."

And some of us would be interested in some Arm waving too, but for now
it's all x86-64 waving.
--
Pure Personal Opinion | HoffmanLabs LLC
chris
2021-10-03 12:35:45 UTC
Permalink
No, not really trolling; I just saw your original post as arm waving in
respect of some mysterious process, as yet undiscovered, that could
cause havoc to VMS builds. As I said upthread, VMS has been rewritten
twice already, so I'm sure such issues would have been noticed and dealt
with.
OpenVMS has been ported, yes. OpenVMS has not been rewritten twice. Each
port further isolates the platform-independent from the
platform-specific too, while the vast majority of the existing source
code is unchanged.
Previous ports have shown weird bugs, those latent in the existing
codebase that was ported, those bugs introduced by the platform tools,
and those exposed by (for instance) updated processors of the same
architecture. q.v. SRM_CHECK.
Some enhancements and optimizations have exposed latent bugs in the
platform and in apps, too. The async I/O completion work from (as it was
then known) VAX/VMS 5.0 field test exposed myriad latent errors, and was
backed out for the V5.0 release.
Perhaps rewritten was the wrong word, so ported? Even so, I would
suspect a lot of the low level stuff (which this thread is about) must
have been rewritten as part of the porting effort.
Hyrum's Law: With a sufficient number of users of an API, it does not
matter what you promise in the contract: all observable behaviors of
your system will be depended on by somebody."
And some of us would be interested in some Arm waving too, but for now
it's all x86-64 waving.
Very much so, but needs an available standard platform in volume to make
it worthwhile for sw vendors to get behind it...

Really hard to break the X86 stranglehold...

Chris
Stephen Hoffman
2021-10-03 17:10:52 UTC
Post by chris
Post by Stephen Hoffman
Previous ports have shown weird bugs, those latent in the existing
codebase that was ported, those bugs introduced by the platform tools,
and those exposed by (for instance) updated processors of the same
architecture. q.v. SRM_CHECK.
Perhaps rewritten was the wrong word, so ported ?. Even so, would
suspect a lot of the low leve stuff (which this thread is about) must
have been rewritten as part of the porting effort.
Ah, moving the goalposts.

The lowest reaches—a small but gnarly part of any operating system
platform—is different, yes. Hardware config and boot and errors, memory
management, interlocking, code generation, etc. Alas, bugs are not
constrained to those lowest reaches.

BTW, EV6 broke latent (and architecturally incorrect) code-generation
assumptions within the GEM compiler code generator. Assumptions that
had worked fine on OpenVMS and on Alpha processors prior to EV6. This
code-generation error is what SRM_CHECK detected. This old EV6
code-generation bug is entirely reminiscent of what Simon is concerned
about with x86-64 code generation, too.
Post by chris
Post by Stephen Hoffman
Hyrum's Law: With a sufficient number of users of an API, it does not
matter what you promise in the contract: all observable behaviors of
your system will be depended on by somebody."
And some of us would be interested in some Arm waving too, but for now
it's all x86-64 waving.
Very much so, but needs an available standard platform in volume to
make it worthwhile for sw vendors to get behind it...
Hyrum applies to most any complex system, software or hardware. It's
part of what can make changing OpenVMS or any other upward-compatible
platform so entertaining. A cluster-related change to the queue manager
job ID allocation algorithm blew up a number of apps, for instance.
Lots of other examples.
Post by chris
Really hard to break the X86 stranglehold...
As with past market shifts, shifts usually arise from below. Not from
above. Shifts can progress slowly for a decade or two. Then can
progress much more quickly.

As for x86-64, the ~fourth largest computer vendor by revenue (2020)
has invested heavily in and is migrating all of its laptops and
desktops off of x86-64 processors. Whether and how many customers will
accept that remains to be seen.

Are we in a shift? Dunno. Maybe. Intel would certainly prefer not.
Microsoft would probably prefer not as well, their previous boutique
platform port investments, and their more recent Windows 11 ARM64 and
ARM64EC and related efforts, aside.

But as for OpenVMS, yes, getting native compilers working and reliable
and with decently-performing app code being generated is the
prerequisite. OpenVMS usually ends up needing some architectural tweaks
or extensions or other workarounds too, and some of those have been
mentioned by VSI.

On the plus side for the upcoming and entirely hypothetical OpenVMS Arm
port, LLVM already generates code compliant with Armv8-A and Armv9-A
designs.

Trivia: VSI has previously discussed an Arm port, though not in some
years. VSI seemingly has several more years before the current x86-64
port work, and VSI, third-party, and end-user apps are all
sufficiently available and entering volume production on x86-64
servers. We should have a better idea of what is happening with x86-64
and Arm by then, too.
--
Pure Personal Opinion | HoffmanLabs LLC
chris
2021-10-02 23:06:51 UTC
Post by Simon Clubley
Until now Chris, I wasn't sure if you were trolling or were yet another
hardware type who didn't understand software as well as they liked to
think they did, so I gave you the benefit of the doubt.
You are obviously missing the irony, since your original post at
thread start looked like trolling, casting aspersions as it did, as
though the vms team would miss something like that ?.
Post by Simon Clubley
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway)
VMS has been rewritten twice already and I'm sure they must have run
into such issues in the past.
Post by Simon Clubley
As you should be aware of, if you understand this as well as you claim,
at generated code level, a read can be either a standalone read or it
can be part of a read/modify/write sequence.
In either case, all the optimiser does is decide if it can delete the
read part of that because it thinks it has the value already cached
and hence a memory read is not required.
Either register, as you suggest, or go back out to memory. Helps to be
specific about what you mean.
Post by Simon Clubley
Post by Simon Clubley
The whole point of volatile is that it doesn't need to predict the
future. The variable is always re-read before the contents are used
for something and as I clarified a few messages back, it's not about
adding code, but instead about not removing code the optimiser would
otherwise think is redundant.
So, it really comes down to the optimiser's choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but is that common?
Post by Simon Clubley
Post by Simon Clubley
No. The concern was that people might not be correctly marking buffers
and data structures as volatile when using system services in full async
mode and that might catch them with a compiler with a different optimiser
(ie: LLVM).
I would assume that the vms folks know what they are doing, respect
for the assumed knowledge and past attention to detail etc.
Post by Simon Clubley
Post by Simon Clubley
It's not a volatile pointer Chris, it's a normal pointer to
a volatile buffer. The only thing that's volatile is the buffer
behind the pointer (which is how it should be). The pointer itself
is not volatile.
No, the pointer has been declared volatile, for the reason I gave
earlier, that it points to a volatile declared object.
Post by Simon Clubley
The pointer itself isn't volatile. It's the buffer behind that pointer
which is volatile. In C, those are two different things. Here's some
https://stackoverflow.com/questions/9935190/why-is-a-point-to-volatile-pointer-like-volatile-int-p-useful
A good example.
Post by Simon Clubley
Or it could be considered an example of robust programming so that
someone doesn't have to worry that they have got the need for it
wrong. It's no different from doing boundary checks in your code
manually at runtime in languages that don't do automatic boundary
checks even though you are "sure" :-) that you wrote the code
correctly to begin with.
At least the discussion stays civil in this group and the result
is often that we check our assumptions and learn something new :-)...

Chris
Jan-Erik Söderholm
2021-10-03 09:21:30 UTC
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but that common ?.
Optimizing out code that is not needed is one of the most common things
that optimizers do. A very short example.

int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}

The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.

Compare with:

int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}

This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.

The first generates 7 Alpha instructions and the second 9 instructions.

Now, this was a very simple example, and a would probably point to some
common area in memory shared by multiple processes, or to some hardware
register that is changed outside of any code on the system. Or, as in the
example from Simon, a buffer that is written to by the I/O sub-system.

And no, I do not think that you are trolling, I think that you simply
do not understand what Simon is trying to say. Either just some simple
misunderstanding, or maybe you just do not understand the concept
of "volatile" in this context.
Bill Gunshannon
2021-10-03 14:21:28 UTC
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but that common ?.
Optimizing out code not needed is one of the most common things that
optimizers does. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9 instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might think. Here's an x86 version.

.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
movl $10, a(%rip)
movl $10, b(%rip)
ret
.cfi_endproc
.LFE0:
.size main, .-main
.comm b,4,4
.comm a,4,4
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits

---------------------------------------

.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
movl $10, a(%rip)
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
ret
.cfi_endproc
.LFE0:
.size main, .-main
.comm b,4,4
.comm a,4,4
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
.section .note.GNU-stack,"",@progbits

------------------------------------------

And the diff:

< movl $10, b(%rip)
---
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
Most interesting.

bill
Jan-Erik Söderholm
2021-10-03 15:16:04 UTC
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but that common ?.
Optimizing out code not needed is one of the most common things that
optimizers does. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9 instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might thnk.  Here's an x86 version.
    .text
    .globl    main
    .cfi_startproc
    movl    $10, a(%rip)
    movl    $10, b(%rip)
    ret
    .cfi_endproc
    .size    main, .-main
    .comm    b,4,4
    .comm    a,4,4
    .ident    "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
---------------------------------------
    .text
    .globl    main
    .cfi_startproc
    movl    $10, a(%rip)
    movl    a(%rip), %eax
    movl    a(%rip), %eax
    movl    %eax, b(%rip)
    ret
    .cfi_endproc
    .size    main, .-main
    .comm    b,4,4
    .comm    a,4,4
    .ident    "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
------------------------------------------
<     movl    $10, b(%rip)
---
     movl    a(%rip), %eax
     movl    a(%rip), %eax
     movl    %eax, b(%rip)
Most interesting.
bill
OK, since b isn't used for anything between the two assignments,
the first movl is removed. But the final value still comes from the
second read of a, anyway.

Try with using b for something in between the assignments.

int a;
int b;
int c, d;
void main() {
a = 10;
b = a;
c = b;
b = a;  /* probably removed by the optimizer... */
d = b;
}

int volatile a;
int b;
int c, d;
void main() {
a = 10;
b = a;
c = b;
b = a;
d = b;
}

Anyway, it is obvious that "volatile" does change the way the
compiler/optimizer builds the code. As expected, of course.
Bill Gunshannon
2021-10-03 15:29:49 UTC
Post by Jan-Erik Söderholm
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but that common ?.
Optimizing out code not needed is one of the most common things that
optimizers does. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9 instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might thnk.  Here's an x86 version.
     .text
     .globl    main
     .cfi_startproc
     movl    $10, a(%rip)
     movl    $10, b(%rip)
     ret
     .cfi_endproc
     .size    main, .-main
     .comm    b,4,4
     .comm    a,4,4
     .ident    "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
---------------------------------------
     .text
     .globl    main
     .cfi_startproc
     movl    $10, a(%rip)
     movl    a(%rip), %eax
     movl    a(%rip), %eax
     movl    %eax, b(%rip)
     ret
     .cfi_endproc
     .size    main, .-main
     .comm    b,4,4
     .comm    a,4,4
     .ident    "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
------------------------------------------
<     movl    $10, b(%rip)
---
 >     movl    a(%rip), %eax
 >     movl    a(%rip), %eax
 >     movl    %eax, b(%rip)
Most interesting.
bill
OK. since b isn't used for anything between the two assignments,
the first movl is removed. But the final value is still frmo the
second read of a, anyway.
Try with using b for something in between the assignments.
int a;
int b;
int c, d;
void main() {
  a = 10;
  b = a;
  c = b;
  b = a;  (Probably removed by the optimizer...)
  d = b;
}
int volatile a;
int b;
int c, d;
void main() {
  a = 10;
  b = a;
  c = b;
  b = a;
  d = b;
}
Anyway, it is obvious that "volatile" does change the way the
compiler/optimizer builds the code. As expected, of course.
I thought the funnier part was in the first one where the
optimizer decided that moving a to b was not necessary and
it merely assigned the constant value 10 to both of them.

bill
chris
2021-10-03 15:51:28 UTC
Post by Bill Gunshannon
Post by Jan-Erik Söderholm
Post by Bill Gunshannon
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but that common ?.
Optimizing out code not needed is one of the most common things that
optimizers does. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9
instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might thnk. Here's an x86 version.
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl $10, b(%rip)
ret
.cfi_endproc
.size main, .-main
.comm b,4,4
.comm a,4,4
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
---------------------------------------
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
ret
.cfi_endproc
.size main, .-main
.comm b,4,4
.comm a,4,4
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
------------------------------------------
< movl $10, b(%rip)
---
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
Most interesting.
bill
OK. since b isn't used for anything between the two assignments,
the first movl is removed. But the final value is still frmo the
second read of a, anyway.
Try with using b for something in between the assignments.
int a;
int b;
int c, d;
void main() {
a = 10;
b = a;
c = b;
b = a; (Probably removed by the optimizer...)
d = b;
}
int volatile a;
int b;
int c, d;
void main() {
a = 10;
b = a;
c = b;
b = a;
d = b;
}
Anyway, it is obvious that "volatile" does change the way the
compiler/optimizer builds the code. As expected, of course.
I thought the funnier part was in the first one where the
optimizer decided that moving a to b was not necessary and
it merely assigned the constant value 10 to both of them.
bill
Immediate addressing mode is usually faster than register indirect,
or any mode where the CPU has to evaluate an address expression...

Chris
Dave Froble
2021-10-03 18:18:18 UTC
Post by Bill Gunshannon
Post by Jan-Erik Söderholm
Post by Bill Gunshannon
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but that common ?.
Optimizing out code not needed is one of the most common things that
optimizers does. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9
instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might thnk. Here's an x86 version.
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl $10, b(%rip)
ret
.cfi_endproc
.size main, .-main
.comm b,4,4
.comm a,4,4
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
---------------------------------------
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
ret
.cfi_endproc
.size main, .-main
.comm b,4,4
.comm a,4,4
.ident "GCC: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0"
------------------------------------------
< movl $10, b(%rip)
---
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
Most interesting.
bill
OK. since b isn't used for anything between the two assignments,
the first movl is removed. But the final value is still frmo the
second read of a, anyway.
Try with using b for something in between the assignments.
int a;
int b;
int c, d;
void main() {
a = 10;
b = a;
c = b;
b = a; (Probably removed by the optimizer...)
d = b;
}
int volatile a;
int b;
int c, d;
void main() {
a = 10;
b = a;
c = b;
b = a;
d = b;
}
Anyway, it is obvious that "volatile" does change the way the
compiler/optimizer builds the code. As expected, of course.
I thought the funnier part was in the first one where the
optimizer decided that moving a to b was not necessary and
it merely assigned the constant value 10 to both of them.
bill
What I find a bit funny is that all an optimizer does is correct poorly
written code. People get into habits, and it can be hard to break some
habits.

As an example, back in the day, the RSTS Basic+ interpreter had an
interesting quirk. For example:

A = "abc"
B = A

One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.

Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)

What people learned to do is:

B = A + ""

That would ensure moving of the data, not the pointer. With BP2 and
later, this odd operation was not an issue. But, even today, I find
people still using that old habit of appending the null string to an
operation.

Bad habits are hard to kill.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Lawrence D’Oliveiro
2021-10-03 23:15:46 UTC
Post by Dave Froble
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
Isn’t that what LSET and RSET were for?
Dave Froble
2021-10-04 01:43:24 UTC
Post by Lawrence D’Oliveiro
Post by Dave Froble
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
Isn’t that what LSET and RSET were for?
Yes. Not much used anymore.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Lawrence D’Oliveiro
2021-10-04 07:04:37 UTC
Post by Lawrence D’Oliveiro
Post by Dave Froble
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
Isn’t that what LSET and RSET were for?
Yes. Not much used anymore.
But it is the correct solution, within the BASIC-PLUS family at least, to the problem of updating a string within an existing buffer, is it not? Rather than resorting to non-obvious hacky magic tricks like concatenating a null string.
Dave Froble
2021-10-04 16:21:04 UTC
Post by Lawrence D’Oliveiro
Post by Lawrence D’Oliveiro
Post by Dave Froble
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
Isn’t that what LSET and RSET were for?
Yes. Not much used anymore.
But it is the correct solution, within the BASIC-PLUS family at least, to the problem of updating a string within an existing buffer, is it not? Rather than resorting to non-obvious hacky magic tricks like concatenating a null string.
Well, I don't know, it is not an issue, and hasn't been for a long time.

The point I was making is that people still use the +"" thing when it is
not needed, and it is quite likely something an optimizer might omit. In
at least some cases, optimization might be compensating for poor
programming practices.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
chris
2021-10-03 23:32:24 UTC
Post by Dave Froble
What I find a bit funny is all an optimizer does is corrects poorly
written code. People get into habits, and it can be hard to break some
habits.
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
B = A + ""
That would insure moving of the data, not the pointer. With BP2 and
later, this odd operation was not an issue. But, even today, I find
people still using that old habit of appending the null string to an
operation.
Bad habits are hard to kill.
Interesting. I was never familiar with Basic, but it looks like an empty
string is being concatenated onto the existing string, forcing
a read of the original and a subsequent write of B, ensuring
consistency? So what happens if the null string is omitted?...

Chris
Dave Froble
2021-10-04 01:44:52 UTC
Post by chris
Post by Dave Froble
What I find a bit funny is all an optimizer does is corrects poorly
written code. People get into habits, and it can be hard to break some
habits.
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
B = A + ""
That would insure moving of the data, not the pointer. With BP2 and
later, this odd operation was not an issue. But, even today, I find
people still using that old habit of appending the null string to an
operation.
Bad habits are hard to kill.
Interesting. Never familiar with Basic, but looks like an empty
string is being concatenated onto the existing string and forcing
a read of the original and subsequent write of B, ensuring
consistency ? So what happens if the null string is omitted ?...
Chris
Now, nothing unusual. That was a Basic+ quirk. All compiled Basic does
the right thing.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-10-04 06:32:31 UTC
Post by Dave Froble
What I find a bit funny is all an optimizer does is corrects poorly
written code. People get into habits, and it can be hard to break some
habits.
Optimisers do quite a bit more than that. :-)
Post by Dave Froble
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
That's known as copy by reference instead of copy by value and has
nothing to do with the optimiser. This is part of the language
semantics and each language makes its own decisions in this area.
Post by Dave Froble
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
B = A + ""
That would insure moving of the data, not the pointer. With BP2 and
later, this odd operation was not an issue. But, even today, I find
people still using that old habit of appending the null string to an
operation.
What that does is to create a custom string and point to it, thereby
giving you the same end result as a copy by value.

In some languages, you can force a copy by value on the original
variable (and then you are into all the shallow copy versus deep
copy fun. :-))

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dave Froble
2021-10-04 16:23:05 UTC
Post by Simon Clubley
Post by Dave Froble
What I find a bit funny is all an optimizer does is corrects poorly
written code. People get into habits, and it can be hard to break some
habits.
Optimisers do quite a bit more than that. :-)
Post by Dave Froble
As an example, back in the day, the RSTS Basic+ interpreter had an
A = "abc"
B = A
One would expect the value in A to be placed in the location where the
pointer to B points. However, Basic+ would change the pointer to B to
the value of the pointer to A, thus losing the old location of B. I
think this happened with strings, don't really remember, it has been a
very long time.
That's known as copy by reference instead of copy by value and has
nothing to do with the optimiser. This is part of the language
semantics and each language makes its own decisions in this area.
Post by Dave Froble
Now consider that the pointer to B was in an I/O buffer. After the
operation, B would no longer be pointing into the I/O buffer. Perhaps
not such a good thing. (Actually a horrible thing!)
B = A + ""
That would ensure moving of the data, not the pointer. With BP2 and
later, this odd operation was not an issue. But, even today, I find
people still using that old habit of appending the null string to an
operation.
What that does is to create a custom string and point to it, thereby
giving you the same end result as a copy by value.
In some languages, you can force a copy by value on the original
variable (and then you are into all the shallow copy versus deep
copy fun. :-))
Simon.
Can't you just agree that using such an operation when it is not
required is poor programming practice?
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Simon Clubley
2021-10-04 17:37:45 UTC
Permalink
Post by Dave Froble
Can't you just agree that using such an operation when it is not
required is poor programming practice?
I was addressing the other issues raised in your post, not this one,
but on the face of it, if copying of strings is always done by value
these days, then the operation appears redundant in an obviously
deterministic way, provided there are no side effects from doing this.

BTW, I don't know DEC Basic, but I have one question based on other
languages. While copying a string in this way appears redundant based
on what you are saying, are there any implicit type conversions that
can be done by adding an empty string to the other value ?

For example, can you convert an integer to a string in this way ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Stephen Hoffman
2021-10-04 18:15:08 UTC
Permalink
Post by Simon Clubley
BTW, I don't know DEC Basic, but I have one question based on other
languages. While copying a string in this way appears redundant based
on what you are saying, are there any implicit type conversions that
can be done by adding an empty string to the other value ?
For example, can you convert an integer to a string in this way ?
Various BASIC compilers including IIRC VB can allow that syntax, but
DEC BASIC doesn't. There are in-line string conversion functions.

DEC BASIC was in various ways both more strict and more flexible than
its BP2 predecessor, including the addition of the OPTION TYPE =
EXPLICIT setting, and the elimination of most line numbers.
--
Pure Personal Opinion | HoffmanLabs LLC
Simon Clubley
2021-10-04 21:37:03 UTC
Permalink
Post by Stephen Hoffman
Post by Simon Clubley
BTW, I don't know DEC Basic, but I have one question based on other
languages. While copying a string in this way appears redundant based
on what you are saying, are there any implicit type conversions that
can be done by adding an empty string to the other value ?
For example, can you convert an integer to a string in this way ?
Various BASIC compilers including IIRC VB can allow that syntax, but
DEC BASIC doesn't. There are in-line string conversion functions.
DEC BASIC was in various ways both more strict and more flexible than
its BP2 predecessor, including the addition of the OPTION TYPE =
EXPLICIT setting, and the elimination of most line numbers.
Thanks Stephen. It's been a _very_ long time since I last used Basic.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Simon Clubley
2021-10-04 21:35:55 UTC
Permalink
Post by Simon Clubley
BTW, I don't know DEC Basic, but I have one question based on other
languages. While copying a string in this way appears redundant based
on what you are saying, are there any implicit type conversions that
can be done by adding an empty string to the other value ?
For example, can you convert an integer to a string in this way ?
You could try.
:-)

Simon doesn't have DEC Basic installed and forgot it's installed on
Eisner...
s = v + ""
..............^
%BASIC-E-ILLMODMIX, illegal mode mixing
Thanks for trying the syntax and posting the result Arne.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Dave Froble
2021-10-04 21:56:02 UTC
Permalink
Post by Simon Clubley
Post by Dave Froble
Can't you just agree that using such an operation when it is not
required is poor programming practice?
I was addressing the other issues raised in your post, not this one,
but on the face of it, if copying of strings is always done by value
these days, then the operation appears redundant in an obviously
deterministic way, provided there are no side effects from doing this.
BTW, I don't know DEC Basic, but I have one question based on other
languages. While copying a string in this way appears redundant based
on what you are saying, are there any implicit type conversions that
can be done by adding an empty string to the other value ?
For example, can you convert an integer to a string in this way ?
Simon.
No.

That whole story was to show that once a person learns something, it's
hard to break the habit even when it's no longer necessary.

A string concatenation is not a trivial operation. I hate to think of
the wasted CPU cycles.
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Dave Froble
2021-10-04 21:58:08 UTC
Permalink
Post by Simon Clubley
BTW, I don't know DEC Basic, but I have one question based on other
languages. While copying a string in this way appears redundant based
on what you are saying, are there any implicit type conversions that
can be done by adding an empty string to the other value ?
For example, can you convert an integer to a string in this way ?
You could try.
s = v + ""
..............^
%BASIC-E-ILLMODMIX, illegal mode mixing
Arne
Basic uses built-in functions for such.

B$ = NUM1$(123)
B = VAL(B$)
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
Lawrence D’Oliveiro
2021-10-05 00:30:11 UTC
Permalink
Post by Simon Clubley
For example, can you convert an integer to a string in this way ?
Languages which provide that facility inevitably end up regretting it.
Arne Vajhøj
2021-10-05 19:42:17 UTC
Permalink
Post by Lawrence D’Oliveiro
Post by Simon Clubley
For example, can you convert an integer to a string in this way ?
Languages which provide that facility inevitably end up regretting it.
Various VB flavors, Java, PHP, C#, etc. do not seem to have regretted
it yet.

Arne

chris
2021-10-03 15:47:58 UTC
Permalink
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but is that common?
Optimizing out code not needed is one of the most common things that
optimizers do. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9 instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might think. Here's an x86 version.
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl $10, b(%rip)
ret
It's been years since I did x86 asm, and it's more complicated now: there is
Intel asm format for Windows and AT&T format for Linux, so I assume this
is AT&T <inst> <src> <dest> format.

So 10 copied to a and b, rather than first to a, then b from a, as
per source. Second copy optimised out, as expected.
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
ret
.cfi_endproc
So in this case, gone round the houses a bit more, using
an intermediate register, but faithfully retained the
redundant second copy.

One of the points I was trying make earlier was that code
should never depend on optimisation level, and for app
code, there are better high level methods for
controlling shared access, such as signals or callbacks
to encapsulate an async process.

Good tech discussion this though :-)...

Chris
Most interesting.
bill
Jan-Erik Söderholm
2021-10-03 18:53:23 UTC
Permalink
Post by chris
Post by Jan-Erik Söderholm
Post by chris
So, it really comes down to optimiser choice of instruction, depending
on the use of the volatile keyword. Perhaps optimisers might optimise
such reads out, but is that common?
Optimizing out code not needed is one of the most common things that
optimizers do. A very short example.
int a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
The second assignment to b will simply be removed since it is
a duplicate of the first and the value of a has not changed, at
least not as far as the compiler can see.
int volatile a;
int b;
void main() {
a = 10;
b = a;
b = a;
}
This tells the compiler that the value of a can change outside of
the current compile unit and both assignments will be left to be
done and the value of b could be different after each assignment.
The first generates 7 Alpha instructions and the second 9 instructions.
Would be interesting to see instructions generated, so we can see what
is happening. Instruction count alone doesn't say much.
More interesting than you might think. Here's an x86 version.
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl $10, b(%rip)
ret
It's been years since I did x86 asm, and it's more complicated now: there is
Intel asm format for Windows and AT&T format for Linux, so I assume this is
AT&T <inst> <src> <dest> format.
So 10 copied to a and b, rather than first to a, then b from a, as
per source. Second copy optimised out, as expected.
.text
.globl main
.cfi_startproc
movl $10, a(%rip)
movl a(%rip), %eax
movl a(%rip), %eax
movl %eax, b(%rip)
ret
.cfi_endproc
So in this case, gone round the houses a bit more, using
an intermediate  register, but faithfully retained the
redundant second copy.
Well, the whole point of this discussion is that the second
copy is *NOT* redundant. Adding "volatile" makes that clear.

The value of a *could* have changed between the two reads.
That is exactly what "volatile" is all about.
Post by chris
One of the points I was trying make earlier was that code
should never depend on optimisation level,...
If you add "volatile" where it is needed, you do not need to
depend on any optimisation level. And one "volatile" too
many is usually better than one too few.
Post by chris
and for app
code, there are better high level methods for
controlling shared access, such as signals or callbacks
to encapsulate an async process.
If a had been pointing to some hardware register, that is
updated by the hardware and not by other code, no signals,
callbacks or any other coding technique can change that.
*The* way (in C) is to add "volatile" to the variable.

It has nothing to do with coding style or such, only with
proper understanding of the context the code is running in.
Post by chris
Good tech discussion this though :-)...
Chris
Most interesting.
bill
chris
2021-09-24 14:28:46 UTC
Permalink
If you look at that use of volatile, it's dealing with sig_atomic,
which I would guess to be an interface to a test and set instruction,
which is designed to be indivisible and non interuptable. That is,
the whole instruction always executes to completion.
More like driver level code, not application, where such
functionality would normally be encapsulated into a system call.
Volatile is also set (quite correctly) on the buffer itself.
Simon.
Why? While the compiler will typically pad out structures to align
each element to the machine wordsize, the use of volatile to define
that buffer is redundant, since no optimisation would apply to
that structure definition anyway.

Use structure overlays for machine register access all the time here
and would never use the volatile keyword for any of it. Machine
register structure pointers, yes, volatile is appropriate and
necessary there...

Chris
Bob Gezelter
2021-09-22 19:58:16 UTC
Permalink
Post by Simon Clubley
Post by John Reagan
Simon,
Since the days of RSX-11M, I have been dealing with client bugs in this area. The best phrasing I have seen in this area was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."
In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."
That isn't the concern Bob.
The concern is, given the highly asynchronous nature of VMS I/O and
of some VMS system calls in general, and given the more aggressive
LLVM optimiser, does the generated code always correctly re-read the
current contents of buffers and variables without having to mark those
buffers/variables as volatile ?
Or are there enough sequence points in VMS application code where these
buffers and variables are accessed that this may turn out not to be a
problem in most cases ?
In essence, the VMS system call and I/O system is behaving much more
like the kinds of things you see in embedded bare-metal programming
than in the normal synchronous model you see in the Unix world.
There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)
Simon.
--
Walking destinations on a map are further away than they appear.
Simon,

Good technical question.

In general, optimizers work within basic blocks. The example of concern is not a single basic block.

A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.

The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.

- Bob Gezelter, http://www.rlgsc.com
Simon Clubley
2021-09-22 20:25:59 UTC
Permalink
Post by John Reagan
Simon,
Good technical question.
In general, optimizers work within basic blocks. The example of concern is not a single basic block.
A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.
The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.
But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.

From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.

How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of an highly aggressive optimising compiler ?

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2021-09-22 23:22:39 UTC
Permalink
Post by Simon Clubley
Post by Bob Gezelter
In general, optimizers work within basic blocks. The example of concern is not a single basic block.
A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.
The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.
But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.
From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.
How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of an highly aggressive optimising compiler ?
This is a bit outside my area of expertise.

But wouldn't a flow like:
- call SYS$QIO with buffer
- do something
- wait for IO to complete
- __MB()
- use buffer
work?

Arne
Jan-Erik Söderholm
2021-09-22 23:53:38 UTC
Permalink
Post by Arne Vajhøj
Post by Simon Clubley
Post by Bob Gezelter
In general, optimizers work within basic blocks. The example of concern
is not a single basic block.
A basic block is a section of code with one entry and one exit. Simple
IF statements fall within that category. However, any out-of-line code
invocation does not.
The presence of the SYS$QIO system service, which one way or another
involves a CALL, ends the basic block, as the optimizer cannot know what
is modified by the out-of-line call or its descendants.
But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.
 From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.
How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of an highly aggressive optimising compiler ?
This is a bit outside my area of expertise.
- call SYS$QIO with buffer
- do something
- wait for IO to complete
- __MB()
- use buffer
work?
Arne
One important point is of course to not use the buffer until the
QIO related to that buffer completes. But that is not really what
Simon is talking/asking about.

Simon is referring to the well-known issue on platforms where a
variable can directly refer to some data that might get updated
outside of the control of the application code (and which can
therefore not be analysed by a compiler optimizer).

One very common case is where a variable refers to a "port"
on a microcontroller. The port is connected to some real life
equipment such as push buttons, relays or whatever. Those items
can be handled totally outside the control of the application code.

In those cases, it is very common that the compiler says "this
variable has not been updated, so I'll just use the value from
the last read that I already have in a register anyway", and
then misses a push button being pressed.

That is where you say "volatile" to disable any such optimization and
force the compiler to always re-read the variable from the source.

ASTs do, in a way, look like this: some data (the buffer) in the app
is changed without this being obvious from just looking at the code.
And looking at the code is all the compiler does. That is what Simon is asking about.

I am very well aware of these issues with microcontrollers from a long
time programming "8-bitters" in the Microchip PIC family. I have no
idea how this relates to ASTs...


Jan-Erik.
Arne Vajhøj
2021-09-23 00:07:15 UTC
Permalink
Post by Jan-Erik Söderholm
Post by Arne Vajhøj
Post by Simon Clubley
Post by Bob Gezelter
In general, optimizers work within basic blocks. The example of
concern is not a single basic block.
A basic block is a section of code with one entry and one exit.
Simple IF statements fall within that category. However, any
out-of-line code invocation does not.
The presence of the SYS$QIO system service, which one way or another
involves a CALL, ends the basic block, as the optimizer cannot know
what is modified by the out-of-line call or its descendants.
But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.
 From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.
How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of an highly aggressive optimising compiler ?
This is a bit outside my area of expertise.
- call SYS$QIO with buffer
- do something
- wait for IO to complete
- __MB()
- use buffer
work?
One important point is of course to not use the buffer until the
QIO related to that buffer completes. But that is not really what
Simon is talking/asking about.
That is a given. And why I had the "wait for IO to complete".
Post by Jan-Erik Söderholm
Simon is referring to the well-known issue on platforms where a
variable can directly refer to some data that might get updated
outside of the control of the application code (and which can
therefore not be analysed by a compiler optimizer).
One very common case is where a variable refers to a "port"
on a microcontroller. The port is connected to some real life
equipment such as push buttons, relays or whatever. Those items
can be handled totally outside the control of the application code.
In those cases, it is very common that the compiler says "this
variable has not been updated, so I'll just use the value from
the last read that I already have in a register anyway", and
then misses a push button being pressed.
That is where you say "volatile" to disable any such optimization and
force the compiler to always re-read the variable from the source.
Yes.

But instead of spreading out volatile keyword wouldn't
__MB() do the same? (on VMS - it is a VMS C specific thing
I believe)

Arne
Jan-Erik Söderholm
2021-09-23 00:16:03 UTC
Permalink
Post by Arne Vajhøj
Post by Jan-Erik Söderholm
Post by Arne Vajhøj
Post by Simon Clubley
Post by Bob Gezelter
In general, optimizers work within basic blocks. The example of
concern is not a single basic block.
A basic block is a section of code with one entry and one exit. Simple
IF statements fall within that category. However, any out-of-line code
invocation does not.
The presence of the SYS$QIO system service, which one way or another
involves a CALL, ends the basic block, as the optimizer cannot know
what is modified by the out-of-line call or its descendants.
But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.
 From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.
How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of an highly aggressive optimising compiler ?
This is a bit outside my area of expertise.
- call SYS$QIO with buffer
- do something
- wait for IO to complete
- __MB()
- use buffer
work?
One important point is of course to not use the buffer until the
QIO related to that buffer completes. But that is not really what
Simon is talking/asking about.
That is a given. And why I had the "wait for IO to complete".
Post by Jan-Erik Söderholm
Simon is referring to the well-known issue on platforms where a
variable can directly refer to some data that might get updated
outside of the control of the application code (and which can
therefore not be analysed by a compiler optimizer).
One very common case is where a variable refers to a "port"
on a microcontroller. The port is connected to some real life
equipment such as push buttons, relays or whatever. Those items
can be handled totally outside the control of the application code.
In those cases, it is very common that the compiler says "this
variable has not been updated, so I'll just use the value from
the last read that I already have in a register anyway", and
then misses a push button being pressed.
That is where you say "volatile" to disable any such optimization and
force the compiler to always re-read the variable from the source.
Yes.
But instead of spreading out volatile keyword wouldn't
__MB() do the same? (on VMS - it is a VMS C specific thing
I believe)
Arne
Sorry. Seems to be something about a "memory barrier". I don't know
what that is in this context and if it works as a volatile.
Simon Clubley
2021-09-23 12:24:36 UTC
Permalink
Post by Jan-Erik Söderholm
Post by Arne Vajhøj
But instead of spreading out volatile keyword wouldn't
__MB() do the same? (on VMS - it is a VMS C specific thing
I believe)
Sorry. Seems to be something about a "memory barrier". I don't know
what that is in this context and if it works as an volatile.
Assuming __MB() is a hardware memory barrier operation, the answer is no.

Memory barrier instructions are designed to get bits of hardware back
into sync with each other so that when you read something it is valid.

Volatile OTOH is a purely software construct and is used to tell the
compiler to insert bits of code into the generated code to _always_
first re-read the memory location even if the compiler thinks from
looking at the source code that the value could not have changed.

There's no point getting the hardware back into sync, if the generated
code is missing the bit to then unconditionally read that value again
before doing something with the variable.

Even if memory barriers could be made to do the same thing somehow
by having the compiler look for them, they would have to be inserted
within the executable code. Volatile however is a variable definition
attribute so only ever appears in the source code when the volatile
variables are initially defined.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Arne Vajhøj
2021-09-24 15:41:39 UTC
Permalink
Post by Simon Clubley
Post by Jan-Erik Söderholm
Post by Arne Vajhøj
But instead of spreading out volatile keyword wouldn't
__MB() do the same? (on VMS - it is a VMS C specific thing
I believe)
Sorry. Seems to be something about a "memory barrier". I don't know
what that is in this context and if it works as an volatile.
Assuming __MB() is a hardware memory barrier operation, the answer is no.
Memory barrier instructions are designed to get bits of hardware back
into sync with each other so that when you read something it is valid.
Volatile OTOH is a purely software construct and is used to tell the
compiler to insert bits of code into the generated code to _always_
first re-read the memory location even if the compiler thinks from
looking at the source code that the value could not have changed.
There's no point getting the hardware back into sync, if the generated
code is missing the bit to then unconditionally read that value again
before doing something with the variable.
Even if memory barriers could be made to do the same thing somehow
by having the compiler look for them, they would have to be inserted
within the executable code. Volatile however is a variable definition
attribute so only ever appears in the source code when the volatile
variables are initially defined.
__MB should ensure that when the code reads from memory
it gets the latest value.

A buffer with 100 or 1000 or 10000 bytes can not be in
a register (at least not on x86-64) so reading the buffer
will mean reading from memory.

And if __MB ensure that reading from memory will get the
latest value then ...

Arne
Simon Clubley
2021-09-24 18:54:40 UTC
Permalink
Post by Arne Vajhøj
Post by Simon Clubley
Post by Jan-Erik Söderholm
Post by Arne Vajhøj
But instead of spreading out volatile keyword wouldn't
__MB() do the same? (on VMS - it is a VMS C specific thing
I believe)
Sorry. Seems to be something about a "memory barrier". I don't know
what that is in this context and if it works as an volatile.
Assuming __MB() is a hardware memory barrier operation, the answer is no.
Memory barrier instructions are designed to get bits of hardware back
into sync with each other so that when you read something it is valid.
Volatile OTOH is a purely software construct and is used to tell the
compiler to insert bits of code into the generated code to _always_
first re-read the memory location even if the compiler thinks from
looking at the source code that the value could not have changed.
There's no point getting the hardware back into sync, if the generated
code is missing the bit to then unconditionally read that value again
before doing something with the variable.
Even if memory barriers could be made to do the same thing somehow
by having the compiler look for them, they would have to be inserted
within the executable code. Volatile however is a variable definition
attribute so only ever appears in the source code when the volatile
variables are initially defined.
__MB should ensure that when the code reads from memory
it gets the latest value.
__MB() puts the hardware holding the contents of that variable into sync.
Volatile OTOH puts the generated code for that variable into sync by not
caching the variable but instead re-reading it every time.

IOW this only works if the code _does_ read from memory every time
at which point you don't need the memory barrier anyway, at least not
for this. You may still need a memory barrier down inside the device
drivers or in the kernel, but that's nothing to do with working around
the compiler generating code to cache the variable instead of re-reading
it every time.

If you could somehow get this to work, you would also have to manually
insert the __MB() instructions throughout your code instead of just
tagging the variable as volatile and letting the compiler add code
to do a re-read automatically.
Post by Arne Vajhøj
A buffer with 100 or 1000 or 10000 bytes can not be in
a register (at least not on x86-64) so reading the buffer
will mean reading from memory.
That's non-deterministic. What if the code only looks at the first
longword in the buffer ? A longword that it looked at previously
when the buffer had previous contents ? Oops... :-)

A __MB() call here would make no difference to that behaviour.

Simon.
--
Simon Clubley, ***@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.
Lawrence D’Oliveiro
2021-09-23 01:58:40 UTC
Permalink
Post by Jan-Erik Söderholm
One very common case is where a variable refers to an "port"
on a microcontroller. The port is connected to some real life
equipment such as push buttons, relays or what ever. Those items
can be handled totaly out of control from the application code.
This is memory-mapped I/O: those addresses don’t actually access real memory, and do not have normal memory semantics.

Some CPU architectures use special I/O instructions for this purpose, and won’t have this problem. This is why you have philosophical* debates about which is the better approach ...

*maybe even verging on religious
John Reagan
2021-09-23 00:18:07 UTC
Permalink
Post by Simon Clubley
Post by John Reagan
Simon,
Good technical question.
In general, optimizers work within basic blocks. The example of concern is not a single basic block.
A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.
The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.
But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.
From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.
How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of an highly aggressive optimising compiler ?
Simon.
--
Walking destinations on a map are further away than they appear.
So LLVM and gcc are good optimizers. They battle with each other all the time for everybody's benefit. However, GEM is a pretty good optimizer too. Alpha GEM code is really good; it started wobbly with EV4 but grew into a tight code generator. GEM Itanium does not take advantage of the machine's speculative or advance loads, which puts it behind the HP-UX compiler, but it holds its own.

LLVM has many optimization passes for specific targets; the whole list is not run on every target.

For LLVM, it is about way more than just "volatile" loads and stores; optimizers need much more information than that. Look at:

https://llvm.org/docs/AliasAnalysis.html
https://llvm.org/docs/Passes.html
https://llvm.org/docs/Atomics.html
https://llvm.org/docs/LangRef.html#tbaa-metadata
https://llvm.org/docs/MemorySSA.html


BTW, GEM also has a TBAA mechanism which uses callbacks from GEM back to the frontend to ask about alias-information. It allows the alias analysis to be specific to the language semantics.
chris
2021-09-24 22:16:27 UTC
Permalink
Post by Simon Clubley
There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)
Simon.
Primarily because embedded spends a lot of time accessing hardware
registers directly, for example:

volatile unsigned char *ttyport = (volatile unsigned char*) TTY_PORT

which assigns a numeric value to the pointer and tells the compiler
not to optimise it away, nor change the value.

Something application level code should rarely, if ever, see...

Chris
Dave Froble
2021-09-24 23:06:42 UTC
Permalink
Post by chris
Post by Simon Clubley
There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)
Simon.
Primarily because embedded spends a lot of time accessing hardware
volatile unsigned char *ttyport = (volatile unsigned char*) TTY_PORT
which assigns a numeric value to the pointer and tells the compiler
not to optimise it away, nor change the value.
Something application level code should rarely, if ever, see...
Well, now, that sort of depends on your definition of "application
level code", doesn't it?

I sometimes design/write stuff where I consider such issues.

Of course I write it in Basic, not that shitty C stuff. Basic seems to
usually get things right. (Don't tell John I wrote that, he'll hold it
against me when I ask him to fix Basic.)

:-)
--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: ***@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486
chris
2021-09-26 17:33:02 UTC
Permalink
Post by chris
Post by Simon Clubley
There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)
Simon.
Primarily because embedded spends a lot of time accessing hardware
volatile unsigned char *ttyport = (volatile unsigned char*) TTY_PORT
which assigns a numeric value to the pointer and tells the compiler
not to optimise it away, nor change the value.
Something application level code should rarely, if ever, see...
Well, now, that sort of depends on your definition of "application level
code", doesn't it?
I sometimes design/write stuff where I consider such issues.
Of course I write it in Basic, not that shitty C stuff. Basic seems to
usually get things right. (Don't tell John I wrote that, he'll hold it
against me when I ask him to fix Basic.)
:-)
I never took to Basic, but it does have keywords like peek and poke to
shoot through all the OS layers and protections right down to the
hardware :-). If I had to program Basic, I think I would try to avoid
that. Fine on an Apple II at the time though...

C is just the same of course, but it was designed as a systems
programming language from the start...

Chris
gah4
2021-09-24 23:52:56 UTC
Permalink
Post by John Reagan
Post by Simon Clubley
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.
VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).
With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?
Just curious if there's any places in code running on VMS x86-64 that will
need to cleaned up to do things in the correct way that you would have
got away with doing less correctly previously.
Simon
--
Walking destinations on a map are further away than they appear.
Simon,
Since the days of RSX-11M, I have been dealing with client bugs in this area.
The best phrasing I have seen in this area was in an IBM System/360 Principles of
Operation manual. It may have only appeared in certain editions, as I cannot
find the precise reference. However, it was along the lines of "the contents of a
buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation
until the operation has completed with the device end signal from the device."
That would apply at the hardware level.

But BSAM (and other B...) I/O operations are almost at that level.
(QSAM is different, and at a higher level.)

A BSAM I/O call (which is pretty much a subroutine call into an OS routine)
eventually does an EXCP, which then does the appropriate SIO to start the
I/O operation. The program should then (later) WAIT for it to complete,
and as with the description above, the buffer is undefined.

(That should be described in the appropriate manual for I/O macros.)

For the 360/85, and many S/370 models, cache made things more
interesting. Since I/O operations go directly to memory, when does
the cache get updated?

I do wonder, though, if QIO is closer to the OS/360 Queued access
methods, like QSAM.

Note also that BSAM allows for, and also PL/I, locate mode I/O
where the hardware reads/writes directly from the actual data
arrays, without any intermediate buffering. (That only works
for contiguous data.)

I am not sure how VMS does I/O buffering.