Discussion:
[hercules-390] Problem with Instrate
'Jean-Louis Noel' jln@stben.net [hercules-390]
2017-03-06 13:55:47 UTC
Hi Everyone,

I have a problem with INSTRATE:

- 14.32.21 JOB02331 $HASP373 INSTRATE STARTED - INIT 1 - CLASS A - SYS
- SYS1
- 14.32.21 JOB02331 +INSTRATE STARTED
- 14.32.21 JOB02331 + 1 intel i7, z/os 1.10, 4.00
- 14.32.21 JOB02331 + Ranked by usage (general)
- 14.32.21 JOB02331 + Samples = 9
- 14.32.21 JOB02331 + Loops = 1000000
- 14.32.21 JOB02331 + Replicate = 1
- 14.32.21 JOB02331 + Tolerate% = 10
- 14.32.21 JOB02331 + Pgm addr = 00007000
- 14.32.21 JOB02331 + DATA addr = 00007A38
- 14.32.21 JOB02331 + DATAB addr = 000083F8
- 14.32.21 JOB02331 + DATAP addr = 00007FF8
- 14.32.21 JOB02331 + Instruction addr =000090148
- 14.32.21 JOB02331 +Description MIPS nS Samples
- 14.32.22 JOB02331 +BCTR Rx,Rloop (refference loop) 64.11 15 21
- 14.32.25 JOB02331 +ELAPSED time exceeds 128 seconds
00- 14.32.27 JOB02331 IEF404I INSTRATE - ENDED - TIME=14.32.27
14.32.27 JOB02331 $HASP150 INSTRATE OUTGRP=2.1.1 ON PRT1 10,011 (10,
011) RECORDS

The assumption ‘+ELAPSED time exceeds 128 seconds’ is a bit overstated!

You can see the dump, if needed, at: http://www.stben.net/hercules/INSTRATE.IBMUSER.j02331cA.pdf

Thanks for your help.

Bye,
Jean-Louis
'\'Fish\' (David B. Trout)' david.b.trout@gmail.com [hercules-390]
2017-03-06 17:27:40 UTC
Wrong forum.

Try posting your question here instead:


https://groups.yahoo.com/neo/groups/H390-MVS/info
--
"Fish" (David B. Trout)
Software Development Laboratories
http://www.softdevlabs.com
mail: ***@softdevlabs.com
'Jean-Louis Noel' jln@stben.net [hercules-390]
2017-03-07 10:50:18 UTC
Hi David,
Post by '\'Fish\' (David B. Trout)' ***@gmail.com [hercules-390]
Wrong forum.
I very much like your unshakable faith in the precision of Hercules' timer.

Yours faithfully,
Jean-Louis
somitcw@yahoo.com [hercules-390]
2017-03-07 13:52:12 UTC
There was no report of a Hercules issue in a timer or other.
There was no Hercules release even mentioned.
There was no hardware even mentioned.
There was no P.C. operating system mentioned.
There was no mention of what you think your issue is.
People trying to guess what you are thinking about might not
get it right.

There was only a dump of an application program displaying
a strange message on the console and then taking a PIC 1

Since the issue as documented is not a VM or DOS/VS issue,
feel free to post on one of those forums.
You can also find thousands of other unrelated forums to
not document your issue on.
'\'Fish\' (David B. Trout)' david.b.trout@gmail.com [hercules-390]
2017-03-07 13:53:16 UTC
Hi Jean-Louis,

I like very much your unintelligible messages posted to the wrong forum.

Yours faithfully,
--
"Fish" (David B. Trout)
Software Development Laboratories
http://www.softdevlabs.com
mail: ***@softdevlabs.com
Maarten Hoes hoes.maarten@gmail.com [hercules-390]
2017-03-07 15:30:54 UTC
Hi Jean-Louis Noel,


I am by no means an expert on these matters, so please forgive me my
ignorance, but :

The subject of your message states that you have a problem with Instrate.
However, it seems that from the body text of your message people cannot
tell what that problem is exactly. You reference the statement "+ELAPSED
time exceeds 128 seconds", which you seem to feel is wrong ?

Perhaps you could elaborate a little with telling what it is you are trying
to achieve, how you go about doing that, what you expect the results to be,
and what the actual results are ?



- Maarten
opplr@hotmail.com [hercules-390]
2017-03-07 16:33:13 UTC
Wow, wow wow,

When did the forum turn into a place which tells someone to go someplace else ?

The fellow who was attempting to find a problem in console.c (? may have that wrong it might have been 3270 emulation ) was literally chased out.

Jean, the problem with INSTRATE is that it was written back when Hercules and/or real iron couldn't run very fast. As processors got much faster, there wasn't enough time between x number of loops for the timer to decrement, so the program sees what looks like no movement of the clock. It then reports the 128 seconds message.

If you ask it to do a lot more loops it may start giving results. The quickly emulated instructions may be the hardest to get to work while the things like MVCL (?) may actually show results as is.

Bumping up the loop count a lot ( x 10 - x 100 or more ) may permit some results also.

All this is from memory and the last time I tinkered with it was years ago.

Phil


'Jean-Louis Noel' jln@stben.net [hercules-390]
2017-03-07 17:34:19 UTC
Hi Maarten, Phil,
Post by ***@hotmail.com [hercules-390]
"+ELAPSED time exceeds 128 seconds", which you seem to feel is wrong ?
INSTRATE is a program from the Files section of this group.
I just assembled it and ran it, and there was an abend.
If you compare the times printed by the system on the left, between the
start and the abend,
they show 2 seconds, yet the program says that 128 seconds have passed.
Post by ***@hotmail.com [hercules-390]
If you ask it to do a lot more loops it may start giving results.
Yes, I now know where the problem lies:
The program, written by Gary Brabiner, calibrates an empty loop.
It then adds one instruction to that loop and derives the time taken to
execute that instruction by subtracting the time of the
empty loop.

But because Hercules depends on the host OS, the previously calibrated time
is too loose an approximation, and on modern hardware the subtraction
can go negative, as the abend shows.

What can be done? I don't know yet. But perhaps Hercules could claim
exclusive use of one core for each processor in the config file, or something
like that. These are only guesses at the moment.

One last piece of information: it runs properly under zPDT and on a z machine.

- Jean-Louis
opplr@hotmail.com [hercules-390]
2017-03-07 19:20:11 UTC
Jean-Louis wrote:

"What can be done? I don't know yet. But, perhaps Hercules could claim
exclusive usage of one core for each processor in config file or something
like that. It's only just guesses at this moment.

Last information it runs properly under zPDT and under a z machine."

Don't believe defining only 1 CPU helps this problem.

If it runs properly and unchanged on both zPDT and z machines, I would say it is a timer/clock problem in Hercules.

Phil

ps - not to say Hercules timer/clock doesn't work properly, maybe it just isn't working properly for the demands of this job

pps - I'd like to have a copy of the results of the runs on those 2 platforms
opplr@hotmail.com [hercules-390]
2017-03-07 19:30:14 UTC
I also just remembered that I had gone into the program to increase only the empty timer loop count so it would be more accurate, then reverted to the specified number of loops for the testing loops.

Set loop value x 10 times or larger multiplier
Run empty timer loop executed much larger number of times than testing loops.
Set loop value back to original specified
Run testing loops
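
The steps above amount to calibrating the empty loop over many more iterations than the test loops use, then scaling the result back down to the test loop count. A minimal sketch of that arithmetic follows; the function name, parameter names and figures are hypothetical and not taken from the INSTRATE source.

```python
def reference_time_for(test_loops, calib_elapsed_us, calib_loops):
    """Scale a long calibration run of the empty loop down to test_loops iterations.

    All names here are illustrative; INSTRATE itself may organise this differently.
    """
    per_loop_us = calib_elapsed_us / calib_loops
    return per_loop_us * test_loops

# Hypothetical figures: 10,000,000 empty iterations took 130,000 microseconds.
ref_us = reference_time_for(1_000_000, 130_000.0, 10_000_000)
print(ref_us)
```

Averaging over ten times as many iterations spreads any single clock jump across more loops, which is presumably why the longer calibration behaves better.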

Phil
'Jean-Louis Noel' jln@stben.net [hercules-390]
2017-03-07 19:43:58 UTC
Hi Phil,
Post by ***@hotmail.com [hercules-390]
then reverting to specified number of loops for testing loops
Yes, it could work. I'll try it.
Because who really cares about the synchronicity? It's only a testing thing.
We are speaking of hundredths of a second here.

Thanks.
- Jean-Louis
opplr@hotmail.com [hercules-390]
2017-03-08 04:15:09 UTC
Hi Jean-Louis,

It may be necessary to adjust Toleration percentage - a lot.

A couple of tests on AMD FX-8320. 8 CPU, Linux running 3.8j 1 CPU defined. TOP shows 36 % of 1 CPU utilized while INSTRATE running.

Original INSTRATE with blank parm for general instructions. Got past the reference loop but hung somewhere; canceled after 3 mins.

04.32.50 JOB 98 + General instruction tests
04.32.50 JOB 98 + Samples = 9
04.32.50 JOB 98 + Loops = 1000000
04.32.50 JOB 98 + Replicate = 1
04.32.50 JOB 98 + Tolerate% = 2
04.32.50 JOB 98 + Pgm addr = 000A5000
04.32.50 JOB 98 + DATA addr = 000A5A38
04.32.50 JOB 98 + DATAB addr = 000A63F8
04.32.50 JOB 98 + DATAP addr = 000A5FF8
04.32.50 JOB 98 + Instruction addr =000A70148
04.32.50 JOB 98 +Description MIPS nS Samples
04.34.25 JOB 98 +BCT Rx,loop 68.57 14 1361
04.35.08 JOB 98 +BCTR Rx,Rloop (refference loop) 121.77 8 609
HHC00008I /c herc01a
04.38.53 IEE301I HERC01A CANCEL COMMAND ACCEPTED

Original INSTRATE with blank parm and toleration adjusted to 30 %. Got past reference loop but immediate abend.

04.41.58 JOB 99 + General instruction tests
04.41.58 JOB 99 + Samples = 9
04.41.58 JOB 99 + Loops = 1000000
04.41.58 JOB 99 + Replicate = 1
04.41.58 JOB 99 + Tolerate% = 30
04.41.58 JOB 99 + Pgm addr = 000A5000
04.41.58 JOB 99 + DATA addr = 000A5A38
04.41.58 JOB 99 + DATAB addr = 000A63F8
04.41.58 JOB 99 + DATAP addr = 000A5FF8
04.41.58 JOB 99 + Instruction addr =000A70148
04.41.58 JOB 99 +Description MIPS nS Samples
04.41.59 JOB 99 +BCT Rx,loop 66.39 15 15
04.42.01 JOB 99 +BCTR Rx,Rloop (refference loop) 74.96 13 19
04.42.01 JOB 99 +ELAPSED time exceeds 128 seconds

Sampling much better this time around as toleration loosened. 1st attempt may have hung up in sampling and not making it to 2 min abend.

Original INSTRATE with X parm original 2% toleration. Working but didn't want MVC timings so canceled.

04.24.18 JOB 97 + Testing Instructions
04.24.18 JOB 97 + Samples = 9
04.24.18 JOB 97 + Loops = 1000000
04.24.18 JOB 97 + Replicate = 1
04.24.18 JOB 97 + Tolerate% = 2
04.24.18 JOB 97 + Pgm addr = 000A5000
04.24.18 JOB 97 + DATA addr = 000A5A38
04.24.18 JOB 97 + DATAB addr = 000A63F8
04.24.18 JOB 97 + DATAP addr = 000A5FF8
04.24.18 JOB 97 + Instruction addr =000A70148
04.24.18 JOB 97 +Description MIPS nS Samples
04.24.52 JOB 97 +BCTR Rx,Rloop (refference loop) 75.96 13 499
04.25.07 JOB 97 +MVC DATA(1),DATA 70.67 14 179
04.25.52 JOB 97 +MVC DATA(1),DATAB 71.30 14 555


Like I stated, I did a good bit of tinkering with INSTRATE a while back. The MIPS folder in this forum has an INSTRATE-DEBUG.txt job. It has 30% toleration and maybe some other adjustments ( I can't remember what I uploaded vs what I tinkered with ). Anyway, it gets further but also abends with the 2 min message.

04.18.49 JOB 96 + Ranked by usage (general)
04.18.49 JOB 96 + Samples = 9
04.18.49 JOB 96 + Loops = 1000000
04.18.49 JOB 96 + Replicate = 1
04.18.49 JOB 96 + Tolerate% = 30
04.18.49 JOB 96 + Pgm addr = 000A5A38
04.18.49 JOB 96 + DATA addr = 000A6450
04.18.49 JOB 96 + DATAB addr = 000A6E30
04.18.49 JOB 96 + DATAP addr = 000A6A30
04.18.49 JOB 96 + Instruction addr =000A7A4C8
04.18.49 JOB 96 +Description MIPS nS Samples
04.18.52 JOB 96 +Dropping samples 4 and 1, error is 36%
04.18.53 JOB 96 +BCTR Rx,Rloop (refference loop) 74.41 13 11
04.18.57 JOB 96 +A R1,F1 Add 5A-RX 71.98 14 9
04.19.01 JOB 96 +AD R0,DEC1 Add Norm Long 6A-RX 36.87 27 9
04.19.05 JOB 96 +ADR F0,F4 Add Norm Long 2A-RR 64.76 15 9
04.19.09 JOB 96 +AE F0,F4 Add Norm Long 7A-RX 18.44 55 9
04.19.13 JOB 96 +AER F0,F4 Add Norm Shor 3A-RR 73.43 13 9
04.19.17 JOB 96 +AH R1,H1 Add Halfword 4A-RX 71.06 14 9
04.19.21 JOB 96 +Dropping samples 8 and 1, error is 32%
04.19.22 JOB 96 +AL R1,F1 Add Logical 5E-RX 67.54 15 11
04.19.26 JOB 96 +Dropping samples 6 and 8, error is 94%
04.19.27 JOB 96 +Dropping samples 3 and 7, error is 147%
04.19.27 JOB 96 +Dropping samples 8 and 6, error is 134%
04.19.28 JOB 96 +Dropping samples 9 and 2, error is 148%
04.19.29 JOB 96 +Dropping samples 6 and 3, error is 60%
04.19.30 JOB 96 +ELAPSED time exceeds 128 seconds

Running any of the INSTRATE programs on a slow system like my 300 MHz laptop doesn't fail and samples are 9 down the line instead of varying all over the place.

If this is a hercules clock problem -

Does someone have a test to see how quickly the clock is incremented ?

Is the clock updated upon each STCK ? Should it be so that 2 consecutive STCK instructions don't get the same time ? Don't know how precise the clock is or how often it is updated.

Presume the TIMER is different from the CLOCK. Can it be used to check STCK elapsed time variances ? ie issue STCK, when TIMER pops, STCK again, compare timer interval to difference in clock fetches, loop for long duration to see average of results ? May be out in left field as my brain is about gone.

When incremented, is it by a chunk of time instead of a more granular interval ?

If it is INSTRATE's problem due to design (? IMO doubtful if it runs reliably on zPDT and Z Iron), can a simple addition of some fixed usecond be added to the test loop so that when the reference loop is subtracted it doesn't turn negative ?

Phil

ps - from memory again, it runs on my P/390 without samples varying all over the place - I'll dig that test result out.
kerravon86@yahoo.com.au [hercules-390]
2017-03-08 05:16:32 UTC
Post by ***@hotmail.com [hercules-390]
If it is INSTRATE's problem due to design
(? IMO doubtful if it runs reliably on zPDT
and Z Iron),
I don't know what the rules for clocks are.
Post by ***@hotmail.com [hercules-390]
can a simple addition of some fixed
usecond be added to the test loop
so that when the reference loop is
subtracted it doesn't turn negative ?
If clock values are not guaranteed by
either z/Arch or Hercules, then the
program should be adjusted to see if
the elapsed is LESS than the reference
value, and if so, do NOT do the
subtraction, or perhaps set the elapsed
to 0 or 1.
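
A minimal sketch of the guard described here, written in Python rather than assembler for brevity. The function and parameter names are hypothetical; the point is only that the subtraction is skipped when the measured value is below the reference.

```python
def net_elapsed(measured, reference, floor=0):
    """Return measured - reference, or floor when measured is below reference.

    floor can be set to 0 or 1, as suggested in the post; both names are illustrative.
    """
    if measured < reference:
        return floor
    return measured - reference

print(net_elapsed(15, 13))           # normal case: the subtraction is safe
print(net_elapsed(12, 13))           # measured below reference: clamp to 0
print(net_elapsed(12, 13, floor=1))  # same, but clamp to 1 instead
```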

BFN. Paul.
williaj@sympatico.ca [hercules-390]
2017-03-08 15:43:58 UTC
Well, PrOps says two consecutive STCK instructions will give unique values. The wording is a bit vague in the case where the extended clock facility is not installed. The manual implies that the CPU will add random bits to the right-hand side of the returned value in order to ensure uniqueness. It also mentions that the incrementing of the clock is comparable to the instruction execution rate of the CPU, but that seems improbable given the speed at which the clock updates.
'John P. Hartmann' jphartmann@gmail.com [hercules-390]
2017-03-08 15:54:41 UTC
Post by ***@sympatico.ca [hercules-390]
Well PrOps says two consecutive STCK instructions will give unique values.
Correct; even for the CPUs of a configuration.

If the CPU runs out of values to store, it stalls until the next clock
tick. Hence STCKF.

STCKE is designed to take care of this and also the year 2041 issue.
IBM is also aware that a thirty year lifespan for a system, while
perhaps unusual, is not unheard of (think 9020).
williaj@sympatico.ca [hercules-390]
2017-03-08 15:56:26 UTC
Hercules is correct, at least for two STCK instructions in a row. The result is:

D235B262 CCFFA003 First STCK
D235B262 CCFFB003 Second STCK

'John P. Hartmann' jphartmann@gmail.com [hercules-390]
2017-03-08 16:03:37 UTC
Post by ***@sympatico.ca [hercules-390]
D235B262 CCFFA003 First STCK
D235B262 CCFFB003 Second STCK
That is rather hamfisted. I assume the last digit is the CPU address
(in line with what real iron does/did), but Hercules is free to use the
bits beyond 51 for additional precision. It could also make them into a
counter for the number of values stored within a microsecond (and with
that it does not need the CPU address to make the values
unique--particularly important if you define 128 CPUs).
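
To make "the bits beyond 51" concrete: in the TOD clock format, bit 51 ticks once per microsecond, so the low 12 bits (52-63) of a stored value are the sub-microsecond part. A minimal sketch that splits the two STCK values quoted above (the helper name is hypothetical):

```python
def split_tod(tod):
    """Split a 64-bit TOD clock value into (microsecond count, low 12 bits)."""
    return tod >> 12, tod & 0xFFF

first = 0xD235B262CCFFA003
second = 0xD235B262CCFFB003

us1, low1 = split_tod(first)
us2, low2 = split_tod(second)

# Difference in whole microseconds, then the low 12 bits of each value.
print(us2 - us1, hex(low1), hex(low2))
```

For these two values the microsecond part differs by exactly one and the low 12 bits are identical, which fits the description above of the low bits not being used for additional precision in this build.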
williaj@sympatico.ca [hercules-390]
2017-03-08 17:13:45 UTC
I'd forgotten about STCKF. PrOps says two in a row could be the same. I don't know if it's chance or not, but they are different in my quick test:

D235C384 3C64D000 First STCKF
D235C384 3C64E000 Second STCKF
'John P. Hartmann' jphartmann@gmail.com [hercules-390]
2017-03-08 17:21:07 UTC
Post by ***@sympatico.ca [hercules-390]
I'd forgotten about STCKF. PrOps says two in a row could be the same. I
D235C384 3C64D000 First STCKF
D235C384 3C64E000 Second STCKF
Please do three in a row. Open an issue if all three are different and
your processor has at least a gigahertz cycle time.
williaj@sympatico.ca [hercules-390]
2017-03-08 18:17:26 UTC
Here's four STCKF in a row:

D235D186 6696E000
D235D186 6696F000
D235D186 66970000
D235D186 66971000
Tony Harminc tharminc@gmail.com [hercules-390]
2017-03-08 21:35:47 UTC
Post by ***@sympatico.ca [hercules-390]
D235D186 6696E000
D235D186 6696F000
D235D186 66970000
D235D186 66971000
necessarily the same", not that they will.
No, it says almost the opposite - they "do not necessarily return different
values".

Is it really an error then that Hercules returns unique values ?
It's certainly not an error in terms of the Principles of Operation.
Whether it's worth while in Hercules to differentiate between STCK and
STCKF in an attempt to make STCKF actually run faster, I'm not sure.

Tony H.
'Mark L. Gaubatz' mgaubatz@groupgw.com [hercules-390]
2017-03-09 01:18:02 UTC
Post by ***@sympatico.ca [hercules-390]
D235D186 6696E000
D235D186 6696F000
D235D186 66970000
D235D186 66971000
necessarily the same", not that they will. Is it really an error then
that Hercules returns unique values ?
While the following discusses both the clocks and the INSTRATE program,
please understand that the concepts of both the INSTRATE program and
instruction timing are neither an issue nor at issue. The key is that
some of the fundamental concepts used for timing must be changed to
generate what can be considered proper results -- regardless of the
timing program in use.

Background: I have been working with the architecture and instruction
timing as a hardware vendor, software vendor, and ISV, for over 45
years; I was also part of a performance analysis team as an SE/PSR for
four years (including stepping through customer issues with the
performance of specific instructions).

Now down to the points:

1) Hercules is not in error in providing timing services by the
methodology in use; it is within the guidelines of the Principles of
Operation. What can be observed is properly termed as "model dependent
behavior." If one knows what to look for, these variations can be seen
between each model of each vendor's machines -- including clock behavior.

2) On Hyperion, the clock values are more "refined." For STCKF,
resolution is significantly better; on very rare occasions I have seen
the "same" clock value (primarily because I wrote and debugged much of
the core clock updates on Hercules and Hyperion, except steering). All
accuracy past the underlying OS clock calls is calculated based on
engine timings of the OS and will always somewhat lag the true actual
time (but always within one OS clock unit for the underlying OS). On my
Linux machine (with 1ns timing and an OS clock only guaranteed to
produce valid millisecond values) the STCKF values for one back-to-back
sequence ran as follows:

D234F481 8C891899
D234F481 8C893049 17B0 1480ns
D234F481 8C8930D9 90 35ns
D234F481 8C893169 90 35ns
D234F481 8C8938A9 740 453ns
D234F481 8C893939 90 35ns
D234F481 8C894006 6CD 425ns
D234F481 8C894096 90 35ns

This shows that the "real" underlying OS clock updated three, four or
five times during the sequence, and the Hercules CPU ran during the
sequence without interruption. The Hercules clock code during this
sequence was triggered not only by the STCKF, but also by various other
Hercules internal functions. It is less expensive in overhead to run a
single actual Hercules clock, with STCKF then *not* checking for
duplicate values.

3) Using the STORE CLOCK facility is, and always has been, the wrong
approach to instruction timing, unless one is running on a non-shared
(true dedicated) CPU *and* guaranteed NOT to take an interrupt. LPARs,
even with "dedicated" CPUs are not necessarily fully dedicated to only
working with a given LPAR -- cycles may be used by the LPAR
scheduler/dispatcher/etc. This has not changed in the 45+ years of my
working with the architecture.

Sidebar: On the 360, one could only use the Interval Timer as a clock
source. As such, it was a common practice (though not ethical) of
various representatives of multiple companies to surreptitiously disable
the Interval Timer using the Disable Interval Timer toggle switch during
timing runs. As SE's and CE's, we were trained in how to spot such
activities...

4) Running on "REAL" hardware, one must use a dedicated CPU in an LPAR.
It may not be defined as SHARED, and even then, the LPAR is not
guaranteed to get all available CPU cycles without interruption.

5) On real hardware, one must also take the caches, buffers, etc., into
account. On emulators, these and other considerations apply as well,
including which hosting OS is in use.

6) Hercules, as well as any other emulator running as either a user or
system application, and without a CPU fully dedicated with all maskable
interrupts masked off, exacerbates the issue. If and when "consistency"
is seen, it is generally due to the lack of interrupts.

7) That said, there is a way to get the proper, or closest to the
proper, results. Use the CPU Timer (measures time consumed, rather than
time elapsed). This means under a given OS where the timing job is
running, one must use other facilities to properly set and use the
information from the task control blocks; run in supervisor state with
interrupt facilities turned off, or "standalone."

8) Should one run in a virtualized operating system under Hercules, even
the emulated CPU Timer behaves better than the TOD clock, yielding
results that are normally well within a percentage point of "actual" on
extended runs.

9) The CPU Timer resolution is whatever the underlying OS provides.
So on my Linux system, that means sub-nanosecond resolution in TOD Clock
format:

FFFFFFF1 8345FB00 500 312.5ns
FFFFFFF1 8345E600 600 375.0ns
FFFFFFF1 8345E000 500 312.5ns
FFFFFFF1 8345DB00 600 375.0ns
FFFFFFF1 8345D500 500 312.5ns
FFFFFFF1 8345D000 600 375.0ns
FFFFFFF1 8345CA00 500 312.5ns
FFFFFFF1 8345C500

As such, the CPU Timer properly reflects the ACTUAL CPU time used by the
CPU thread (or underlying processor), and with consistency not possible
with the TOD Clock, regardless of whatever machinations are
algorithmically done. It should also be noted that the clock logic used
for the CPU Timer cannot be used for the TOD Clock, just as with a
"real" mainframe.

10) The timing sequences within INSTRATE are too short to properly
negate the branch loop overhead with consistency.
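
The nanosecond figures in points 2 and 9 follow from the TOD clock format, in which bit 51 ticks once per microsecond, so one unit of the 64-bit value is 1/4096 of a microsecond. A minimal sketch of the conversion, using some of the values listed above; the helper and variable names are hypothetical.

```python
TOD_UNITS_PER_MICROSECOND = 4096  # bit 51 ticks once per microsecond; 12 bits lie below it

def tod_units_to_ns(units):
    """Convert a difference between two TOD-format values to nanoseconds."""
    return units * 1000 / TOD_UNITS_PER_MICROSECOND

# First four STCKF values from point 2 (the TOD clock counts up).
stckf = [0xD234F4818C891899, 0xD234F4818C893049,
         0xD234F4818C8930D9, 0xD234F4818C893169]
stckf_deltas = [later - earlier for earlier, later in zip(stckf, stckf[1:])]

# Two consecutive CPU Timer values from point 9 (the CPU Timer counts down).
cpu_timer = [0xFFFFFFF18345E600, 0xFFFFFFF18345E000]
cpu_timer_delta = cpu_timer[0] - cpu_timer[1]

for delta in stckf_deltas + [cpu_timer_delta]:
    print(f"{delta:X} {tod_units_to_ns(delta):.1f}ns")
```

The hex deltas this computes (17B0, 90, 90 and 600) are the ones shown in the listings, and the same conversion applies to both clocks because the CPU Timer uses the TOD bit layout.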

Mark
Ivan Warren ivan@vmfacility.fr [hercules-390]
2017-03-09 09:19:39 UTC
Post by 'Mark L. Gaubatz' ***@groupgw.com [hercules-390]
10) The timing sequences within INSTRATE are too short to properly
negate the branch loop overhead with consistency.
I remember when I was doing my own "measurements", I would usually do :

Setup the clock comparator (but I should have used the CPU timer)

Setup a loop counter

:loopstart

Execute the same instruction (like 512 times in a row)

Check if the timer had expired (if so, out of the loop)

increment loop counter

jump back to :loopstart

As Mark said, especially under hercules, a branch that actually does
branch is a little heavy (it involves exiting the CPU execution loop
using longjmp(), checking the IA is valid, re-fetching a few instructions
in the hercules 'instruction cache'.. and some other housekeeping
work)... I'm sure there is something like this involved on other
implementations [1].... But issuing the "tested" instruction as a batch
helps reduce the impact of the actual branching to form the loop.

It wasn't perfect, but yielded fairly consistent results.

--Ivan

[1] And a bunch of other Instruction Architectures... That's the purpose
of the C construct _likely()/_unlikely()... because it's always/usually
cheaper to not take a branch than it is to take a branch[2].
[2] Modern architectures also implement branch prediction and
branch reach-ahead, where the CPU executes both branch paths in parallel
using register aliases and then commits the path taken while rolling back
the path not taken... But these functions are heavily hardware assisted.
Doing that on Hercules would probably cost more than anything.
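
The effect of issuing the tested instruction as a batch can be shown with a simple cost model: if one taken branch is shared by a batch of N copies of the instruction, the observed per-instruction time is the true instruction time plus 1/N of the branch cost. A minimal sketch under that assumption; the function name and the nanosecond figures are hypothetical, not measurements.

```python
def apparent_cost_ns(instr_ns, branch_ns, batch):
    """Per-instruction time observed when `batch` copies share one loop branch."""
    return (batch * instr_ns + branch_ns) / batch

# Hypothetical costs: an 8 ns instruction and a 13 ns taken branch.
for batch in (1, 8, 512):
    print(batch, apparent_cost_ns(8, 13, batch))
```

With one instruction per loop the branch dominates the result; with 512 per loop its contribution shrinks to a small fraction of a nanosecond, which is why the batched measurements were more consistent.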


opplr@hotmail.com [hercules-390]
2017-03-09 14:51:06 UTC
Mark wrote: ( i've paraphrased some ) Bottom of post for Jean-Louis:

Not meant to be rebuttal, just further understanding ( which I may forget this afternoon ).

1. ... "model dependent behavior."

yep, remember something about different models incrementing the clock by different values

2. ...the STCKF values for one back-to-back sequence ran as follows:

D234F481 8C891899
D234F481 8C893049 17B0 1480ns
D234F481 8C8930D9 90 35ns
D234F481 8C893169 90 35ns
D234F481 8C8938A9 740 453ns
D234F481 8C893939 90 35ns
D234F481 8C894006 6CD 425ns
D234F481 8C894096 90 35ns"

So the 35ns increments are generally expected and the 4xx jumps are due to other overhead of the host OS ?

3, 7, 8, 9. Using the STORE CLOCK facility is, and always has been, the wrong approach to instruction timing,

So STIMER set for say 30 seconds, run loop, TTIMER to calculate elapsed time ?

4, 5, 6 Real Hardware dedicated CPU, caches, buffers, ....

Running of INSTRATE on the P/390 here produced an almost pristine sample 9 for the whole run. 3 runs gave 1 or 2 instances of 11 samples but no higher.

Same program running on older 300MHz W2K hercules 3.08 had widely varying samples.

10. The timing sequences within INSTRATE are too short to properly negate the branch loop overhead with consistency.

Which I think is a shame. He put some mechanisms in place to provide for longer test loops. I can't remember if it overcomes the problem (?).

Adjusting Replicate to 5 and Tolerate to 30 gives complete run of instrate on my FX-8320 3.5 GHz 8 Core with only the BCT Rx,loop getting 11 samples, all others are 9. TOP shows 113 % CPU


14.39.33 JOB 105 + General instruction tests
14.39.33 JOB 105 + Samples = 9
14.39.33 JOB 105 + Loops = 1000000
14.39.33 JOB 105 + Replicate = 5
14.39.33 JOB 105 + Tolerate% = 30
14.39.33 JOB 105 + Pgm addr = 000A5000
14.39.33 JOB 105 + DATA addr = 000A5A38
14.39.33 JOB 105 + DATAB addr = 000A63F8
14.39.33 JOB 105 + DATAP addr = 000A5FF8
14.39.33 JOB 105 + Instruction addr =000A70148
14.39.33 JOB 105 +Description MIPS nS Samples
14.39.33 JOB 105 +BCT Rx,loop 321.33 3 11
14.39.34 JOB 105 +BCTR Rx,Rloop (refference loop) 73.56 13 9
14.39.35 JOB 105 +NOP R0 149.31 6 9
14.39.35 JOB 105 +Fetches:
14.39.36 JOB 105 +LR R1,R0 113.92 8 9
14.39.37 JOB 105 +LTR R1,R0 96.71 10 9
14.39.40 JOB 105 +L R1,0 19.97 51 9
14.39.41 JOB 105 +L R1,DATA 64.85 15 9
14.39.42 JOB 105 +L R1,DATA+1 63.87 16 9
14.39.43 JOB 105 +LH R1,DATA 59.89 17 9
14.39.45 JOB 105 +ICM R1,15,DATA 62.24 16 9
14.39.46 JOB 105 +ICM R1,1,DATA 37.51 27 9
14.39.47 JOB 105 +IC R1,DATA 57.72 17 9
14.39.49 JOB 105 +LD F0,DATA 54.22 18 9
14.39.51 JOB 105 +LM 8,6,SAVEREGS+(8*4) 27.23 37 9
14.39.51 JOB 105 +Stores:
14.39.53 JOB 105 +STM 1,14,DATA 29.91 34 9
14.39.54 JOB 105 +ST R1,DATA 63.24 16 9
.... clipped

Phil
'Mark L. Gaubatz' mgaubatz@groupgw.com [hercules-390]
2017-03-09 17:02:34 UTC
Permalink
Post by 'Mark L. Gaubatz' ***@groupgw.com [hercules-390]
D234F481 8C891899
D234F481 8C893049 17B0 1480ns
D234F481 8C8930D9 90 35ns
D234F481 8C893169 90 35ns
D234F481 8C8938A9 740 453ns
D234F481 8C893939 90 35ns
D234F481 8C894006 6CD 425ns
D234F481 8C894096 90 35ns"
So the 35ns increments are generally expected and the 4xx jumps are
due to other overhead of the host OS ?
Make that a variety of activities, not just that of the host OS.
Post by 'Mark L. Gaubatz' ***@groupgw.com [hercules-390]
3, 7, 8, 9. Using the STORE CLOCK facility is, and always has been,
the wrong approach to instruction timing,
So STIMER set for, say, 30 seconds, run loop, TTIMER to calculate elapsed time?
No on STIMER, it is TOD clock based, and the interrupt mechanism is both
too loose and too long as well. TTIMER is based on STIMER.
Post by 'Mark L. Gaubatz' ***@groupgw.com [hercules-390]
4, 5, 6 Real Hardware dedicated CPU, caches, buffers, ....
Running of INSTRATE on the P/390 here produced an almost pristine
sample 9 for the whole run. 3 runs gave 1 or 2 instances of 11
samples but no higher.
The P/390 and the R/390 use a dedicated CPU card; that is why the results
are more consistent on that platform.
Post by 'Mark L. Gaubatz' ***@groupgw.com [hercules-390]
10. The timing sequences within INSTRATE are too short to properly
negate the branch loop overhead with consistency.
Which I think is a shame. He put some mechanisms in place to provide
for longer test loops. I can't remember if it overcomes the problem (?).
More than the bit that's been done so far needs to be reworked; you have
to take a look at the hardware architecture and instruction set emulator
architecture. Tweaking based on the current knobs makes for "consistent"
results on your machine, but not on others. This further indicates that
there is a flaw in the current fundamental design of the testing
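To make the "negate the branch loop overhead" idea concrete, here is a minimal sketch in Python. It is not INSTRATE's actual code, and it does not address the design concerns raised above. It assumes (this is an assumption, not something stated in the thread) that "Loops" is the iteration count and "Replicate" is the number of copies of the tested instruction inside each iteration. All function names are made up for illustration.

```python
from statistics import median


def per_instruction_ns(test_ns, reference_ns, loops, replicate=1):
    """Net time per tested instruction, in nanoseconds.

    test_ns      -- elapsed time of the loop containing the tested instruction
    reference_ns -- elapsed time of the empty reference loop (loop overhead)
    loops        -- iterations of the loop (assumed meaning of "Loops")
    replicate    -- copies of the instruction per iteration (assumed meaning
                    of "Replicate")
    """
    executed = loops * replicate
    if executed <= 0:
        raise ValueError("loops and replicate must be positive")
    return (test_ns - reference_ns) / executed


def mips(ns_per_instruction):
    """Convert nanoseconds per instruction to MIPS (1 ns/instr = 1000 MIPS)."""
    if ns_per_instruction <= 0:
        # Happens when timing noise is larger than the instruction itself,
        # which is the risk with timing sequences that are too short.
        raise ValueError("net time is not positive; overhead swamped the measurement")
    return 1000.0 / ns_per_instruction


def summarize(test_samples_ns, reference_samples_ns, loops, replicate=1):
    """Use the median of several samples to damp outliers, then subtract."""
    net = per_instruction_ns(
        median(test_samples_ns), median(reference_samples_ns), loops, replicate
    )
    return net, mips(net)


if __name__ == "__main__":
    # Hypothetical sample timings, not taken from any real run.
    net, rate = summarize([30e6, 31e6, 29e6], [14e6, 14e6, 14e6], loops=1_000_000)
    print(f"{net:.2f} ns per instruction, {rate:.2f} MIPS")
```

Raising Replicate makes the tested instruction a larger share of each iteration, so the same absolute jitter in the loop overhead has less effect on the net figure.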
Post by 'Mark L. Gaubatz' ***@groupgw.com [hercules-390]
Adjusting Replicate to 5 and Tolerate to 30 gives a complete run of
INSTRATE on my FX-8320 3.5 GHz 8 Core, with only the BCT Rx,loop
getting 11 samples; all others are 9. TOP shows 113% CPU.
Mark
opplr@hotmail.com [hercules-390]
2017-03-10 03:01:04 UTC
Permalink
Mark wrote:

"No on STIMER, it is TOD clock based, and the interrupt mechanism is both too loose and too long as well. TTIMER is based on STIMER."

Earlier you wrote:

"The CPU Timer resolution is that of what the underlying OS provides. So on my Linux system, that means sub-nanosecond resolution in TOD Clock format:

FFFFFFF1 8345FB00 500 312.5ns
FFFFFFF1 8345E600 600 375.0ns
FFFFFFF1 8345E000 500 312.5ns
FFFFFFF1 8345DB00 600 375.0ns
FFFFFFF1 8345D500 500 312.5ns
FFFFFFF1 8345D000 600 375.0ns
FFFFFFF1 8345CA00 500 312.5ns
FFFFFFF1 8345C500"

I get easily confused about us vs ns. You mention sub-nanosecond resolution but 312.5 ns ? Should that be 312.5 us ?

Does CPU Timer exist in MVS 3.8 J environment ?

While the FX-8320 is probably left in the dust by i5s and i7s and now AMD Ryzen, there is still a need for instruction-based timing comparison when changes are made over time to the Hercules source. Ivan used to do it on a daily basis, and if a change affected one instruction or a group of instructions it became apparent.

Who's minding the store now ?

Phil

ps - adding all the 370 instructions except MC to instrate-debug.txt took a little while - a few have setup instructions for which I didn't subtract the time of the setups (yet)
'Mark L. Gaubatz' mgaubatz@groupgw.com [hercules-390]
2017-03-10 06:13:35 UTC
Permalink
Post by ***@hotmail.com [hercules-390]
"No on STIMER, it is TOD clock based, and the interrupt mechanism is
both too loose and too long as well. TTIMER is based on STIMER."
"The CPU Timer resolution is that of what the underlying OS provides.
So on my Linux system, that means sub-nanosecond resolution in TOD
Better phrasing might be, "The CPU Timer resolution is that of what the
underlying OS provides. So on my Linux system, that means sub-nanosecond
resolution, due to the mathematics involved in converting from
nanoseconds to TOD Clock format."

Sidebar: Most of today's system hardware internal timers can be
calculated to sub-nanosecond resolution; presentation by the OS is
nanosecond based (after all, per the presentation of nanosecond clocks
for Linux, there's no practical need for higher resolution timing [and
both IBM and Amdahl mainframes were already running sub-nanosecond TOD
clocks] -- yet the now several year old specification for C++11 supports
attosecond clocks, which is required for "easy" TOD Clock Extended
format 128-bit calculations).
Post by ***@hotmail.com [hercules-390]
I get easily confused about us vs ns. You mention sub-nanosecond
resolution but 312.5 ns ? Should that be 312.5 us ?
TOD Clock format resolution (mathematically in bit 63) is 1/4096
microseconds, or 244.140625 picoseconds, with bit 51 as one microsecond.
Consequently, when the underlying OS system presentation of the clocks
or timers are in nanoseconds, one must actually calculate to a higher
precision for the best presentation of the information available. This
issue is along the same lines as working with binary floating point
fractional digits, one must use resolution greater than provided to
present the requested decimal resolution.
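
The arithmetic above can be checked with a short Python sketch. It is only an illustration, not part of Hercules or INSTRATE, and the function names are invented. It treats one unit of bit 63 as 1/4096 microsecond, as stated above, and uses exact fractions so that no precision is lost in the conversion.

```python
from fractions import Fraction

# One TOD clock unit (bit 63) is 1/4096 microsecond; bit 51 is one microsecond.
TOD_UNITS_PER_MICROSECOND = 4096


def tod_units_to_ns(units: int) -> Fraction:
    """Convert a count of bit-63 TOD units to nanoseconds, exactly."""
    return Fraction(units * 1000, TOD_UNITS_PER_MICROSECOND)


def tod_delta_ns(earlier_hex: str, later_hex: str) -> Fraction:
    """Elapsed nanoseconds between two 64-bit TOD values given as hex text."""
    earlier = int(earlier_hex.replace(" ", ""), 16)
    later = int(later_hex.replace(" ", ""), 16)
    return tod_units_to_ns(later - earlier)


if __name__ == "__main__":
    # Deltas quoted in this thread: 90, 500 and 600 (hex).
    for delta in (0x90, 0x500, 0x600):
        print(f"{delta:X} units = {float(tod_units_to_ns(delta))} ns")
    # First pair of STCK values quoted earlier in the thread.
    print(float(tod_delta_ns("D234F481 8C891899", "D234F481 8C893049")), "ns")
```

A single unit comes out as 125/512 ns, which is the 244.140625 picoseconds mentioned above, and hex 500 and 600 come out as 312.5 ns and 375 ns, so "ns" in the listing is right.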
Post by ***@hotmail.com [hercules-390]
Does CPU Timer exist in MVS 3.8 J environment ?
Yes. It is used by ALL commercially released operating systems fully
supporting the base S/370 architecture (there are some which were
strictly S/360 based and never updated to actually use the S/370
features); the relevant information is available via system calls and
the control blocks for each task.

Mark
Tony Harminc tharminc@gmail.com [hercules-390]
2017-03-10 16:30:15 UTC
Permalink
Post by ***@hotmail.com [hercules-390]
Does CPU Timer exist in MVS 3.8 J environment ?
Yes. It is used by ALL commercially released operating systems fully
supporting the base S/370 architecture (there are some which were strictly
S/360 based and never updated to actually use the S/370 features); the
relevant information is available via system calls and the control blocks
for each task.
There is a bit of a twist on this. MVS 2.0 (i.e. the first release) used
the TOD clock for CPU timing rather than the CPU timer. Somewhere soon
thereafter (3.0, IIRC) they switched to using the CPU timer, as they should
have all along. I speculate that this early avoidance of the CPU timer was
because the 370/145 (the smallest 370 on which running MVS was potentially
practical) had DAT as a standard feature, but the CPU timer and clock
comparator were an optional feature. But someone who was there at the time
may know the true reason.

Tony H.

'Jean-Louis Noel' jln@stben.net [hercules-390]
2017-03-08 12:24:34 UTC
Permalink
Hi Phil,
Post by ***@hotmail.com [hercules-390]
It may be necessary to adjust Toleration percentage - a lot.
If you increase it, the samples will vary all over the place, as you said.

I used INSTRATE, packed as jobs.zip under MIPs Testing in the list's file
area. I commented out the NOP testing, left everything else as provided,
and it worked:

- 11.22.31 JOB02564 $HASP373 INSTRATE STARTED - INIT 1 - CLASS A - SYS
- SYS1
- 11.22.31 JOB02564 +INSTRATE STARTED
- 11.22.31 JOB02564 + intel i7, z/os 1.10, 4.00
- 11.22.31 JOB02564 + General instruction tests
- 11.22.31 JOB02564 + Samples = 9
- 11.22.31 JOB02564 + Loops = 1000000
- 11.22.31 JOB02564 + Replicate = 1
- 11.22.31 JOB02564 + Tolerate% = 10
- 11.22.31 JOB02564 + Pgm addr = 00007000
- 11.22.31 JOB02564 + DATA addr = 00007A38
- 11.22.31 JOB02564 + DATAB addr = 000083F8
- 11.22.31 JOB02564 + DATAP addr = 00007FF8
- 11.22.31 JOB02564 + Instruction addr =000090148
- 11.22.31 JOB02564 +Description MIPS nS Samples
- 11.22.33 JOB02564 +BCT Rx,loop 63.25 16 31
- 11.22.34 JOB02564 +BCTR Rx,Rloop (refference loop) 62.79 16 9
- 11.22.34 JOB02564 +Fetches:
- 11.22.34 JOB02564 +LR R1,R0 54.56 18 13
- 11.22.35 JOB02564 +LTR R1,R0 45.29 22 11
...
- 11.24.13 JOB02564 +Miscellaneous:
- 11.24.14 JOB02564 +Move 256 bytes by MVC 18.42 55 19
- 11.24.20 JOB02564 +Move 256 bytes by MVCL 11.08 92 41
- 11.24.27 JOB02564 +Clear 256 bytes by MVC 2.32 439 13
00- 11.24.31 JOB02564 +Clear 256 bytes by MVCL 11.36 90 29
- 11.24.31 JOB02564 +INSTRATE FINISHED
- 11.24.31 JOB02564 IEF404I INSTRATE - ENDED - TIME=11.24.31

A second run a bit later, to check repeatability:
- 12.59.57 JOB02587 $HASP373 INSTRATE STARTED - INIT 1 - CLASS A - SYS
- SYS1
- 12.59.57 JOB02587 +INSTRATE STARTED
- 12.59.57 JOB02587 + intel i7, z/os 1.10, 4.00
- 12.59.57 JOB02587 + General instruction tests
- 12.59.57 JOB02587 + Samples = 9
- 12.59.57 JOB02587 + Loops = 1000000
- 12.59.57 JOB02587 + Replicate = 1
- 12.59.57 JOB02587 + Tolerate% = 10
- 12.59.57 JOB02587 + Pgm addr = 00007000
- 12.59.57 JOB02587 + DATA addr = 00007A38
- 12.59.57 JOB02587 + DATAB addr = 000083F8
- 12.59.57 JOB02587 + DATAP addr = 00007FF8
- 12.59.57 JOB02587 + Instruction addr =000090148
- 12.59.57 JOB02587 +Description MIPS nS Samples
- 12.59.58 JOB02587 +BCT Rx,loop 63.71 16 19
- 12.59.59 JOB02587 +BCTR Rx,Rloop (refference loop) 62.32 16 11
- 12.59.59 JOB02587 +Fetches:
- 12.59.59 JOB02587 +LR R1,R0 51.78 19 9
- 13.00.01 JOB02587 +LTR R1,R0 49.00 20 17
...
- 13.01.33 JOB02587 +Miscellaneous:
- 13.01.34 JOB02587 +Move 256 bytes by MVC 18.21 56 11
- 13.01.39 JOB02587 +Move 256 bytes by MVCL 11.04 92 35
- 13.01.47 JOB02587 +Clear 256 bytes by MVC 2.24 456 17
00- 13.01.50 JOB02587 +Clear 256 bytes by MVCL 11.28 90 19
- 13.01.50 JOB02587 +INSTRATE FINISHED
- 13.01.50 JOB02587 IEF404I INSTRATE - ENDED - TIME=13.01.50

Now it remains to understand what happens during the execution of a NOP
instruction by Hercules.
I can't remember how much time it takes real iron to execute a NOP
operation.
I had it, but those three-ring binders have been gone for ages now.

- Jean-Louis
Ivan Warren ivan@vmfacility.fr [hercules-390]
2017-03-08 13:50:58 UTC
Permalink
On 3/8/2017 1:24 PM, 'Jean-Louis Noel' ***@stben.net [hercules-390] wrote:
<...>
Post by 'Jean-Louis Noel' ***@stben.net [hercules-390]
Now it remains to understand what happens during the execution of a NOP
instruction by Hercules.
I can't remember how much time it takes real iron to execute a NOP
operation.
I had it, but those three-ring binders have been gone for ages now.
There is no 'NOP' instruction per se. NOP is implemented as a Branch
Never (BC 0,addr or BCR 0,Rx).

In S/370, the only "NOP" instruction which has a side effect is "BCR
15,0" (always branch to the next instruction), which is used in
multiprocessor setups to perform a CPU serialization operation (flush all
caches and return from the instruction when all CPUs have reached the
next instruction step) and a checkpoint synchronization (no effect on
Hercules, since checkpoint synchronization is used for instruction
retry in case of machine check).

There is another variant with BCR 14,0 (The fast BCR Serialization only
facility).

In Hercules, BC 0,xxx and BCR 0,x have a very short path length.
Basically, the CPU execution code fetches the instruction, goes to the
decode routine and calls the branch instruction, which returns
immediately on seeing that the mask is 0, leading the CPU execution loop
to go directly to the next instruction. On an x86 processor, it takes
around 10 instructions to do this. (The instructions are prefetched, the
ILC doesn't need updating because the instruction cannot generate an
interrupt[1], and instructions are only counted in batches of 7 in the
decode loop.)

--Ivan

[1] Except when PER, instruction tracing or stepping is active, but then
another version of the loop is used.
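
The encoding point above (a NOP is just a branch-on-condition with a mask of 0) can be illustrated with a small Python sketch. This is not Hercules code; the function name and the wording of the returned labels are invented for illustration. It relies only on the two opcodes mentioned in this thread: x'07' for BCR (two bytes) and x'47' for BC (four bytes), with the mask in the high nibble of the second byte.

```python
def describe(instr: bytes) -> str:
    """Label a BC/BCR instruction by its mask and register fields."""
    if len(instr) < 2:
        raise ValueError("need at least two bytes")
    opcode = instr[0]
    mask = instr[1] >> 4      # M1 field: condition-code mask
    low = instr[1] & 0x0F     # R2 for BCR, X2 for BC

    if opcode == 0x07:        # BCR M1,R2 (RR format, 2 bytes)
        if mask == 0:
            return "NOPR"     # mask 0 never matches, so never branches
        if low == 0:
            # R2 = 0 means no branch is taken whatever the mask is;
            # BCR 15,0 is the serializing form described above.
            return f"BCR {mask},0 (no branch)"
        return f"BCR {mask},{low} (may branch)"

    if opcode == 0x47:        # BC M1,D2(X2,B2) (RX format, 4 bytes)
        if len(instr) < 4:
            raise ValueError("BC needs four bytes")
        return "NOP" if mask == 0 else f"BC {mask} (may branch)"

    return "not BC/BCR"


if __name__ == "__main__":
    for text in ("0700", "07F0", "07FE", "47000000", "47F0C010"):
        print(text, "->", describe(bytes.fromhex(text)))
```

Here x'0700' is the NOPR that the assembler generates, and x'07F0' is the BCR 15,0 form that still does not branch but has the serialization side effect.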



'John P. Hartmann' jphartmann@gmail.com [hercules-390]
2017-03-08 14:26:23 UTC
Permalink
Nevertheless, the assembler provides the extended mnemonics NOP, NOPR,
and JNOP. So I'd say there is a NOP instruction.

As far as the hardware is concerned there is neither BC nor NOP; it
understands only operation code x'47'.
Post by Ivan Warren ***@vmfacility.fr [hercules-390]
There is no 'NOP' instruction per se. NOP is implemented as a Branch
Never (BC 0,addr or BCR 0,Rx).
------------------------------------

------------------------------------

Community email addresses:
Post message: hercules-***@yahoogroups.com
Subscribe: hercules-390-***@yahoogroups.com
Unsubscribe: hercules-390-***@yahoogroups.com
List owner: hercules-390-***@yahoogroups.com

Files and archives at:
http://groups.yahoo.com/group/hercules-390

Get the latest version of Hercules from:
http://www.hercules-390.org


------------------------------------

Yahoo Groups Links

<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/hercules-390/

<*> Your email settings:
Individual Email | Traditional

<*> To change settings online go to:
http://groups.yahoo.com/group/hercules-390/join
(Yahoo! ID required)

<*> To change settings via email:
hercules-390-***@yahoogroups.com
hercules-390-***@yahoogroups.com

<*> To unsubscribe from this group, send an email to:
hercules-390-***@yahoogroups.com

<*> Your use of Yahoo Groups is subject to:
https://info.yahoo.com/legal/us/yahoo/utos/terms/
Ivan Warren ivan@vmfacility.fr [hercules-390]
2017-03-08 16:12:31 UTC
Permalink
Post by 'John P. Hartmann' ***@gmail.com [hercules-390]
Nevertheless, the assembler provides the extended mnemonics NOP, NOPR,
and JNOP. So I'd say there is a NOP instruction.
As far as the hardware is concerned there is neither BC nor NOP; it
understands only operation code x'47'.
Yeah yeah...

I am just saying there is NO specific NO-OP opcode, just assembler
instructions that turn into branches coded to never actually branch.

--Ivan





'John P. Hartmann' jphartmann@gmail.com [hercules-390]
2017-03-08 17:09:39 UTC
Permalink
Post by Ivan Warren ***@vmfacility.fr [hercules-390]
I am just saying there is NO specific NO-OP opcode, just assembler
instructions that turn into branches coded to never actually branch.
Dear Ivan:

The instruction stream is a bunch of bits as far as the hardware is
concerned. There are no mnemonics at that level. It is a fallacy to
say that B is an operation code and NOP isn't. Sure, one of them is an
extended operation code; that is, something that could be expressed with
more keystrokes using another one. And DC x'0700' is just another
long-winded way of saying NOPR.

And what about X x,x? It invokes the clear storage instruction on real
iron; only when the real addresses of the operands differ will the
contents of them pass through the ALU. Is that a separate instruction?


Ivan Warren ivan@vmfacility.fr [hercules-390]
2017-03-08 18:09:49 UTC
Permalink
Post by 'John P. Hartmann' ***@gmail.com [hercules-390]
Post by Ivan Warren ***@vmfacility.fr [hercules-390]
I am just saying there is NO specific NO-OP opcode, just assembler
instructions that turn into branches coded to never actually branch.
The instruction stream is a bunch of bits as far as the hardware is
concerned. There are no mnemonics at that level. It is a fallacy to
say that B is an operation code and NOP isn't. Sure, one of them is an
extended operation code; that is, something that could be expressed with
more keystrokes using another one. And DC x'0700' is just another
long-winded way of saying NOPR.
And what about X x,x? It invokes the clear storage instruction on real
iron; only when the real addresses of the operands differ will the
contents of them pass through the ALU. Is that a separate instruction?
Never mind.

I am just stating there is no single opcode designed to do *nothing*...

--Ivan


