Post by Greg Smith
Don't get me wrong, but are you seriously arguing that having a cache
between disparate storage media is due to lazy programming ?? Why
do processors have a l1 and l2 cache ?? and disk control units a cache ??
No, that's not what I'm arguing. I think we are looking at this from
different angles.
It's a well-known system design principle that caching improves the
parallelism achievable in a system by helping to decouple
asynchronous subsystems with different price/performance tradeoffs.
You're talking about something that is fundamental to the design of
virtually all nontrivial computing systems.
The system gets probabilistic efficiency gains due to the fact that
references tend to be localized, and also due to the fact that
hardware accesses often involve multiple physical actions that must
be performed, the inertia of the actual mechanisms involved, and so
forth. These are all system design considerations. In fact, the
mainframe seems to be about the only system left that still tries to
optimize the physical hardware work (by reducing actuator motions
that must be carried out, etc.). Most other systems have "evolved" to
see the hardware as an abstract concept.
But I digress. My points are:
1. Caching does not obviate the need for I/O performance. It doesn't
even really reduce the need for I/O performance. The gains made by
caching can't generally be made by gains in I/O performance, and vice
versa. Caching is an important system consideration that is more or
less orthogonal to I/O performance.
2. Caching is a price/performance tradeoff that has a point of
diminishing returns. At some point, adding more cache costs more than
it provides. That suggests that there is a "right" amount of cache in
a system, beyond which it is a waste of resources to add more (a rough
illustration follows these points).
Finding the "right" point is a very complex and difficult task, which
is why there are system programmers who specialize in performance
management.
3. Caching is a system problem, not an application problem.
Applications should not do elaborate caching (beyond basic
buffering). Mainframe applications should minimize their impact on
system resources by using as little main storage as possible,
generally just for state data that has tight locality of
reference, and keep the data they operate on in data sets. System
programmers or other users can decide whether those data sets should
reside on tape, DASD, hiperspace, main storage (VIO), etc.
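To put rough numbers on the diminishing returns in point 2 (made-up
figures, purely for illustration): suppose a cache hit costs 0.1 ms and
a miss costs 10 ms. Going from no cache to a 90% hit rate cuts the
average access from 10 ms to about 1.1 ms, roughly a nine-fold gain.
Doubling the cache might push the hit rate to 95%, which only gets you
from 1.1 ms to about 0.6 ms, and the next doubling buys even less, even
though each increment costs as much as the last.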
I believe that this discussion began with somebody saying that a good
way to improve I/O performance is to have a lot of cache RAM so as to
avoid doing any I/O at all. My response was meant to be something to
the effect of this: That's not really a good way to improve I/O
performance. It is simply a way to use more expensive hardware that
is faster instead of using cheaper hardware that is slower.
In other words, given a choice between a PC system with 256 MB of RAM
and fifteen 4 GB SCSI drives, and a PC system with 8 GB RAM and a single
60 GB IDE drive, I think the former would be capable of much greater
mainframe workloads with Hercules, considering that most commercial
workloads have a comparatively low reference locality and tend to be
I/O bound. I think the 15-fold increase in I/O parallelism buys more
scalability than the 16-fold increase in RAM.
When I say "greater workloads", I am talking about in a scenario
where the machine is doing many things at once.
To illustrate:
Imagine a job that processes 1 GB of data stored in a dataset, and
stores the 1 GB result in another dataset. Both datasets are
permanent. The job processes the records sequentially.
No matter how much caching the system can do, it is still necessary
to read 1 GB of data from DASD and ultimately to write 1 GB of data
back to DASD. The latter might happen in a "lazy writeback" system,
but it must still be done in order to ensure the data's consistency
if the system should suddenly crash or lose power. If you only ever
ran this one job on the system, the second time you ran it you would
avoid the need to read the data, but not the need to write the data.
Run the job cold on both of those systems. It may complete sooner on
the one with lots of memory, but it's not really complete because the
system still must write back all of the cached data to disk.
Ultimately, the systems perform somewhere pretty close to equally
with that job.
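To put a rough number on that (ballpark only): at something like the
20 MB/s a commodity drive sustains, reading 1 GB takes on the order of
50 seconds, and writing the 1 GB result takes about another 50. Cache
can reorder or postpone that roughly 100 seconds of drive work, but it
cannot make it go away.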
Now imagine you have 15 jobs like that. Each of them reads a
*different* 1 GB of data, and each of them produces a separate
dataset with 1 GB of data. Running in isolation, no job runs at
greater than 5% CPU utilization.
Run all 15 jobs at once on the system with 1 disk drive. The system
must allocate 30 datasets on the same drive (maybe different volumes,
but the same physical drive). Running 15 jobs at once has reduced the
amount of memory available for caching as well. The system must still
read in 15 GB of data, but it must all come from the one drive. The
single drive bottleneck means the CPU cannot be utilized to its full
potential (which should be 75% in this case). Caching cannot improve
this, because all of that data must be brought in from DASD before it
is in the cache, and we are only going to read it once. Once all of
the jobs finish generating their result sets, 15 GB of data must be
written back to the single drive. Caching can't improve that either,
it can only delay it. Even though the system has 16 times as much
memory available, it will still take at least 15 times as long to
complete all 15 jobs as it would have to complete one due to the fact
that they are all sharing a single drive. Matters are likely made
worse by the fact that the drive actuator is thrashing more.
Run them on the system with 15 drives, with a different drive per
job. Each drive has two datasets allocated to it. On this system, CPU
utilization goes right up to 75%, and each job utilizes a single
drive to the same extent that it would have if it were the only job
running on the system. All 15 jobs complete in the same amount of
time that it would have taken to complete only one of them.
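Rough arithmetic, again assuming something like 20 MB/s per drive: the
15 jobs together move about 30 GB. Funneled through one drive, that is
at least 1500 seconds of serialized drive time (30,000 MB / 20 MB/s)
before counting the extra seeking, so the batch cannot finish in much
under 25 minutes no matter how large the cache is. Spread over 15
drives, each spindle moves about 2 GB, roughly 100 seconds of work, and
the drives run in parallel, so the whole batch finishes in about the
time a single job takes.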
But what if you preloaded all of the data and locked it in memory
(e.g. in hiperspaces or some such)? What if you used VIO for the
output data sets instead of DASD? Well, then you're just using more
expensive storage to do the job. Yes, any single job will run faster
if you throw more money at it. But if your goal is to put together a
multi-purpose system with the idea that it should be able to do as
much work as possible for your money (i.e. "bang for the buck"), a
high-performance DASD subsystem is a whole lot more cost effective
than a bunch of RAM. Also consider that a regular PC can't really
address 8 GB of RAM. It runs into a limitation at 4 GB that requires
special hardware to surpass, costing even more.
I tend to view it like this: If I have a single job that is I/O
bound, and it completes in an acceptable amount of time using DASD
I/O, then I can run some large number of those jobs in parallel on
the same system as long as I have enough drives. Each will still
complete in about the same time it would have if nothing else were
happening on the system. As long as that number is acceptable, the
system is scalable in a way that is much more deterministic (i.e.
guaranteed) than trying to throw a lot of RAM at the problem. It's
not a question of trying to get one job to run as quickly as
possible. It's a question of trying to get the most possible work out
of the system.
Post by Greg Smith
I'm a bottom-up type of programmer. If my choice is coding read()/
write() or performing a search on some in-storage array that might
already have my data, then I'll burn the cpu to search the array as long
as the ratio of disk access time vs cpu time is great enough.
So would I. But the choice of whether to have all of the data in an
array in the first place is a higher level design decision. When
processing data of some arbitrary size, do I dynamically allocate a
big buffer, pull it in from disk, and then do a bunch of work on it
in memory, or do I seek around the data on disk and do the work on
small chunks of it brought into fixed sized buffers? The former
trades machine resources to get speed. It will execute faster, and it
will take more memory. If the data set is truly arbitrarily sized,
then that makes it much worse because the memory usage of the program
is open-ended, meaning its worst case usage cannot be predicted at
design time. Neither choice is always right, but it should be
considered seriously at design time with an eye to the tradeoffs
involved. If the latter approach allows the program to complete in an
acceptable period of time, then it is probably a much better approach
since it makes more efficient use of machine resources.
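To make the tradeoff concrete, here is a minimal sketch in C (my own
illustration, not code from any real product). Both routines copy a file
and leave room for a transformation step; one works through a fixed
64 KB buffer, the other pulls the whole input into a single malloc'd
buffer whose size depends on the data. The names and sizes are
arbitrary.

/* Chunked vs. slurped processing: a sketch of the memory/I-O tradeoff. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK (64 * 1024)   /* fixed working buffer: bounded memory use */

/* Process the data a chunk at a time: more I/O calls, constant memory. */
static int copy_chunked(FILE *in, FILE *out)
{
    char buf[CHUNK];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        /* ... transform buf[0..n) in place here ... */
        if (fwrite(buf, 1, n, out) != n)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}

/* Slurp everything into one dynamically sized buffer: fewer I/O calls,
   but memory use grows with the input, so the worst case is open-ended. */
static int copy_slurped(FILE *in, FILE *out, size_t size)
{
    char *buf = malloc(size);
    if (buf == NULL)
        return -1;

    if (fread(buf, 1, size, in) != size)   { free(buf); return -1; }
    /* ... transform buf[0..size) in place here ... */
    if (fwrite(buf, 1, size, out) != size) { free(buf); return -1; }

    free(buf);
    return 0;
}

int main(int argc, char **argv)
{
    FILE *in, *out;
    int rc;

    if (argc < 3) {
        fprintf(stderr, "usage: %s infile outfile [-slurp]\n", argv[0]);
        return 1;
    }
    if ((in = fopen(argv[1], "rb")) == NULL ||
        (out = fopen(argv[2], "wb")) == NULL) {
        perror("fopen");
        return 1;
    }
    if (argc > 3 && strcmp(argv[3], "-slurp") == 0) {
        long size;
        fseek(in, 0, SEEK_END);      /* find the input size ...         */
        size = ftell(in);
        fseek(in, 0, SEEK_SET);      /* ... then rewind and read it all */
        rc = copy_slurped(in, out, (size_t)size);
    } else {
        rc = copy_chunked(in, out);
    }
    fclose(in);
    fclose(out);
    return rc == 0 ? 0 : 1;
}

Run it both ways on the same large input and watch the process's working
set: the chunked version stays flat no matter how big the file gets,
while the slurped version grows with it.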
Post by Greg Smith
I bought
1G memory for my 3 yr old dual piii 850mhz machine a while ago for
130usd. I don't consider that *that* expensive.
Memory is expensive in many ways. Its cost per byte is still much
greater than disk space. Then there is the fact that the system can
only address a small, finite quantity of it (4 GB). If you want to
use the machine to process more than 4 GB of data at once, some of
that data will have to be in some other storage medium. At that
point, it is a good idea to keep the more important things in memory
and the less important things on disk. If you already have that
discipline in your application programming, then you already have a
system that scales up much bigger. Then there is the locality of
reference issue. Accessing a larger amount of memory at once results
in more cache misses, which dramatically slows the processor's
instruction rate. Cache misses are synchronous hits to CPU execution
(meaning they have to be considered a cost in terms of CPU cycles),
while I/O is always asynchronous. There is the allocator overhead.
Since RAM costs more per byte, it is desirable to use sophisticated
schemes to reduce or eliminate slack space. Those allocation
algorithms tend to have a much greater cost in CPU cycles than DASD
storage management, since it is acceptable to waste more of the
latter in order to reduce CPU usage.
There is also the important point that using a lot of memory does not
change the fact that the data must end up on disk anyway in order to
be in a permanent form, so there is some I/O involved even if you
wanted to have it all in memory all the time.
In general, system designs have evolved along a line of counting main
storage as a relatively small, finite, temporary, relatively
expensive storage medium, with a hierarchy of cheaper and more
permanent, but slower, storage media beyond it.
I think in many places there has been a trend toward programs (and
even system designs) that consider memory to be cheap and disdain I/O
as being expensive, and I think this trend has had a negative impact
on the overall efficiency, cost, and scalability of our systems.
Post by Greg Smith
You are right in the sense that cache shouldn't be blindly applied to
solve a problem. But, it seems, you are making judgement calls against
code that you admittedly haven't even looked at.
Fair enough. But I didn't think we were talking about a specific
piece of code. This discussion began when I observed that I/O system
performance is important to the performance of an emulated mainframe,
and somebody suggested that perhaps having a lot of RAM would be a
better use of your money (when putting together a Hercules system)
than SCSI drives, etc. I only meant to say I disagree with that
statement.
Post by Greg Smith
I don't blindly make
coding decisions. I take measurements, I trace the code, I examine the
assembler. In some complicated tasks, like garbage collection, my
intuition as to what should work best is shown wrong.
I don't know you very well, but just from talking with you I would
tend to assume you are careful and astute. I never meant to suggest
otherwise.
I've seen a lot of people put long hours into profiling and tweaking
something so that its execution time in a vacuum is as short as
possible. I think that's the wrong thing to be profiling and
optimizing in the first place. There is a tradeoff between turnaround
time and resource usage that should be worked until the code in
question uses the least system resources it can for an
acceptable turnaround time in real-world usage. That's a much more
complicated problem than getting it to go as fast as possible in
isolation, but I suggest it is "the stuff" of performance management.
Post by Greg Smith
If you are serious that caching may be misapplied in hercules code
then please cite some examples.
I never meant to suggest that. I was trying to say that if I were to
advise where to put your money into an emulated mainframe system to
get good performance, I'd spend more on the I/O and disk drives than
on the memory. That's for any emulated mainframe, whether Hercules or
FLEX-ES. I haven't looked at the Hercules code, but it seems to work
quite well for the limited amount of stuff I've done with it so far.
Post by Greg Smith
Remember, hercules can run on, eg, linux-390. I can define my emulated
disks to be on a raid0 filesystem that spans multiple volumes across
multiple controllers and chpids. Or everything can be on a `lousy' ide
controller on my pc, which gets, btw, about 20MB/s.
True. Even better, you can define them to be on individual disks. I
am going with IDE RAID for my Hercules box, but I think you'd get
better performance going SCSI with a bunch of smaller drives (say 4-8
GB), and splitting your DASD between them. RAID is a case of taking a
bunch of slow, parallel things and converting them to a single fast,
serial thing. I think they are more advantageous as slow, parallel
things. For example, every drive in the array must seek on every
access in a RAID system. If you split DASD between the drives, a
single program can process data sequentially on a single drive
without a seek between each read or write, and without affecting the
performance of other programs at all. Also, disk units nowadays have
caches and read-ahead logic that works much better when each disk is
dedicated to a small number of tasks.
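Some rough numbers on that point (ballpark figures only): with an
average seek of around 9 ms and a 64 KB stripe, a drive in a shared
striped array that keeps jumping between 15 interleaved sequential
streams spends roughly 9 ms positioning for every ~3 ms of transfer at
20 MB/s, so its effective throughput falls to a small fraction of its
rated speed. A drive serving a single sequential stream seeks only
rarely and stays near its full transfer rate.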
One of my favorite analogies is the laundry. If you were designing a
public laundry facility that could handle 6 customers per hour, would
it be better to have one washer that completes a load in 10 minutes,
or 6 washers, each of which can complete a load in an hour, assuming
the cost is the same either way?
--Dan