Peter Hunkeler
2018-06-07 07:27:01 UTC
There are some statements around zIIP utilization which I read here and there. Statements like:
- "You should not utilize one zIIP more than 30%, two zIIPs more than 60%..."
- "A task may become delayed for up to 3.2 ms (actually ZIIPAWMT) before the busy zIIP asks for help from a CP".
For this discussion, let's assume equal-speed CPs and zIIPs, a reasonable CP-to-zIIP ratio, and more than one processor of each kind.
It has long been a strength of IBM Z (and all its predecessors) that the CPs in an LPAR can be utilized well above 90% without major problems arising. I understand that this has changed lately, but some 85% (?) should still be fine.
Now, all work running on zIIPs was once work running on CPs (and still is if there are no zIIPs). So the work is no different (apart from much being run under an SRB instead of a TCB), and the response time requirement is no different. Right?
If so, how come busy zIIPs are said to be more of a problem than busy CPs? If the work can accept some queueing when run on CPs, why not when run on zIIPs? Queueing theory should apply equally to both.
When a processor is 50% busy, then 50% of the time there is at least one ready task, the one executing. Maybe there are some more waiting on the work queue. But this 50% says nothing about the delay of the tasks on the work queue.
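To illustrate that a utilization figure alone does not determine queueing delay, here is a minimal sketch using the textbook M/M/1 formula for the mean wait in queue. It assumes a single processor, Poisson arrivals and exponentially distributed service times, which real z/OS work need not follow; the function name and the sample service times are made up for illustration.

```python
def mm1_mean_wait_ms(utilization: float, service_ms: float) -> float:
    """Mean time spent waiting in queue for an M/M/1 system.

    Textbook result: Wq = rho / (1 - rho) * mean service time.
    Assumption: one server, Poisson arrivals, exponential service times.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (1.0 - utilization) * service_ms


# Same 50% utilization, very different mean delays, depending only on
# how long each piece of work holds the processor (hypothetical values).
for service_ms in (0.5, 1.0, 5.0):
    wait = mm1_mean_wait_ms(0.5, service_ms)
    print(f"50% busy, {service_ms} ms service time: mean wait {wait} ms")
```

Under these assumptions the mean wait at 50% busy simply equals the mean service time, so the same utilization can mean half a millisecond of delay or several milliseconds.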
In a simplified case, assume 5 tasks of equal priority, each of which quickly, say after 0.5 ms, reaches the point where it has to give up the processor for a very short time before being requeued on the work queue. They all work that way constantly for 30 seconds in a row, then become undispatchable for the remaining 30 seconds of that 50% busy minute. During the first 30 seconds, the zIIP is 100% busy, and after 3.2 ms (ZIIPAWMT), the zIIP will ask a CP for help.
None of the tasks has been delayed by 3.2 ms (each waits only for the other four, about 2 ms), although the zIIP recognized that its work queue had not been empty for 3.2 ms and asked for help. On the contrary, the work has gotten better service because two processors are now serving the single work queue. (Again for simplicity, priorities are not taken into account.)
Same case, but the tasks work for 1 ms each time. Now, as long as the zIIP has not asked for help, it always takes more than 3.2 ms before the last task on the work queue is redispatched. But the zIIP will ask for help after 3.2 ms, and the delay for the tasks will shrink.
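The arithmetic behind the two cases can be sketched as follows. This is a simplified model, not a description of the actual dispatcher: it assumes strict round robin, all tasks always ready, a negligible off-queue time, and that a helping CP serves the same queue at the same speed. The mean wait follows from Little's law for a closed system; the function name is made up, and 3.2 ms is the ZIIPAWMT value quoted above.

```python
ZIIPAWMT_MS = 3.2  # threshold value as quoted in this post


def mean_redispatch_wait_ms(n_tasks: int, slice_ms: float,
                            n_processors: int = 1) -> float:
    """Mean wait on the work queue between two dispatches of one task.

    Assumptions (simplified model): every task is always ready, runs for
    slice_ms, then requeues immediately; all processors are always busy.
    Each task's cycle then takes n_tasks * slice_ms / n_processors,
    of which slice_ms is spent executing and the rest waiting.
    """
    if n_tasks <= n_processors:
        return 0.0  # a processor is always free, nobody waits
    return (n_tasks - n_processors) * slice_ms / n_processors


for slice_ms in (0.5, 1.0):
    alone = mean_redispatch_wait_ms(5, slice_ms)
    helped = mean_redispatch_wait_ms(5, slice_ms, n_processors=2)
    print(f"slice {slice_ms} ms: wait {alone} ms on one zIIP "
          f"(exceeds ZIIPAWMT: {alone > ZIIPAWMT_MS}), "
          f"{helped} ms once a CP helps")
```

With 0.5 ms slices each task waits 2 ms on a single zIIP, below the threshold; with 1 ms slices it waits 4 ms, and in this model the wait drops to a mean of 1.5 ms once a second processor serves the queue.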
Isn't this a better situation for zIIP work than for non-zIIP work? In the same scenario on CPs, there is no one to help.
Any thoughts?
--
Peter Hunkeler
----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to ***@listserv.ua.edu with the message: INFO IBM-MAIN