SMT: What utilization is real?

Simultaneous Multithreading (SMT) was introduced to improve the throughput per core. On System z, if only one thread of a core is running, its performance is very close to that of a single-threaded environment. In your data center you most likely have a mixture of cores running one thread and cores running two. The question I am getting more and more often is: how much capacity is left, or how high is the real CPU utilization with SMT?
The command from the post on hyptop in LPAR provides the data needed to answer this question. Here is an abbreviated output:

system  #core #The    core     the  mgm      Core+       The+     Mgm+  online
(str)     (#)  (#)     (%)     (%)  (%)        (s)        (s)      (s)   (dhm)
T35LP76     8   16  799.20 1198.90 0.02     658.05     872.88     2.66 0:00:16
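
To work with these numbers programmatically, the data row can be split into named fields. The following is only a minimal sketch: it assumes the whitespace-separated layout of the sample above with exactly these ten columns, and the helper name `parse_row` and the field list `COLUMNS` are made up for illustration, not part of hyptop.

```python
# Field names loosely follow the header of the sample output (assumption:
# the row always has exactly these ten whitespace-separated columns).
COLUMNS = ["system", "#core", "#The", "core", "the", "mgm",
           "Core+", "The+", "Mgm+", "online"]


def parse_row(line: str) -> dict:
    """Split one data row of the sample output into named fields."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    row = dict(zip(COLUMNS, values))
    # Core and thread counts are integers.
    for name in ("#core", "#The"):
        row[name] = int(row[name])
    # Percentages and accumulated seconds are floats.
    for name in ("core", "the", "mgm", "Core+", "The+", "Mgm+"):
        row[name] = float(row[name])
    return row


row = parse_row(
    "T35LP76     8   16  799.20 1198.90 0.02     658.05     872.88     2.66 0:00:16"
)
# The three percentage columns are the inputs needed further down.
u_c, u_t, u_m = row["core"], row["the"], row["mgm"]
print(u_c, u_t, u_m)
```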

If we denote the core utilization by $u_c$, the thread utilization by $u_t$, and the management utilization by $u_m$, then the real utilization $u_r$ can be calculated as follows:

$u_r = \frac{2u_c-u_t}{s} + (u_t-u_c) + u_m$

where $s$ denotes the SMT speedup factor. This factor depends on the workload and on the hardware generation. For example, a workload that has to wait a lot for caches to be filled probably benefits more from SMT than a purely L1-bound compute workload. Hardware-wise, the efficiency has improved quite a bit since z13. Not knowing anything about the workload, I start with a rule of thumb of $s = 1.3$ for z15. For the example above this gives

$u_r = \frac{2 \cdot 799.2\% - 1198.9\%}{1.3} + (1198.9\%-799.2\%) + 0.02\% = 707.03\%$
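
The same calculation as a small Python sketch, in case you want to try other speedup factors. The function name `real_utilization` is made up for illustration, and the default $s = 1.3$ is only the rule of thumb from above. The headroom figure assumes a ceiling of 100% per core, which is what the formula yields when both threads of every core are busy all the time.

```python
def real_utilization(u_c: float, u_t: float, u_m: float, s: float = 1.3) -> float:
    """Real SMT utilization in percent, from core, thread and management utilization."""
    if s <= 0:
        raise ValueError("SMT speedup factor must be positive")
    single = (2 * u_c - u_t) / s  # core time with one thread busy, scaled by 1/s
    dual = u_t - u_c              # core time with both threads busy
    return single + dual + u_m


# Values from the sample output: core, the and mgm columns, 8 cores.
cores = 8
u_r = real_utilization(799.20, 1198.90, 0.02)
# Assumption: the capacity ceiling is 100% per core.
headroom = cores * 100.0 - u_r

print(f"real utilization: {u_r:.2f}%")   # 707.03%
print(f"headroom:         {headroom:.2f}%")  # 92.97%
```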

In total, the real SMT utilization in this example corresponds to a little more than seven IFLs. The three terms are:

The single-thread component, adjusted by the SMT speedup factor
The dual-thread component
The management overhead component

If both threads of every busy core were running all the time, the first term would be zero. If everything ran single-threaded, the second term would be zero.
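
Substituting the two extremes into the formula makes this explicit. With both threads busy whenever a core is busy, $u_t = 2u_c$ and

$u_r = \frac{2u_c - 2u_c}{s} + (2u_c - u_c) + u_m = u_c + u_m$

With only one thread busy per core, $u_t = u_c$ and

$u_r = \frac{2u_c - u_c}{s} + (u_c - u_c) + u_m = \frac{u_c}{s} + u_m$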
