Thursday, 19 November 2015

Intel Platform Shared Resource Monitoring and Cache Allocation Technology

The Intel Platform Shared Resource Monitoring features were introduced in the Intel Xeon E5 v3 processor family. These new features provide a mechanism to measure the use of platform shared resources, such as L3 cache occupancy via Cache Monitoring Technology (CMT) and memory bandwidth utilisation via Memory Bandwidth Monitoring (MBM).
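As a rough way to see which of these features a given machine exposes, one can look at the CPU flags in /proc/cpuinfo. The sketch below assumes a Linux kernel recent enough to report the cqm_occup_llc, cqm_mbm_total, cqm_mbm_local and cat_l3 flags (older kernels may omit some or all of them even on capable hardware), and the rdt_flags helper name is purely illustrative:

```shell
# Print whether each monitoring/allocation flag appears in a cpuinfo-style
# file. The optional argument (default: /proc/cpuinfo) exists only so the
# check can be tried against a sample file.
rdt_flags() {
    for flag in cqm_occup_llc cqm_mbm_total cqm_mbm_local cat_l3; do
        # -w matches whole words only, so "cqm" does not match "cqm_llc"
        if grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw "$flag"; then
            echo "$flag: yes"
        else
            echo "$flag: no"
        fi
    done
}

rdt_flags
```

A "no" here only means the kernel does not report the flag; pqos performs its own detection and remains the better authority.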

Intel have written a Platform Quality of Service Tool (pqos) to use these monitoring features and I've packaged this up for Ubuntu 16.04 Xenial Xerus.

To install, use:

sudo apt-get install intel-cmt-cat

The tool requires access to the Intel MSRs (model-specific registers), so the msr kernel module must be loaded if it is not already:

sudo modprobe msr
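When the msr driver is already loaded, or built into the kernel, its device nodes should already be present, so the modprobe can be skipped. A minimal sketch, assuming the usual /dev/cpu/0/msr device path (the msr_available helper name is just for illustration):

```shell
# Succeed if the MSR device node exists. The optional argument
# (default: /dev/cpu/0/msr) exists only so the check can be tested
# against an arbitrary path.
msr_available() {
    [ -e "${1:-/dev/cpu/0/msr}" ]
}

# Load the module only when the device node is missing:
#   msr_available || sudo modprobe msr
```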

To see the Last Level Cache (llc) utilisation on a system, listing the most used first, use:

sudo pqos -T

pqos running on a 48 thread Xeon based server

The -p option allows one to select monitoring events for specific process IDs. Event types can be Last Level Cache (llc), Local Memory Bandwidth (mbl) and Remote Memory Bandwidth (mbr). For example, on a Xeon E5-2680 I have just Last Level Cache monitoring capability, so let's view the llc for stress-ng while running some VM stressor tests:

sudo pqos -T -p llc:$(pidof stress-ng | tr ' ' ',')
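The command above relies on pidof printing at least one PID. A slightly more defensive sketch builds the comma-separated list first and only runs pqos when the list is non-empty; the pid_list helper name is hypothetical:

```shell
# Join a whitespace-separated PID list with commas, squeezing repeated
# separators and trimming any leading or trailing comma.
pid_list() {
    printf '%s\n' "$1" | tr -s ' \n' ',' | sed 's/^,//; s/,$//'
}

# Usage:
#   pids=$(pid_list "$(pidof stress-ng)")
#   [ -n "$pids" ] && sudo pqos -T -p "llc:$pids"
```

If stress-ng is not running, the list is empty and pqos is never invoked with a malformed "llc:" argument.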

pqos showing equally shared cache between two stressor processes

Cache and Memory Bandwidth monitoring is especially useful for examining the impact of memory/cache hogging processes (such as VM instances). pqos allows one to identify these processes simply and effectively.

Future Intel Xeon processors will provide capabilities to assign cache resources to specific classes of service using Intel Cache Allocation Technology (CAT). The pqos tool allows one to modify the CAT settings; however, not having access to a CPU with these capabilities, I was unable to experiment with this feature. I refer you to the pqos manual for more details on this useful feature. The beauty of CAT is that it allows one to tweak and fine tune the cache allocation for specific demanding use cases. Given that the cache is a shared resource that can be impacted by badly behaving processes, the ability to tune the cache behaviour is potentially a big performance win.
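CAT describes each class of service with a capacity bitmask in which the set bits must be contiguous, one bit per portion of the cache. As an untested illustration of how such a mask might be computed, here is a small sketch; the cat_mask helper is hypothetical, and how the resulting value is handed to pqos should be checked against the pqos manual:

```shell
# Print a contiguous capacity bitmask in hex covering $1 cache ways,
# starting at way $2 (default 0).
cat_mask() {
    printf '0x%x\n' $(( ((1 << $1) - 1) << ${2:-0} ))
}

cat_mask 4      # four ways starting at way 0
cat_mask 4 4    # the next four ways, not overlapping the first mask
```

Giving two classes non-overlapping masks like these is the sort of partitioning that would keep a cache-hungry process from evicting another's data.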

For more details of these features, see the Intel 64 and IA-32 Architectures Software Developer's Manual, sections 17.15 "Platform Shared Resource Monitoring: Cache Monitoring Technology" and 17.16 "Platform Shared Resource Control: Cache Allocation Technology".

4 comments:

  1. WARN: Cache allocation not supported on model name 'Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz'!
    PID 31956 monitoring start error,status 1

    running on Ubuntu 14.04.3 LTS (3.16.0-53-generic #72~14.04.1-Ubuntu SMP Fri Nov 6 18:17:23 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux)
    built from source (github)

  2. Cool blog! Thanks for your interest in pqos technology.

    The first processor family broadly enabling CAT is Intel (R) Xeon (R) Processor D - see: http://ark.intel.com/products/family/87041/Intel-Xeon-Processor-D-Family#@All

    There are a couple of server products based on this processor on the market already, and access to CAT-enabled systems should get easier over time.

  3. Amazing. Can't wait to try this :)

  4. Nice article, thanks for sharing.
