During my talk at the parallel 2015 conference I was asked how one can measure traffic on the PCI Express bus. For multi-GPU computing it is very important to keep the amount of data exchanged over the PCIe bus under control.
You need the Intel Performance Counter Monitor.
Compile it and copy pcm-pcie.exe into a new directory.
Then read this helpful article on how to obtain the WinRing0 DLLs and drivers. Copy them into the same directory, start cmd.exe as an administrator, and there you go.
Now you can analyse the traffic on the PCIe bus:
    C:\IntelPerformanceCounterMonitorV2.8\bin> pcm-pcie.exe

    DEBUG: Setting Ctrl+C done.

    Intel(r) Performance Counter Monitor: PCIe Bandwidth Monitoring Utility

    Copyright (c) 2013-2014 Intel Corporation

    This utility measures PCIe bandwidth in real-time

    PCIe event definitions (each event counts as a transfer):
      PCIe read events (PCI devices reading from memory - application writes to disk/network/PCIe device):
        PCIePRd    - PCIe UC read transfer (partial cache line)
        PCIeRdCur* - PCIe read current transfer (full cache line)
            On Haswell Server PCIeRdCur counts both full/partial cache lines
        RFO*       - Demand Data RFO
        CRd*       - Demand Code Read
        DRd        - Demand Data Read
        PCIeNSWr   - PCIe Non-snoop write transfer (partial cache line)
        PRd        - MMIO Read [Haswell Server only: PL verify this on IVT] (Partial Cache Line)
      PCIe write events (PCI devices writing to memory - application reads from disk/network/PCIe device):
        PCIeWiLF   - PCIe Write transfer (non-allocating) (full cache line)
        PCIeItoM   - PCIe Write transfer (allocating) (full cache line)
        PCIeNSWr   - PCIe Non-snoop write transfer (partial cache line)
        PCIeNSWrF  - PCIe Non-snoop write transfer (full cache line)
        ItoM       - PCIe write full cache line
        RFO        - PCIe parial Write
        WiL        - MMIO Write (Full/Partial)

    * - NOTE: Depending on the configuration of your BIOS, this tool may report '0' if the message
        has not been selected.

    Starting MSR service failed with error 2 The system cannot find the file specified.
    Trying to load winring0.dll/winring0.sys driver...
    Using winring0.dll/winring0.sys driver.

    Number of physical cores: 6
    Number of logical cores: 12
    Number of online logical cores: 12
    Threads (logical cores) per physical core: 2
    Num sockets: 1
    Physical cores per socket: 6
    Core PMU (perfmon) version: 3
    Number of core PMU generic (programmable) counters: 4
    Width of generic (programmable) counters: 48 bits
    Number of core PMU fixed counters: 3
    Width of fixed counters: 48 bits
    Nominal core frequency: 3500000000 Hz
    Package thermal spec power: 140 Watt; Package minimum power: 47 Watt; Package maximum power: 0 Watt;
    2 memory controllers detected with total number of 5 channels.

    Detected Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz "Intel(r) microarchitecture codename Haswell-EP/EN/EX"

    Update every 1 seconds
    delay_ms: 84

    Skt | PCIeRdCur | RFO |  CRd   |  DRd   | ItoM | PRd  | WiL
     0       4236     576   5980 K   1536 K    48    3456   7116
    -------------------------------------------------------------
     *       4236     576   5980 K   1536 K    48    3456   7116
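To get a feeling for what these counters mean in bytes, the following Python sketch converts one row of the table into a rough per-event byte estimate. It is not part of the Intel tool. The names `parse_row` and `estimate_bytes` are hypothetical, and the sketch rests on two assumptions: that the `K`, `M` and `G` suffixes are decimal multipliers, and that every event moves one full 64-byte cache line. Several events are partial-line transfers, so the result is an upper bound rather than an exact figure.

```python
# Rough byte estimate from one pcm-pcie.exe table row (illustrative sketch).
CACHE_LINE_BYTES = 64  # assumption: every event moves one full cache line

# Assumption: the tool abbreviates large counts with decimal suffixes.
SUFFIXES = {"K": 10**3, "M": 10**6, "G": 10**9}

# Column names as they appear in the table header above.
COLUMNS = ["PCIeRdCur", "RFO", "CRd", "DRd", "ItoM", "PRd", "WiL"]


def parse_row(row):
    """Split a data row into (socket, list of integer event counts)."""
    tokens = row.split()
    socket, counts = tokens[0], []
    for tok in tokens[1:]:
        if tok in SUFFIXES:
            # A suffix token scales the number that precedes it.
            counts[-1] *= SUFFIXES[tok]
        else:
            counts.append(int(tok))
    return socket, counts


def estimate_bytes(row):
    """Map each event column to its estimated byte count for the interval."""
    _, counts = parse_row(row)
    return {name: n * CACHE_LINE_BYTES for name, n in zip(COLUMNS, counts)}


if __name__ == "__main__":
    row = "0 4236 576 5980 K 1536 K 48 3456 7116"
    for name, nbytes in estimate_bytes(row).items():
        print(f"{name:>10}: {nbytes / 1e6:10.3f} MB per update interval")
```

Since the tool updates once per second in the run above, the per-interval estimate can be read directly as an approximate rate per second.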
Remark: This post was adapted to the new blog format in November 2016.