Intel® Math Kernel Library 11.0.2 User Guide
Intel MKL supports Intel® Xeon Phi™ coprocessors in the following modes:
Native
Offload
    Automatic Offload
    Hybrid Offload
    Compiler Assisted Offload
For details of these modes, see Running Intel MKL on Intel MIC Architecture Coprocessors in Native Mode.
Native mode is required to run MPI processes directly on the Intel Xeon Phi coprocessors. If the MPI processes run solely on the Intel® Xeon® processors, the coprocessors are used in one of the offload modes.
In many cases, the Intel Xeon Host processor has more memory than the Intel Xeon Phi coprocessor. Therefore, the MPI processes have access to more memory when they run on the Hosts rather than on the coprocessors.
HPL is a homogeneous code by nature: it requires each MPI process to run in an environment with similar CPU and memory constraints. If one node is twice as powerful as another, you might compensate by running two MPI processes on the faster node. You can balance for either performance or memory; the choice is up to you.
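As a rough illustration of the balancing described above, the following sketch assigns MPI process counts in proportion to each node's relative performance. The function name, the node names, and the performance figures are hypothetical and are not part of Intel MKL or MP LINPACK; how you measure node performance is an assumption left to you.

```python
def processes_per_node(relative_power):
    """Return a process count per node, proportional to its relative power.

    relative_power maps a (hypothetical) node name to a positive number,
    for example a measured throughput. The weakest node gets one process.
    """
    weakest = min(relative_power.values())
    return {
        node: max(1, round(power / weakest))
        for node, power in relative_power.items()
    }


# Example: one node is twice as powerful as the other (assumed figures).
layout = processes_per_node({"fast-node": 2.0, "slow-node": 1.0})
print(layout)
```

The same approach can balance for memory instead of performance by passing each node's memory size as its weight.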
Because of this homogeneous behavior of the standard MP LINPACK benchmark, avoid running MPI processes on the Intel Xeon Phi coprocessors. The Host (or Hosts) probably has more memory than the coprocessors, and high performance is difficult to achieve when the problem size is bounded by the memory of the coprocessors that the MPI processes can access. Use an offload method instead.
To maximize performance, increase the memory on the Host(s) (64 GB per coprocessor is ideal) and run a large problem, offloading pieces of work to the coprocessors. Although this method increases the PCIe bus traffic, the cost is worthwhile when the problem is large enough.
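To see why the memory available to an MPI process limits the problem size, recall that HPL factors a dense N-by-N matrix of double-precision values, which occupies about 8*N*N bytes. The sketch below estimates the largest N that fits in a given amount of memory for a single process holding the whole matrix. The 80% usable fraction is a common rule of thumb rather than an Intel MKL setting, and the 8 GB coprocessor memory size is an assumed figure chosen only for illustration.

```python
import math

BYTES_PER_DOUBLE = 8
GIB = 1024 ** 3


def max_problem_size(mem_bytes, usable_fraction=0.8):
    """Largest N such that an N x N matrix of doubles fits in the usable memory.

    usable_fraction is an assumed rule of thumb that leaves room for the
    operating system and work buffers.
    """
    usable_elements = int(mem_bytes * usable_fraction) // BYTES_PER_DOUBLE
    return math.isqrt(usable_elements)


host_n = max_problem_size(64 * GIB)   # 64 GB of Host memory, as suggested above
coproc_n = max_problem_size(8 * GIB)  # assumed coprocessor memory size

print("Host-side N:       ", host_n)
print("Coprocessor-side N:", coproc_n)
```

Under these assumptions, the Host-side problem dimension is about sqrt(8), or roughly 2.8 times, larger than the coprocessor-side one.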
If the amount of memory on the Hosts is small, you might get the best performance by running natively instead of offloading.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804