Intel® Math Kernel Library 11.0.2 User Guide

Improving Performance on Intel Xeon Phi Coprocessors

To improve performance of Intel MKL on Intel Xeon Phi coprocessors, use the following tips, which are specific to Intel MIC Architecture. You can also use the general performance improvement recommendations in Coding Techniques.

Memory Allocation

Performance of many Intel MKL routines improves when input and output data reside in memory allocated with 2M pages. Larger pages let you address more memory with fewer pages, which reduces the overhead of translating between virtual and physical memory addresses compared to memory allocated with the default page size of 4K. For more information, refer to Intel® 64 and IA-32 Architectures Optimization Reference Manual and Intel® 64 and IA-32 Architectures Software Developer's Manual (connect to http://www.intel.com/ and enter the name of each document in the Find Content text box).
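To make the page-count argument concrete, the following minimal sketch computes how many pages are needed to cover a buffer at each page size. The helper name pages_needed and the two constants are hypothetical and are not part of Intel MKL.

```c
#include <stddef.h>

/* Page sizes discussed in this topic (illustrative constants). */
#define PAGE_4K ((size_t)4 * 1024)
#define PAGE_2M ((size_t)2 * 1024 * 1024)

/* Number of pages of size page_bytes needed to cover nbytes, rounded up.
   Hypothetical helper for illustration only. */
static size_t pages_needed(size_t nbytes, size_t page_bytes)
{
    return nbytes / page_bytes + (nbytes % page_bytes != 0);
}
```

For a 1 GB buffer (2^30 bytes), pages_needed returns 262144 with 4K pages and 512 with 2M pages, so far fewer address translations have to be tracked for the same data.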

To allocate memory with 2M pages, you can use the mmap system call with the MAP_HUGETLB flag.
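A minimal sketch of such an allocation on Linux follows. The helper names (round_up_2m, alloc_2m_pages, free_2m_pages) are hypothetical, and the sketch assumes that 2M huge pages are available on the system; if they are not, mmap fails and the helper returns NULL, so the caller needs a fallback.

```c
#define _GNU_SOURCE            /* exposes MAP_ANONYMOUS and MAP_HUGETLB on glibc */
#include <stddef.h>
#include <sys/mman.h>

#define HUGE_PAGE_BYTES ((size_t)2 * 1024 * 1024)

/* Round nbytes up to a whole number of 2M pages. */
static size_t round_up_2m(size_t nbytes)
{
    return (nbytes + HUGE_PAGE_BYTES - 1) / HUGE_PAGE_BYTES * HUGE_PAGE_BYTES;
}

/* Map an anonymous read/write region backed by huge pages.
   On success, returns the address and stores the mapped length in
   *mapped_bytes; returns NULL if the mapping cannot be created. */
static void *alloc_2m_pages(size_t nbytes, size_t *mapped_bytes)
{
    size_t len = round_up_2m(nbytes);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *mapped_bytes = len;
    return p;
}

/* Release a region obtained from alloc_2m_pages. */
static void free_2m_pages(void *p, size_t mapped_bytes)
{
    if (p != NULL)
        munmap(p, mapped_bytes);
}
```

The length is rounded up because a huge-page mapping covers whole pages; keep the stored length and pass it back to free_2m_pages when the buffer is no longer needed.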

For data that is larger than 2M and is transferred with offload pragmas, you can enable allocation of memory with 2M pages by setting the MIC_USE_2MB_BUFFERS environment variable to 2M. See Intel® Compiler User and Reference Guides for more details.

Specifying the maximum coprocessor memory that can be used for Automatic Offload computations typically improves performance because Intel MKL can then reserve the memory and keep it on the coprocessor during Automatic Offload computations. You can specify the maximum memory by setting the MKL_MIC_MAX_MEMORY environment variable to a value such as 2 GB.
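These variables are normally set in the shell before the application starts. As a hedged alternative, the sketch below sets them from the host program with the POSIX setenv function. The helper name set_offload_memory_env is hypothetical, and the sketch assumes that the variables are read only after this call returns, that is, before the first offload and before the first Intel MKL call; this topic does not confirm when they are read.

```c
#include <stdlib.h>

/* Set the memory-related variables described in this topic for the
   current process. max_memory is passed through unchanged as the value
   of MKL_MIC_MAX_MEMORY. Returns 0 on success, -1 on failure.
   Hypothetical helper; not part of Intel MKL. */
static int set_offload_memory_env(const char *max_memory)
{
    if (max_memory == NULL || max_memory[0] == '\0')
        return -1;

    /* Request 2M pages for offloaded buffers larger than 2M. */
    if (setenv("MIC_USE_2MB_BUFFERS", "2M", 1) != 0)
        return -1;

    /* Upper limit on coprocessor memory for Automatic Offload. */
    if (setenv("MKL_MIC_MAX_MEMORY", max_memory, 1) != 0)
        return -1;

    return 0;
}
```

Call the helper once at the start of the program, passing the limit you want as a string. Check the Intel MKL documentation for the exact spelling of the value that MKL_MIC_MAX_MEMORY accepts.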

OpenMP and Threading Settings

To improve performance of Intel MKL routines, use the following OpenMP and threading settings:

Data Alignment and Leading Dimensions

To improve performance of Intel MKL FFT functions, follow these recommendations:

For other Intel MKL function domains, use general recommendations for data alignment.

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804
