Tuning and Optimization for HPC
Compiler
Node Level Optimization
- see: Intel Software Documentation Library
 - Quick-Reference Guide to Optimization with Intel® Compilers
 - Tutorial: Using Auto Vectorization
 - A Guide to Vectorization with Intel® C++ Compilers
 - Guide to Auto-Vectorization
 - Requirements for Vectorizable Loops
 - Tutorial: Finding Hotspots - Fortran Sample Application, Linux*
 - Get Started with Intel® VTune™ Amplifier
 - Developing Multithreaded Applications: A Platform Consistent Approach
 - Measuring and Understanding Memory Bandwidth
 
MPI
OpenMP
Improving OpenMP Scaling
- OpenMP home page: the central source of information about OpenMP
 - LRZ/RRZE Courses: OpenMP_2day_course.pdf
 
Tools
- Intel Performance Tools
 - Information on Hardware and Topology
 - Timing and Profiling
 - Hardware Performance Counters
 - MPI, OpenMP, Parallelization, Vectorization, SIMD Analysis
 - Memory Leaks