Summary of Debugging Tools for Parallel Applications

Nowadays it's not uncommon to run parallel applications with hundreds of thousands of processes on supercomputing platforms. Debugging such applications when they show sporadic crashes, deadlocks, memory errors or incorrect results is a challenging task. There are a number of tools available that help identify and fix bugs, but one needs to understand the tools, their capabilities and when they can be used. This post tries to summarise various debugging tools (open source as well as commercial).

Note that not all tools can be used with distributed applications. For example, open source tools like GDB and Valgrind are commonly used for debugging serial and multi-threaded applications.

Summary of Profiling Tools for Parallel Applications

Many scientific and industrial applications run on everything from workstations to the largest supercomputers in the world. With the continuous evolution of hardware platforms, achieving good performance is a challenging task. There are many profiling tools available to analyse and optimise performance, but not all tools and methods are available on every platform, especially in high performance computing. The first step in a performance engineering workflow is to understand which tools are available and when they can be used. There is no one-size-fits-all solution: some tools are designed with a broad feature list for high-level analysis, while others target a specific platform and expose low-level hardware metrics.

When choosing a profiling tool, one needs to consider different aspects:

  • Goal: Are you interested in high-level performance metrics?


I have a great interest in learning about, understanding and experimenting with performance tools. After years of scattered notes and wiki pages, I had been planning to write this blog for a long time, and I am finally trying to put it all together!

Over the years I have used a number of performance analysis and optimisation tools for supercomputing platforms. I have been working with different scientific applications written in C, C++, Fortran, Python, Matlab, Octave and Java. Often these applications run on workstations, small clusters and the largest supercomputers in the world. This has given me the opportunity to work with different systems including Intel/IBM/ARM CPUs, NVIDIA/AMD GPUs, Intel MIC, IBM BlueGene, Fujitsu and Cray.