Over the years I have used a range of performance analysis and optimisation tools on High Performance Computing platforms. I work with scientific applications written in C, C++, Fortran, Python, Matlab, Octave and Java. These applications run on everything from workstations and small clusters to some of the largest supercomputers in the world. This gives me the opportunity to work with a variety of systems, including Intel/IBM/ARM CPUs, NVIDIA/AMD GPUs, Intel MIC, IBM BlueGene, Fujitsu and Cray.

I have a great interest in learning about, understanding and experimenting with performance tools. After years of scattered notes and wiki pages, I had been planning to write this blog for a long time, and I am finally trying to put it all together!

This blog is part of my learning process, so not everything will be perfect. If you find it useful, the credit goes to the brilliant people who develop the tools and tutorials from which I am borrowing this information. If there are any inaccuracies or missing details, it’s because I am still learning.

If you have any questions or suggestions, or want to discuss specific aspects, I will be happy to hear from you!