CRESTA boasts an impressive range of European partner tools for debugging and performance analysis. These include Allinea’s DDT debugger, KTH’s perfminer, and TUD’s Vampir tool suite and MUST runtime error detection tool (developed in collaboration with LLNL and the ASC Tri-Labs).

Performance analysis tools. With the increasing complexity and parallelism of HPC systems, it becomes more and more challenging to understand the runtime behaviour of applications. To reach the exascale in particular, close attention will need to be given to every area that can limit scalability. Analysing the performance of exascale applications, and more specifically detecting performance bottlenecks, is a challenging task that is only manageable with appropriate software tools. These tools need to cover all relevant aspects of exascale systems and must themselves scale sufficiently. Vampir, the leading performance analysis tool developed by project partner TUD, was one of CRESTA’s focuses.

Debugging tools. In terms of debugging, members of the consortium develop two tools: DDT, a parallel debugger from Allinea, and MUST, a runtime error detection tool developed by TUD in cooperation with Lawrence Livermore National Laboratory and the ASC Tri-Labs. Debugging tools are an important requirement for coping with the complexity of parallel systems in general and of future exascale systems in particular. However, exascale programming environments present challenges to debugging from many perspectives: the sheer quantity of program threads and processes, the programming models (e.g. MPI, PGAS, hybrid), and the potentially adaptive behaviour of fault tolerance strategies. Debuggers have been shown to cope with the petascale (Allinea’s DDT proved the feasibility of petascale debugging on the existing PRACE prototypes and some larger systems, reaching 220,000 cores), but open issues remain for the future. CRESTA’s work allowed for an early deployment of debugging tools and greatly simplified the development of exascale applications.