Optimizing the data pipeline: the RAMSES code use case for astrophysics

  • Speaker: Loïc Strafella
  • Date: 19 January 2024, from 14:00
  • Location: Salle Jean Lascoux (CPHT, wing 0, 1st floor)

Abstract

The European Exascale Software Initiative (EESI) highlighted, a decade ago, the technical challenges that code scalability would pose with the arrival of Exascale machines. Among the many issues discussed, data-related challenges stand out: the volume of data generated by future numerical simulations is expected to become so large that storage, transfer, and analysis will each be a major technical challenge. We will present the case of RAMSES, a massively parallel simulation code widely used in astrophysics, which faced data-related scalability issues. To carry out a state-of-the-art large-scale simulation of the regulation of star formation in the Milky Way, it was first necessary to implement technical solutions that optimize the entire data pipeline, from the code's input/output to the analysis methods. We will show that this simulation became feasible by reconsidering "stream separation", the data models used, and the compression algorithms. The resulting gains amount to a 20-fold improvement in data read and write times and a 24-fold reduction in the volume of data produced. Going further, we will discuss the importance of "in situ" and "in transit" processing approaches, as highlighted in the EESI recommendations for the transition to Exascale computing.
