Mailing List picongpu-users@hzdr.de Message #280
From: Axel Huebl a.huebl@hzdr.de <picongpu-users@hzdr.de>
Subject: OpenMPI: Use ROMIO for IO
Date: Mon, 21 Jan 2019 12:26:38 +0100
To: <picongpu-users@hzdr.de>
Dear Users,


we rely on several dependencies from our community to develop PIConGPU. As you might know, one of these dependencies is MPI, which is used for multi-node message passing in HPC and is implemented independently by projects such as MPICH and OpenMPI.

MPI is also used for I/O operations and is therefore a dependency of HDF5 and ADIOS, which we use in our plugins for parallel I/O. Unfortunately, (all) recent releases of the OpenMPI implementation have an issue that you might want to mitigate.

Starting with OpenMPI 2.x, OpenMPI's default backend for MPI-I/O is OMPIO.

Unfortunately, that backend contains severe bugs that lead to data corruption and sporadic crashes (as of the latest releases, 3.1.3 and 4.0.0). This is most visible with our parallel HDF5 plugin, but ADIOS is potentially affected as well. Please see https://github.com/open-mpi/ompi/issues/6285 for details.

For all system templates (`.tpl` files for `tbg`) that rely on OpenMPI (and its derivatives, such as BullMPI), we now disable the default "OMPIO" IO backend and fall back to the existing ROMIO backend for MPI-I/O until bugfix releases are shipped.

  https://github.com/ComputationalRadiationPhysics/picongpu/pull/2857

Please apply these changes manually to your `etc/picongpu/<system-name>/<queue>.tpl` files now. We recommend mitigating this issue right away, since the resulting data corruption might go unnoticed even if you do not see any crashes.
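
As a rough sketch of what such a manual change could look like (please check the pull request linked above for the exact lines used in our templates): OpenMPI reads MCA parameters from environment variables prefixed with `OMPI_MCA_`, and a leading `^` in the value excludes the listed components. The placement inside the `.tpl` file (before the line that launches PIConGPU) and the optional `ompi_info` check are assumptions for illustration, not taken from the templates.

```shell
# Assumption: place this near the other `export` lines of your .tpl file,
# before the mpiexec/srun call that starts PIConGPU.

# Exclude the OMPIO component from OpenMPI's "io" framework, so that
# MPI-I/O falls back to the remaining backend (ROMIO).
# See https://github.com/open-mpi/ompi/issues/6285
export OMPI_MCA_io="^ompio"

# Optional sanity check (illustrative only): if this MPI is OpenMPI,
# print its version and the MPI-I/O components it was built with.
if command -v ompi_info >/dev/null 2>&1; then
    ompi_info | grep -E "Open MPI:|MCA io" || true
fi
```

The same parameter can usually also be passed on the command line as `--mca io ^ompio` to OpenMPI's `mpiexec`; setting the environment variable has the advantage that it does not depend on how the job is launched.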

Other MPI implementations such as MPICH and [MPICH-based flavors](https://www.mpich.org/about/collaborators/) such as IntelMPI use ROMIO by default (the MPICH project develops ROMIO) and are not affected.



Best regards,
Axel
--

Axel Huebl
Phone: +49 351 260 3582
Institute of Radiation Physics
http://www.hzdr.de/crp
Helmholtz-Zentrum Dresden - Rossendorf (HZDR)
Bautzner Landstr. 400 | 01328 Dresden | Germany
Board of Directors:
Prof. Dr. Dr. h. c. Roland Sauerbrey, Dr. Ulrich Breuer
Company Registration Number VR 1693, Amtsgericht Dresden