summaryrefslogtreecommitdiffstats
path: root/release_docs
diff options
context:
space:
mode:
authorjhendersonHDF <jhenderson@hdfgroup.org>2022-01-22 14:40:33 (GMT)
committerGitHub <noreply@github.com>2022-01-22 14:40:33 (GMT)
commit99d3962a831167298ebc087f0b8e8b6209034d95 (patch)
tree5c879275551180b76d0b14be52cdbf0a1b98ad8c /release_docs
parentd45124d7085de2771c0157f5d48d71b21a10de1f (diff)
downloadhdf5-99d3962a831167298ebc087f0b8e8b6209034d95.zip
hdf5-99d3962a831167298ebc087f0b8e8b6209034d95.tar.gz
hdf5-99d3962a831167298ebc087f0b8e8b6209034d95.tar.bz2
Parallel rank0 deadlock fixes (#1183)
* Fix several places where rank 0 can skip past collective MPI operations on failure * Committing clang-format changes Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>
Diffstat (limited to 'release_docs')
-rw-r--r--release_docs/RELEASE.txt12
1 files changed, 12 insertions, 0 deletions
diff --git a/release_docs/RELEASE.txt b/release_docs/RELEASE.txt
index d059fb3..bba27c9 100644
--- a/release_docs/RELEASE.txt
+++ b/release_docs/RELEASE.txt
@@ -1083,6 +1083,18 @@ Bug Fixes since HDF5-1.12.0 release
(DER - 2021/11/23, HDFFV-11286)
+ - Fixed several potential MPI deadlocks in library failure conditions
+
+ In the parallel library, there were several places where MPI rank 0
+ could end up skipping past collective MPI operations when some failure
+ occurs in rank 0-specific processing. This would lead to deadlocks
+ where rank 0 completes an operation while other ranks wait in the
+ collective operation. These places have been rewritten to have rank 0
+ push an error and try to cleanup after the failure, then continue to
+ participate in the collective operation to the best of its ability.
+
+ (JTH - 2021/11/09)
+
- Fixed an issue with collective metadata reads being permanently disabled
after a dataset chunk lookup operation. This would usually cause a
mismatched MPI_Bcast and MPI_ERR_TRUNCATE issue in the library for