author | jhendersonHDF <jhenderson@hdfgroup.org> | 2023-10-24 02:06:44 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-10-24 02:06:44 (GMT) |
commit | af49eb5b8647e8d9ffb527fd533def0910eb535c (patch) | |
tree | f4f368a972fd547b3a04b3c9682758c148c3dcff /release_docs | |
parent | ceb03358a1d713078ae36bfff07be62b433d970a (diff) | |
Fix hangs during collective I/O with independent metadata writes (#3693)
Diffstat (limited to 'release_docs')
-rw-r--r-- | release_docs/RELEASE.txt | 19 |
1 files changed, 19 insertions, 0 deletions
diff --git a/release_docs/RELEASE.txt b/release_docs/RELEASE.txt
index 222c277..e5d53e4 100644
--- a/release_docs/RELEASE.txt
+++ b/release_docs/RELEASE.txt
@@ -392,6 +392,25 @@ Bug Fixes since HDF5-1.14.0 release
 ===================================
     Library
     -------
+    - Fixed potential hangs in parallel library during collective I/O with
+      independent metadata writes
+
+      When performing collective parallel writes to a dataset where metadata
+      writes are requested as (or left as the default setting of) independent,
+      hangs could potentially occur during metadata cache sync points. This
+      was due to incorrect management of the internal state tracking whether
+      an I/O operation should be collective or not, causing the library to
+      attempt collective writes of metadata when they were meant to be
+      independent writes. During the metadata cache sync points, if the number
+      of cache entries being flushed was a multiple of the number of MPI ranks
+      in the MPI communicator used to access the HDF5 file, an equal amount of
+      collective MPI I/O calls were made and the dataset write call would be
+      successful. However, when the number of cache entries being flushed was
+      NOT a multiple of the number of MPI ranks, the ranks with more entries
+      than others would get stuck in an MPI_File_set_view call, while other
+      ranks would get stuck in a post-write MPI_Barrier call. This issue has
+      been fixed by correctly switching to independent I/O temporarily when
+      writing metadata independently during collective dataset I/O.
     - Dropped support for MPI-2