author | Jonathan Kim <jkm@hdfgroup.org> | 2012-08-01 17:07:46 (GMT) |
---|---|---|
committer | Jonathan Kim <jkm@hdfgroup.org> | 2012-08-01 17:07:46 (GMT) |
commit | 99ef5765f5e25402d97f1355c0e049dc2b746228 (patch) | |
tree | 66c7d32bd1136473682e7fcdfe582226fe00bce9 | |
parent | 840ad091059877ed68fcc840f23ad633177c6f59 (diff) | |
download | hdf5-99ef5765f5e25402d97f1355c0e049dc2b746228.zip hdf5-99ef5765f5e25402d97f1355c0e049dc2b746228.tar.gz hdf5-99ef5765f5e25402d97f1355c0e049dc2b746228.tar.bz2 |
[svn-r22618]
Purpose:
HDFFV-8003 - ph5diff (parallel h5diff): daily test failure on ember intermittently during non comparable test file comparison
HDFFV-7755 - parallel h5diff : hanging on koala intermittently during non comparable test file comparison
Description:
The non-comparable test hung intermittently on koala and ember, but not on jam. The hang did not occur unless -np was 4 or greater, and even then only once in many repeated runs of the same test.
There was incorrectly (mistakenly?) duplicated code in the MPI section, which caused the hang under certain conditions. The test used more processes than other tests, which increased the chance of leaving more unfinished processes, and such a process could enter the incorrect code section and wait for a send with the wrong pairing. This explains why the hang occurred intermittently, depending on machine conditions and on use of a certain feature.
Removed the incorrect code, which was blocking the correct code.
Tested: manually repeated test runs on
jam (linux32-LE), koala (linux64-LE), ostrich (linuxppc64-BE)
-rw-r--r-- | release_docs/RELEASE.txt | 4 |
-rwxr-xr-x | tools/h5diff/testh5diff.sh | 9 |
-rw-r--r-- | tools/lib/h5diff.c | 8 |
3 files changed, 5 insertions, 16 deletions
diff --git a/release_docs/RELEASE.txt b/release_docs/RELEASE.txt
index a6b5d96..33599b1 100644
--- a/release_docs/RELEASE.txt
+++ b/release_docs/RELEASE.txt
@@ -705,6 +705,10 @@ Bug Fixes since HDF5-1.8.0 release
     Tools
     -----
+    - ph5diff: Fixed intermittent hang issue on a certain operation in
+      parallel mode. It was detected by daily test for comparing
+      non-comparable objects, but it could have occurred in other
+      operations depend on machine condition. HDFFV-8003 (JKM 2012/08/01)
     - h5diff: Fixed test failure for "make check" due to failure of
       copying test files when performed in HDF5 source tree. Also applied
       to other tools.
diff --git a/tools/h5diff/testh5diff.sh b/tools/h5diff/testh5diff.sh
index 86a0c9d..7e95e80 100755
--- a/tools/h5diff/testh5diff.sh
+++ b/tools/h5diff/testh5diff.sh
@@ -836,14 +836,7 @@ TOOLTEST h5diff_221.txt -c non_comparables1.h5 non_comparables2.h5 /g2
 # entire file
 # All the comparables should display differences.
-if test -n "$pmode"; then
-    # parallel mode:
-    # skip due to ph5diff hangs on koala (linux64-LE) and ember intermittently.
-    # (HDFFV-8003 - TBD)
-    SKIP -c non_comparables1.h5 non_comparables2.h5
-else
-    TOOLTEST h5diff_222.txt -c non_comparables1.h5 non_comparables2.h5
-fi
+TOOLTEST h5diff_222.txt -c non_comparables1.h5 non_comparables2.h5
 
 # non-comparable test for common objects (same name) with different object types
 # (HDFFV-7644)
diff --git a/tools/lib/h5diff.c b/tools/lib/h5diff.c
index bcd63f1..0c1f3d3 100644
--- a/tools/lib/h5diff.c
+++ b/tools/lib/h5diff.c
@@ -1411,14 +1411,6 @@ hsize_t diff_match(hid_t file1_id, const char *grp1, trav_info_t *info1,
                 options->not_cmp = options->not_cmp | nFoundbyWorker.not_cmp;
                 busyTasks--;
             } /* end if */
-            else if(Status.MPI_TAG == MPI_TAG_TOK_RETURN)
-            {
-                MPI_Recv(&nFoundbyWorker, sizeof(nFoundbyWorker), MPI_BYTE, Status.MPI_SOURCE, MPI_TAG_DONE, MPI_COMM_WORLD, &Status);
-                nfound += nFoundbyWorker.nfound;
-                options->not_cmp = options->not_cmp | nFoundbyWorker.not_cmp;
-                busyTasks--;
-                havePrintToken = 1;
-            } /* end else-if */
             else if(Status.MPI_TAG == MPI_TAG_TOK_REQUEST)
             {
                 MPI_Recv(NULL, 0, MPI_BYTE, Status.MPI_SOURCE, MPI_TAG_TOK_REQUEST, MPI_COMM_WORLD, &Status);