| Commit message | Author | Age | Files | Lines |
| |
Clean up code, remove unused variables, remove "naked" printf()s, make
h5recover test script work in 'srcdir' build, etc. (The h5recover tests are
still failing, and the script prints "PASSED" even when they fail, but we'll
work on that more next.)
Tested on:
Mac OS X/32 10.5.6 (amazon)
| |
recognition
and rejection of attempts to apply the wrong journal file to a corrupt
HDF5 file. Specifically, I made the following changes:
1) Moved all journaling data into the journaling in progress superblock
extension message.
2) Added a "magic number" to the journaling in progress message, with
the same "magic number" being added to the header of the associated
journal file.
3) Modifications to library test code to support the above.
4) Modified h5recover to examine the supplied hdf5 file, determine if
it is in fact a HDF5 file, if so determine if it is marked as having
journaling in progress, and if it does, extract the contents of the
journaling in progress super block extension message.
5) Modified h5recover to examine the supplied journal file, determine
if it is in fact a HDF5 journal file, and if so, extract the data
from its header.
6) Modified h5recover to refuse to apply the supplied journal file to
the supplied HDF5 file unless the "magic numbers" obtained from these
files match.
7) Added an examine option to h5recover that causes it to examine and
report on the supplied files, but do nothing. This option exists
primarily to facilitate testing, but I expect that some users will
find it useful as well.
8) Added test code to exercise items 4-7. Note that while I have tried
to cover the more likely cases, this test code is extremely cursory.
In particular, the code to examine the supplied HDF5 file is barely
tested at all. Need a library of HDF5 files exhibiting the full range
of possible super block and super block extension message structures
to test this properly.
9) In passing, tightened up the code that controls dumps of "possibly
significant" differences between the contents of the control and
recovered data sets in h5recover. It should now ignore single-integer
matches in what appears to be garbage raw data.
Tested: serial and parallel on Phoenix
serial and parallel on Jam
serial on Linew
serial on Liberty
All tests were done in debug mode.
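The magic number check described in items 2 and 6 can be sketched roughly as follows. This is a minimal illustration only, not the h5recover code: the struct names, the field names, the result codes, and the 32-bit width of the magic number are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the "journaling in progress" superblock
 * extension message read from the HDF5 file. */
typedef struct {
    bool    journaling_in_progress;
    int32_t magic;                  /* width is an assumption */
} h5_jnl_msg_t;

/* Hypothetical summary of the journal file header. */
typedef struct {
    int32_t magic;
} jnl_header_t;

typedef enum {
    JNL_CHECK_OK = 0,           /* journal may be applied             */
    JNL_CHECK_BAD_ARGS,         /* one of the inputs was missing      */
    JNL_CHECK_NOT_MARKED,       /* file not marked as being journaled */
    JNL_CHECK_MAGIC_MISMATCH    /* journal belongs to another file    */
} jnl_check_t;

/* Refuse to pair a journal with an HDF5 file unless the file is marked
 * as journaling in progress and both magic numbers agree. */
jnl_check_t check_journal_applies(const h5_jnl_msg_t *msg,
                                  const jnl_header_t *hdr)
{
    if (msg == NULL || hdr == NULL)
        return JNL_CHECK_BAD_ARGS;
    if (!msg->journaling_in_progress)
        return JNL_CHECK_NOT_MARKED;
    if (msg->magic != hdr->magic)
        return JNL_CHECK_MAGIC_MISMATCH;
    return JNL_CHECK_OK;
}
```

An "examine" style mode, as in item 7, could call the same check and simply report the result instead of starting a recovery.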
| |
Tested only on Phoenix (serial Linux AMD64) as changes were minor, and
any errors should be caught in the next checkin.
| |
driver in
metadata cache tests.
1) Fixed the core file driver failures previously observed on some
platforms when running the cache2 tests with the core file driver
enabled. This was done by allocating all needed memory on file open.
2) Added code to cache2.c to allow the use of the core file driver to
be forced via the HDF5_DRIVER environment variable.
3) Added code to try to figure out whether using the core file driver
in cache2 makes sense, and then use it or not as seems appropriate
unless overridden via the HDF5_DRIVER environment variable.
This code only works under Linux and BSD (including MacOS). For now
at least, we use regular files in all other cases unless directed
otherwise.
Note that this required a fair bit of configuration code massage.
4) Updated Makefile.am in examples to run the new mdj_api_example.c
example. Forgot to "svn add mdj_api_example.c" before this checkin,
but will do so shortly.
Tested on:
Duty (serial), Liberty (serial), tejeda (serial), jam (serial and
parallel), and Phoenix (serial and parallel). Note that Phoenix is
now 64 bit AMD64 Linux.
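The selection logic in items 2 and 3 (an explicit HDF5_DRIVER setting wins, otherwise a heuristic decides) might be sketched as below. The function names and the boolean heuristic input are hypothetical, and the sketch assumes that "core" is the value that selects the core file driver; it does not reproduce the platform-specific detection code in cache2.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Decide whether to use the core (in-memory) file driver.
 * env_value:           value of HDF5_DRIVER, or NULL if it is unset.
 * core_seems_sensible: result of a platform-specific heuristic
 *                      (hypothetical input, e.g. "enough memory?"). */
bool choose_core_driver(const char *env_value, bool core_seems_sensible)
{
    if (env_value != NULL && env_value[0] != '\0')
        return strcmp(env_value, "core") == 0;  /* explicit override wins */
    return core_seems_sensible;                 /* else trust the heuristic */
}

/* Convenience wrapper that consults the environment. */
bool choose_core_driver_from_env(bool core_seems_sensible)
{
    return choose_core_driver(getenv("HDF5_DRIVER"), core_seems_sensible);
}
```

Keeping the decision in a function that takes the environment value as a parameter makes it easy to test without modifying the process environment.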
| |
1) Fix for assertion failure mentioned in my last checkin. Thanks to
Quincey for the fix.
2) Added tools/h5recover/trecover_verifier.c -- forgot to do this
in the last checkin.
Tested on Phoenix (serial), Linew (serial) and Jam (parallel).
| |
1) In H5F.c, modified several metadata cache related routines to
talk to the new cache, not the pre-journaling cache.
2) Enabled the API smoke check in cache2_journal.
3) Wrote an example of use of the journaling API and included
it as a test in cache2_journal.c. Also did some re-factoring of the
cache2 test code to move supporting macros and functions
out of cache2_api.c and into cache2_common.c & .h.
4) Modified tools/h5recover/trecover to include a new verify
option, which is intended to verify that a file has been
correctly recovered via h5recover.
The basic idea of this feature is to look at the data sets
in the archetype and recovered files, and verify that the metadata
of each data set in the recovered file (if it appears at all) agrees
with the archetype version where it must, and contains plausible
values where it is possible that changes were lost.
The test also looks at the raw data, and dumps the archetype and
recovered versions to stdout if anything looks "odd".
At present, my addition only works with the integer chunked
data set -- not with the other data set types that can be
created by trecover.
The code for the verify function is in the new file
trecover_verifier.c. Updated Makefile.am and the manifest
accordingly.
5) Modified the synchronous crash test in tools/h5recover
to function when return codes are not passed back to the
calling script, and to use the above modifications to trecover
to examine the recovered file, instead of comparing the output
of dumps of the archetype and recovered files.
6) Commented out the asynchronous crash test in tools/h5recover,
as the functionality of that test is now handled in the
"walking crash" test.
7) Modified the "walking crash" test to use the trecover modifications
to verify each recovery.
8) Modified the journaling file marking tests to function when
the return code is not passed back to the calling script.
9) Commented out the "tgroup-1.ls 1 -w80 -r -g tgroup.h5" in
tools/h5ls. I am given to understand that this test was
failing on redstorm due to yod's failure to pass back return
codes. I have not investigated this personally.
10) Updated bin/reconfigure to deal with recent changes in the file
system structure on jam.
Testing:
Tested (serial) on Phoenix, Linew, and RSQ -- all pass. Note that
on the "walking crash" test in tools/h5recover, I was unable to
set the asynchronous crash delay small enough to get the crash to
occur before trecover completed (I got down to 1 usec). This
was not a problem on redstorm the last time we tried testing
there, so I'm not too worried about it.
I also did a parallel test on jam -- this test failed with an assertion
failure in dtypes -- output follows:
============================
dtypes Test Log
============================
Testing non-aligned conversions (ALIGNMENT=1)....
Testing H5Tget_class() PASSED
Testing H5Tcopy() PASSED
Testing H5Tdetect_class() PASSED
Testing compound datatypes PASSED
Testing query functions of compound and enumeration types PASSED
Testing transient datatypes PASSED
Testing named datatypes PASSED
Testing functions of encoding and decoding datatypes PASSED
Testing encoding datatypes with the 'use the latest format' flag PASSED
Testing exceptions for int <-> float conversions PASSED
Testing deprected API routines for datatypes PASSED
Testing string conversions PASSED
Testing random string conversion speed PASSED
Testing some type functions for string PASSED
Testing compound element reordering PASSED
Testing compound subset conversions PASSED
Testing compound element shrinking & reordering PASSED
Testing optimized struct converter PASSED
Testing compound element growing PASSED
Testing compound element insertion PASSED
Testing packing compound datatypes PASSED
Testing compound datatype with VL string dtypes: H5FD.c:2150: H5FD_write: Assertion `1==H5P_isa_class(dxpl_id,(H5P_CLS_DATASET_XFER_g))' failed.
Command terminated by signal 6
0.20user 0.10system 0:00.39elapsed 75%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (4major+10263minor)pagefaults 0swaps
I'm checking in anyway, as this looks unrelated to any of my recent
changes. Quincey and I should get together about this one.
| |
Modified H5C2_journal_post_flush() to write the super block and flush
the file before truncating the journal. Failure to do this opened
a window in which the application could crash, leaving the HDF5 file
in a state that was un-recoverable.
The hope is that this will fix the file recovery bug observed on
RSQ -- but I have not been able to test there. However, I was able
to generate a similar bug on Linew, and this fix seems to deal with
the Linew bug.
Added a third test to the h5recover tests. This is really
a test for the library, but it was easier to use existing test
code there to construct the new test.
The new test runs the same application repeatedly, each time setting a
timer to crash the application at a progressively later time. The object
is to search for windows in which the application leaves the HDF5 file
in an un-recoverable state.
Also, updated H5recover.c to use HDstrtoll() instead of HDstrtod()
to read some addresses and such from the journal file.
Tested serial (debug) on Phoenix and Linew, and parallel (debug)
on Jam.
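The commit does not say why the integer parser was preferred. One general reason is that a double has a 53-bit significand, so large 64-bit addresses may not survive a round trip through a floating point parse. A minimal sketch of reading an address field with the standard strtoll() follows; the helper name parse_addr and its error convention are hypothetical, and HDstrtoll() is assumed here to wrap strtoll().

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse a non-negative address such as "0x401" or "1025".
 * Base 0 lets strtoll() accept decimal, octal and 0x-prefixed hex.
 * Returns 0 on success and -1 on any malformed or out-of-range input. */
int parse_addr(const char *text, long long *out)
{
    char     *end = NULL;
    long long value;

    if (text == NULL || out == NULL)
        return -1;

    errno = 0;
    value = strtoll(text, &end, 0);
    if (end == text || *end != '\0')  /* nothing parsed, or trailing junk */
        return -1;
    if (errno == ERANGE || value < 0) /* overflow, or negative address */
        return -1;

    *out = value;
    return 0;
}
```

For example, the value 9007199254740993 (2^53 + 1) is returned exactly by this helper, whereas it is not representable as a double.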
| |
Bring revisions 15289:15457 from trunk into metadata journaling
branch.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Review and get code to conform to standard library coding style, also
add 'const' keyword to "set" routines.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Remove remnant of initial H5Pset_journal() routine
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Add return value to metadata journaling status change callback routines,
so we can detect errors in them.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
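The idea of giving status change callbacks a return value, so that the notifier can detect a failure, can be illustrated with the sketch below. It is not the library's API: the table type, the function names, the fixed table size, and the "negative means failure" convention are assumptions chosen for the example.

```c
#include <stddef.h>

#define MAX_STATUS_CALLBACKS 8

/* Hypothetical callback type: returns 0 on success, negative on failure. */
typedef int (*jnl_status_cb_t)(int journaling_enabled, void *udata);

typedef struct {
    jnl_status_cb_t fn[MAX_STATUS_CALLBACKS];
    void           *udata[MAX_STATUS_CALLBACKS];
} jnl_cb_table_t;

/* Register a callback; returns its slot index, or -1 if the table is full. */
int jnl_cb_register(jnl_cb_table_t *t, jnl_status_cb_t fn, void *udata)
{
    int i;

    if (t == NULL || fn == NULL)
        return -1;
    for (i = 0; i < MAX_STATUS_CALLBACKS; i++) {
        if (t->fn[i] == NULL) {
            t->fn[i]    = fn;
            t->udata[i] = udata;
            return i;
        }
    }
    return -1;
}

/* Remove the callback in the given slot. */
int jnl_cb_deregister(jnl_cb_table_t *t, int idx)
{
    if (t == NULL || idx < 0 || idx >= MAX_STATUS_CALLBACKS ||
        t->fn[idx] == NULL)
        return -1;
    t->fn[idx]    = NULL;
    t->udata[idx] = NULL;
    return 0;
}

/* Call every registered callback; stop and report the first failure. */
int jnl_cb_notify(const jnl_cb_table_t *t, int journaling_enabled)
{
    int i;

    if (t == NULL)
        return -1;
    for (i = 0; i < MAX_STATUS_CALLBACKS; i++) {
        if (t->fn[i] != NULL &&
            t->fn[i](journaling_enabled, t->udata[i]) < 0)
            return -1;
    }
    return 0;
}

/* Example callbacks: one counts its invocations, one always fails. */
int counting_cb(int journaling_enabled, void *udata)
{
    (void)journaling_enabled;
    ++*(int *)udata;
    return 0;
}

int failing_cb(int journaling_enabled, void *udata)
{
    (void)journaling_enabled;
    (void)udata;
    return -1;
}
```

With a void callback type, the failure of failing_cb above would be invisible to the caller of jnl_cb_notify().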
| |
Use metadata journaling callback to allow dataset code to track journal
status changes and flush cached info appropriately.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Convert local heap cache client to use metadata journaling cache.
Other minor cleanups & simplifications, etc.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
journaling in progress to avoid calls to fork(). Did this by creating
cache2_jnl_file_marking (a program that sets up or checks the results
of the specified file marking test depending on parameters passed to
it) and test/testjnlfilemarking.sh (a shell script to call
cache2_jnl_file_marking and report results).
Also fixed an input validation bug in src/H5AC.c in passing.
Tested on Phoenix (serial -- debug and production mode)
Kagiso (parallel)
Linew (serial)
There was another checkin during these tests. As the changes looked
orthogonal to mine, I updated and retested on Phoenix (serial / debug)
only before this checkin.
| |
Description: Converted the global heap metadata cache clients over to
use the new journaling cache callbacks.
Separated cache clients into new H5HGcache.c file. Added
H5HGcache.c to MANIFEST, added into src/Makefile.am, and
ran bin/reconfigure to regenerate Makefile.in file.
Tested: kagiso, smirom, linew, duty
| |
revised cache.
Note that this conversion is not as efficient as it should be. Specifically,
it does more memcpy's between the metadata cache's on disk image of the
direct block and the fractal heap's on disk image of the direct block than
is strictly necessary. Eventually, we will want to fix this -- probably
by allowing the metadata cache and the fractal heap direct block to share
a common on disk image of the direct block. However, this will require
extensions to the client / metadata cache interface, and some reworking of
the fractal heap as well.
This checkin also includes Mike M's fix to the Linew specific bug mentioned
in my checkin of 22 Aug 2008.
Tested on Phoenix (serial debug and production),
Kagiso (parallel), and
Linew (serial)
| |
/bin/tar: Argument list too long.
After some digging, I found that the tar argument list generated from
MANIFEST was over 138940 bytes long. This was due both to the large number
of files (2153) in the manifest to be distributed and to the rather long
version name (hdf5-1.9.8-metadata_journaling_a1). I changed the version
name to hdf5-1.9.8-MDJ_a1, which is 16 bytes shorter, making the argument
list 34448 bytes shorter. That seems acceptable to kagiso, as the
tar command ran successfully.
So, I abbreviated the version name. This is a temporary fix. A real fix
would be to use consecutive tar commands, each with a reasonably short
list of arguments, to generate the final tar file.
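The saving quoted above is consistent with the file count: 2153 names times 16 bytes each is 34448 bytes. The "real fix" of splitting the list across several commands could start from a helper like the one below, which decides how many names fit into one command. This is only a sketch: the function name and the byte budget parameter are hypothetical, and the cost model (each name plus one terminating byte) is a simplification of how the system actually accounts for argument space.

```c
#include <stddef.h>
#include <string.h>

/* Return how many names, starting at index first, fit into a single
 * command whose argument bytes must not exceed budget. Each name is
 * charged its length plus one byte for the terminating NUL.
 * A return value of 0 with first < count means that the next name
 * alone exceeds the budget, which the caller must treat as an error. */
size_t next_batch(const char *const *names, size_t count,
                  size_t first, size_t budget)
{
    size_t used = 0;
    size_t n    = 0;
    size_t i;

    for (i = first; i < count; i++) {
        size_t need = strlen(names[i]) + 1;

        if (need > budget - used)  /* used never exceeds budget */
            break;
        used += need;
        n++;
    }
    return n;
}
```

A caller would loop, advancing first by the returned count, creating the archive with the first batch and appending each later batch to it.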
| |
prepare for
an Alpha1 release.
| |
1) Removed references to H5AC2__CURR_JNL_CONFIG_VER from H5Fget_jnl_config()
and H5Pget_jnl_config(), and also references to
H5AC__CURR_CACHE_CONFIG_VERSION from H5Fget_mdc_config() and
H5Pget_mdc_config().
2) Removed H5Pset_journal() from H5C2journal.c, and modified test
code to use H5F/Pget/set_mdj_config() instead.
3) Implemented support for callbacks on metadata journaling status change
along with the associated registration / deregistration calls and
associated test code.
4) Fixed bug in journaling shutdown exposed by 3 above.
Tested and passed on Phoenix (serial), Linew (serial), and Kagiso (parallel).
However, while I was testing there were a couple of checkins, forcing
an update and second round of testing.
On the second round, tested and passed on Phoenix (serial) and
Kagiso (parallel), but failed on Linew (serial).
As best I can tell, this was caused by Mike M's checkin -- which
broke the smoke checks in cache2_journal on Linew but not Phoenix
or Kagiso. A typical delta in the archetype files follows:
linew.hdfgroup.uiuc.edu% diff -ctw cache2_journal_sc00_000.jnl tmp/cache2_journal_sc00_000.jnl
*** cache2_journal_sc00_000.jnl Fri Aug 22 08:28:49 2008
--- tmp/cache2_journal_sc00_000.jnl Fri Aug 22 05:08:41 2008
***************
*** 1,5 ****
! 0 ver_num 1 target_file_name cache_journal_test.h5 creation_date Fri Aug 22 human_readable 1
! E eoa_value 0x0
C comment Begin transaction on transaction 1.0.
1 bgn_trans 1
2 trans_num 1 length 1 base_addr 0x401 body 01
--- 1,5 ----
! 0 ver_num 1 target_file_name cache_journal_test.h5 creation_date Wed Aug 20 human_readable 1
! E eoa_value 0x772a9c01
C comment Begin transaction on transaction 1.0.
1 bgn_trans 1
2 trans_num 1 length 1 base_addr 0x401 body 01
As you can see, it looks like garbage is getting into the first
eoa write on Linew.
I'm checking in anyway, as Quincey needs my changes, and I will not
have time to work on this for several days.
Mike: Let me know if you are tackling this one -- if not, I'll deal with it.
| |
- EOA logging update
Description:
- EOA values will now be written to the journal
file in their own transaction when the EOA
changes.
- The EOA will be updated in the HDF5 file's
superblock before the recovery process begins.
This should prevent some loss of raw data as the
file won't be getting truncated upon file open as
it will read the correct EOA value from the
superblock.
- Removed storing of EOA in journal entry messages
since they're in their own transaction.
- Updated tests to reflect change of transaction
formats. Regenerated smoke test files to account
for new entry types, and tweaked transaction number
tests to reflect change in size of journal entries.
- Large testfiles (in test/testfiles) should now
unzip when ./configure is run.
- When a journal file is supplied but contains no
complete transactions, instead of reporting
an error, h5recover now informs the user of said
nonexistent transactions, and opens/closes the
hdf5 file with the journal recovered flag set.
- Other various organizational changes to h5recover,
including a bit more verbose output.
Tested:
- kagiso, smirom
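The first point above, writing the EOA as its own journal entry only when it changes, might look roughly like the sketch below. The "E eoa_value 0x..." layout follows the journal excerpt shown elsewhere in this log; the function names, the buffer-based interface, and the return conventions are assumptions made for the example and are not the library's code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format an EOA entry into buf, but only if the EOA differs from the last
 * value logged. Returns the number of characters written, 0 if the EOA is
 * unchanged (nothing to log), or -1 if buf is too small. */
int eoa_entry_if_changed(char *buf, size_t size,
                         uint64_t *last_eoa, uint64_t eoa)
{
    int n;

    if (eoa == *last_eoa)
        return 0;

    n = snprintf(buf, size, "E eoa_value 0x%" PRIx64 "\n", eoa);
    if (n < 0 || (size_t)n >= size)
        return -1;

    *last_eoa = eoa;
    return n;
}

/* Recovery side: extract the EOA from an entry line.
 * Returns 0 on success, -1 if the line is not an EOA entry. */
int parse_eoa_entry(const char *line, uint64_t *eoa_out)
{
    uint64_t value;

    if (sscanf(line, "E eoa_value %" SCNx64, &value) != 1)
        return -1;
    *eoa_out = value;
    return 0;
}
```

Initializing the last logged value to something that cannot be a real EOA (UINT64_MAX in this sketch) ensures that the first EOA is always written.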
| |
Description: converted the fractal heap header and indirect block metadata
cache clients over to use the new journaling cache callbacks.
Tested: kagiso, smirom
| |
from the H5AC2_cache_config_t structure and the associated
H5P/H5Fget/set_mdc_config() API calls, and into the newly created
H5AC2_jnl_config_t structure and H5P/H5Fget/set_jnl_config() calls.
Updated test code accordingly.
Updated the trace file test code for journaling.
Also folded in a fix to an assertion bug in H5C2pkh.h
Tested serial on Phoenix and Linew, and parallel (with and without the
trace file enabled) on kagiso.
| |
Description: converted the shared object header message and index stored
as a list metadata cache clients over to use the new journaling
cache callbacks.
Tested: kagiso, smirom
| |
Convert object header cache client to use the new metadata journaling
cache, which included adding a new client for handling continuation chunks.
Added "real" protect calls around modifying chunks in object headers.
Switched a few more metadata cache library API routines to drop the
file pointer, when it is not needed (pinning/unpinning entries, etc.)
Fixed bug in journaling cache handling of 'image_len' callbacks and
also changed cache to retry deserializing entries when the entry's size is
larger than the speculative size initially tried.
Retrying for 'image_len' callbacks has problems with the 'multi' VFD,
so the h5dump and FORTRAN 'multi' tests are commented out, until the changes to
the 'multi' VFD from the file free space branch are brought back into the
trunk.
Currently, the 'h5recover' tool has a bug which requires it to be run
twice before replaying the journal "sticks". However, this is from an earlier
checkin, since the code in the branch already has this behavior... :-(
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.4 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
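The retry behaviour described in this commit (deserializing again when the entry turns out to be larger than the speculative size) can be sketched as follows. Everything here is hypothetical and chosen to keep the example self-contained: the in-memory file type, the toy format whose first byte holds the entry length (standing in for a client's 'image_len' callback), and the function names.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for the file being read. */
typedef struct {
    const unsigned char *data;
    size_t               size;
} mem_file_t;

/* Toy stand-in for an 'image_len' style callback: in this made-up
 * format the first byte of an entry holds its total length in bytes. */
static size_t toy_image_len(const unsigned char *image)
{
    return (size_t)image[0];
}

/* Load the entry at addr. Read spec_len bytes first, then read again if
 * the image reports a larger length. Returns a malloc'd image (caller
 * frees) or NULL on error; *reads_out counts the reads performed. */
unsigned char *load_entry(const mem_file_t *f, size_t addr, size_t spec_len,
                          size_t *len_out, int *reads_out)
{
    unsigned char *image = NULL;
    size_t         avail, len;
    int            reads = 0;

    if (f == NULL || len_out == NULL || reads_out == NULL ||
        addr >= f->size || spec_len == 0)
        return NULL;

    avail = f->size - addr;
    len   = spec_len < avail ? spec_len : avail;

    for (;;) {
        unsigned char *tmp = realloc(image, len);
        size_t         actual;

        if (tmp == NULL) {
            free(image);
            return NULL;
        }
        image = tmp;
        memcpy(image, f->data + addr, len);
        reads++;

        actual = toy_image_len(image);
        if (actual == 0 || actual > avail) {  /* corrupt or truncated */
            free(image);
            return NULL;
        }
        if (actual <= len) {                  /* the read covered it */
            *len_out   = actual;
            *reads_out = reads;
            return image;
        }
        len = actual;                         /* retry with real size */
    }
}
```

The loop runs at most twice in this toy format, because the second read uses the length the image itself reported.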
| |
journaling,
along with test code that uncovered the problem and associated additions &
changes to the gzipped test files in test/testfiles
Tested serial on Phoenix and parallel on kagiso
| |
identify
this is a feature branch.
H5committested.
| |
specifically:
1) Fix for failure to detect journaling in progress on HDF5 files which
were not closed correctly. Also associated test code.
Note that this required addition of code to test for journaling in
progress and enable journaling at the end of H5F_flush(). In passing,
I was able to get rid of the wacky code that queued journaling setup
at cache creation time.
2) Test code for startup and shutdown of journaling on an open file.
3) Updates to start checking journal output against archetype files
instead of just generating archetypes at test time.
Note that per Quincey's request, I have checked in gzipped versions
of the archetype files. At some point, we will have to add code
to automatically unzip these files, but for the time being you will
have to go to test/testfiles and "gunzip *.gz". The journal tests
will still pass if you don't, but you will get a warning about
missing test files.
4) Fixed bug in journal entry logging code that allowed a comment to
appear in the journal file before the journal file header. (Mike M.:
Please review my fix to verify that I haven't clobbered anything.)
5) Additional test code.
Note that more test code would be a good idea, but this set of bug
fixes should be enough to get us through the basic demo -- at least
as far as the metadata cache is concerned.
Tested serial on Phoenix, and parallel on Kagiso. Also, tested serial
on Linew just prior to some last minute minor edits.
| |
Disable dataset caching when journaling is enabled.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Bring revisions 15210:15289 from the trunk into the metadata journaling branch.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Merge revisions 15130:15210 from trunk into metadata journaling branch
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Description: converted the v2 b-tree header, internal node, and leaf node
metadata cache clients over to use the new journaling cache
callbacks.
Tested: kagiso, smirom
| |
Merge revisions 15037:15130 from trunk into metadata journaling branch
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Description: Removing a broken assertion. In my last checkin I moved the
location where bodydata (in H5C2_jb__journal_entry) gets
initialized, and forgot to update the assertion appropriately.
Tested: kagiso
| |
section info
metadata cache clients to use the new journaling cache
callbacks.
Tested on: kagiso, smirom
| |
Description: Journal entry data was not getting copied correctly before being
converted into hex, and thus journal entries were getting
corrupted. H5C2_jb__journal_entry now uses HDmemcpy, which should
fix the issue.
Tested: kagiso
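The approach described here, taking a private copy of the entry body and then converting the copy to hex, can be illustrated with the sketch below. It is not the code of H5C2_jb__journal_entry: the function name is hypothetical, plain memcpy() stands in for HDmemcpy, and the output layout (two lowercase hex digits per byte, no separators) is an assumption about the journal format.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy len bytes from body into a private buffer, then hex-encode the
 * copy. Returns a NUL-terminated malloc'd string (caller frees), or
 * NULL on bad arguments or allocation failure. */
char *hex_encode_copy(const void *body, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    unsigned char    *copy;
    char             *hex;
    size_t            i;

    if (body == NULL || len == 0 || len > (SIZE_MAX - 1) / 2)
        return NULL;

    copy = malloc(len);
    hex  = malloc(2 * len + 1);
    if (copy == NULL || hex == NULL) {
        free(copy);
        free(hex);
        return NULL;
    }

    /* Work only on the private copy from here on, so later changes to
     * the caller's buffer cannot affect the encoded output. */
    memcpy(copy, body, len);

    for (i = 0; i < len; i++) {
        hex[2 * i]     = digits[copy[i] >> 4];
        hex[2 * i + 1] = digits[copy[i] & 0x0f];
    }
    hex[2 * len] = '\0';

    free(copy);
    return hex;
}
```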
| |
Merge revisions 14900:15037 from trunk into metadata journaling branch
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Bring revisions 14800:14900 from trunk into metadata journaling branch
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Bring back revisions 14700:14800 from the trunk
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Merge revisions 14525:14700 from trunk into metadata journaling branch
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
Bring changes from trunk from the time the branch was created (r14280)
up to the 1.8.0 release (r14525) back into the metadata journaling branch.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
| |
value into
journal entries to be used by the recovery tool.
This value is only really needed once per transaction, and only when
the EOA changes, so rather than putting it into each journal entry,
this should be moved into its own transaction type. However, in order
to speed testing along, this quick fix has been implemented for the
time being.
Modified the h5recover tool to use the EOA value, and updated the journaling
tests accordingly.
Tested: kagiso
| |
H5Ppublic.h.
Added H5AC2public.h to hdf5.h.
Tested: kagiso.
| |
Added a pointer to the cache that an entry is contained within to the
cache entry structure. This allows us to remove the file pointer from some of
the H5AC2 calls, easing the conversion of some of the cache clients (the free
space section info and fractal heap direct blocks, and probably others).
Removed file pointer from the H5AC2_unpin_entry() call.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
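A minimal sketch of the back-pointer idea described in this checkin, assuming simplified stand-in types. The names demo_cache_t, demo_entry_t, cache_ptr, and the demo_* functions are hypothetical and are not the actual H5AC2/H5C2 structures or signatures. It only illustrates why storing the owning cache in each entry lets an unpin call drop its file (or cache) argument.

```c
#include <assert.h>
#include <stddef.h>

struct demo_cache;

/* Hypothetical cache entry: carries a pointer back to its owning cache. */
typedef struct demo_entry {
    struct demo_cache *cache_ptr; /* cache that contains this entry */
    int                is_pinned; /* nonzero while the entry is pinned */
} demo_entry_t;

/* Hypothetical cache: tracks only how many entries are pinned. */
typedef struct demo_cache {
    size_t pinned_count;
} demo_cache_t;

/* Record the owning cache in the entry when it is inserted. */
void demo_insert_entry(demo_cache_t *cache, demo_entry_t *entry)
{
    assert(cache != NULL);
    assert(entry != NULL);
    entry->cache_ptr = cache;
    entry->is_pinned = 0;
}

/* Pin an entry; the cache is found through the entry itself. */
int demo_pin_entry(demo_entry_t *entry)
{
    if (entry == NULL || entry->cache_ptr == NULL || entry->is_pinned)
        return -1;
    entry->is_pinned = 1;
    entry->cache_ptr->pinned_count++;
    return 0;
}

/* Unpin an entry without being handed a file or cache pointer. */
int demo_unpin_entry(demo_entry_t *entry)
{
    if (entry == NULL || entry->cache_ptr == NULL || !entry->is_pinned)
        return -1;
    entry->is_pinned = 0;
    entry->cache_ptr->pinned_count--;
    return 0;
}
```

In this sketch a client that holds only the entry can unpin it, which is the property the checkin relies on when converting cache clients.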
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Description: Changed H5C2_jb__journal_entry function to make a copy of the
incoming journal entry before doing anything with it. I was seeing
errors in the journals produced by using the pointer passed to me,
so copying the data beforehand looks to solve the problem.
Also made a quick change to h5recover.c to use generated fapl
when opening the recovered HDF5 file. (was previously using
H5P_DEFAULT).
Tested: kagiso
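A rough illustration of the copy-before-use fix described in this checkin, under stated assumptions. The type demo_pending_t and the functions demo_queue_entry() and demo_release_entry() are hypothetical and do not reflect the real H5C2_jb__journal_entry() signature or the journal buffer layout. The point is only that taking a private copy protects the journal from later changes to the caller's buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one journal entry awaiting output. */
typedef struct demo_pending {
    char *text; /* privately owned copy of the entry text */
} demo_pending_t;

/* Store a private copy of `entry` rather than the caller's pointer.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
int demo_queue_entry(demo_pending_t *pending, const char *entry)
{
    size_t nbytes;

    if (pending == NULL || entry == NULL)
        return -1;

    nbytes = strlen(entry) + 1; /* include the terminating NUL */
    pending->text = malloc(nbytes);
    if (pending->text == NULL)
        return -1;

    memcpy(pending->text, entry, nbytes);
    return 0;
}

/* Release the private copy once the entry has been written out. */
void demo_release_entry(demo_pending_t *pending)
{
    if (pending != NULL) {
        free(pending->text);
        pending->text = NULL;
    }
}
```

If the caller reuses or overwrites its buffer after the call, the queued text is unaffected, at the cost of one allocation and copy per entry.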
|
|
|
|
| |
Tested: kagiso.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the symbol table node metadata cache client to use the new
journaling cache callbacks.
Also added a 'H5F_t *' parameter to the 'serialize' callback for the
journaling cache, which makes the client's job much easier.
Various minor coding cleanups, etc. also.
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.3 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
|
|
|
|
|
|
|
|
| |
Rename H5Pset_fapl_journal as H5Pset_journal.
Use the public constant of H5AC2__CURR_CACHE_CONFIG_VERSION.
Tested: h5committested. (tools/h5recover/testh5recover.sh had failures, but
that was because of a failure in ../h5diff/h5diff.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Updates to test opening of a file created with journaling,
along with associated debugging modifications.
(Mike M.: To get journal deletion to work correctly, I
had to modify H5C2_jb__init() to allocate a buffer for the
journal file name and copy it into the buffer. Similarly,
I had to modify H5C2_jb__takedown() to free the buffer.
The fix was hurried, and should be reviewed. Also, a
similar fix is probably in order for the HDF5 file name.)
* Fix for the bug Albert reported on Linew.
* An attempt to apply the changes Quincey requested to the
loc_id parameters to the FUNC_ENTER_API_META macro calls
in:
H5Gmove2(), (src_loc_id --> dst_loc_id)
H5Lcopy(), (src_loc_id --> dst_loc_id)
H5Lmove(), (src_loc_id --> dst_loc_id)
H5Glink2(), (cur_loc_id --> new_loc_id)
H5Lmove() (cur_loc_id --> new_loc_id)
However, with the exception of the requested change to
H5Gmove2(), all these changes caused us to fail the
regression tests. Thus only the H5Gmove2() change is
made.
Several caveats and warnings:
* If you build and test this checkin, it will fail on the
test for trecover.
This showed up after I updated my project, so initially
I thought I had broken something. However, after examining
the problem for a while, I thought to checkout the version
prior to this checkin, and test to see if the problem appeared.
It did (under serial on phoenix, and parallel on Kagiso),
so I am going ahead with this checkin regardless under the
assumption that it is orthogonal to my changes.
* Low level testing for the journaling feature of the metadata
cache is not complete. The coverage of the existing tests
is good enough that I don't expect anything major, but don't
be surprised if you run into problems around the edges.
In particular, enabling and disabling journaling while the
file is open has not been tested at all. Suggest we stay
away from this until it gets at least a once over.
* The metadata journaling smoke check tests in cache2_journal.c
are still configured to generate the archetype files used to
check journal output. This can be turned off any time, but
given Quincey's constraints on test file size, I have to write
code to skip the tests if the archetype files are missing,
and then put compressed versions of the archetype files in
svn before I do so. Unfortunately, there is no time before
I leave.
* I left a good bit of debugging code in both the journaling
code proper, and the associated test code. It should all
be #if 0'ed out at present, but if you run into it, you
know what is going on. Needless to say, I'll delete it when
I finish testing.
* I was not able to reproduce the bug Albert observed on RedStorm
locally, so I don't have a fix for it. That said, I touched
some things that could have caused it, so it is possible that
I fixed it by accident.
Testing:
Before I updated, I was able to build and test serial on Phoenix
and Linew, and parallel on Kagiso without errors in the regression
tests.
As discussed above, after the update, I failed in the test for
trecover in a serial build and test on Phoenix, and parallel build
and test on Kagiso. Linew is slow, so I didn't attempt a test there.
Since the same failures appear in the version prior to this checkin,
I am going ahead with the checkin regardless on the assumption that
the problem is orthogonal to my changes.
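The journal file name fix mentioned in this checkin (allocate and copy in init, free in takedown) can be sketched roughly as follows. This assumes a simplified stand-in structure: demo_jb_t, demo_jb_init(), and demo_jb_takedown() are hypothetical names and are not the real H5C2_jb__init() / H5C2_jb__takedown() signatures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical journal buffer state; only the file name is modeled. */
typedef struct demo_jb {
    char *journal_file_name; /* owned copy, valid from init to takedown */
} demo_jb_t;

/* Allocate a buffer for the journal file name and copy the name into it,
 * so the name stays valid even if the caller's string goes away.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
int demo_jb_init(demo_jb_t *jb, const char *journal_file_name)
{
    size_t nbytes;

    if (jb == NULL || journal_file_name == NULL)
        return -1;

    nbytes = strlen(journal_file_name) + 1; /* include the NUL */
    jb->journal_file_name = malloc(nbytes);
    if (jb->journal_file_name == NULL)
        return -1;

    memcpy(jb->journal_file_name, journal_file_name, nbytes);
    return 0;
}

/* Free the name buffer allocated in demo_jb_init(). */
int demo_jb_takedown(demo_jb_t *jb)
{
    if (jb == NULL)
        return -1;

    free(jb->journal_file_name);
    jb->journal_file_name = NULL;
    return 0;
}
```

Because the structure owns its copy, code that deletes the journal file later can still read the name after the caller's original string has been released or overwritten.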
|