| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Correct error reported by user (Martin Otte) where we weren't using
realloc'ed buffer in MPI datatype code.
Tested on:
Mac OSX/64 10.7.3 (amazon) w/parallel
|
|
|
|
|
|
|
|
| |
Refactor function name macros and simplify the FUNC_ENTER macros, to clear
away the cruft and prepare for further cleanups.
Tested on:
Mac OSX/64 10.7.3 (amazon) w/debug, production & parallel
|
|
|
|
|
|
|
| |
Clean up compiler warnings and neaten up code a bit.
Tested on:
Linxu/64 2.6.32 (ember) w/parallel
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bring "shape same" changes from LBL branch to trunk. These changes
allow shapes that are the same, but projected into dataspaces with different
ranks to be detected correctly, and also contains code to project a dataspace
into greater/lesser number of dimensions, so the I/O can proceed in a faster
way.
These changes also contain several bug fixes and _lots_ of code
cleanups to the MPI datatype creation code.
Many other misc. code cleanup are included as well...
Tested on:
FreeBSD/32 6.3 (duty) in debug mode
FreeBSD/64 6.3 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (jam) w/PGI compilers, w/default API=1.8.x,
w/C++ & FORTRAN, w/threadsafe, in debug mode
Linux/64-amd64 2.6 (amani) w/Intel compilers, w/default API=1.6.x,
w/C++ & FORTRAN, in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Linux/64-amd64 2.6 (abe) w/parallel, w/FORTRAN, in debug mode
Mac OS X/32 10.6.3 (amazon) in debug mode
Mac OS X/32 10.6.3 (amazon) w/C++ & FORTRAN, w/threadsafe,
in production mode
|
|
|
|
|
|
|
|
| |
Minor whitespace and compiler warning cleanups
Tested on:
Mac OS X/32 10.6.3 (amazon) w/debug
(too minor to require h5committest)
|
|
|
|
|
|
|
|
|
|
| |
Remove another call to H5E_clear_stack() from within the library.
Clean up lots of compiler warnings.
Tested on:
Mac OS X/32 10.5.6 (amazon)
(followup on other platforms forthcoming)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove trailing whitespace from C/C++ source files, with the following
script:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
Tested on:
Mac OS X/32 10.5.5 (amazon)
No need for h5committest, just whitespace changes...
|
| |
|
|
|
|
| |
New fortran wrappers added.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Omnibus raw data I/O revisions, with wide-ranging changes and
refactoring, in order to prepare for implementing "fast append" feature.
These changes remove the majority of the code duplication for raw data
I/O which has crept in over the last ten years and introduces a more object-
oriented design for operating on different types of dataset storage.
Chunked storage no longer has it's own I/O routines, it is now handled
as either contiguous (if chunk is not pulled into the cache) or compact (if the
chunk is cached in memory).
No bug or feature changes, at least intentionally... :-)
Tested on:
FreeBSD/32 6.2 (duty) in debug mode
FreeBSD/64 6.2 (liberty) w/C++ & FORTRAN, in debug mode
Linux/32 2.6 (kagiso) w/PGI compilers, w/C++ & FORTRAN, w/threadsafe,
in debug mode
Linux/64-amd64 2.6 (smirom) w/default API=1.6.x, w/C++ & FORTRAN,
in production mode
Linux/64-ia64 2.6 (cobalt) w/Intel compilers, w/C++ & FORTRAN,
in production mode
Solaris/32 2.10 (linew) w/deprecated symbols disabled, w/C++ & FORTRAN,
w/szip filter, in production mode
Mac OS X/32 10.5.2 (amazon) in debug mode
Linux/64-ia64 2.4 (tg-login3) w/parallel, w/FORTRAN, in production mode
|
|
|
|
|
|
| |
using H5DEBUG.
Have tested at linux to make sure the change won't cause any compiling errors or testing errors.
|
|
|
|
|
|
|
|
|
| |
copyright notice.
Tested platform:
Kagiso only since it is only a comment block change. If it works in one
machine, it should work in all, I hope. Still need to check the parallel
build on copper.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Remove remnents of references to unimplemented H5S_COMPLEX dataspace.
Platforms tested:
FreeBSD 4.11 (sleipnir)
Linux 2.4 32-bit (heping)
Linux 2.4 64-bit (mir)
Solaris 2.9 (shanti)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Purpose:
Add some comments and error checking codes for collective chunk IO code.
Description:
Solution:
Platforms tested:
AIX 5.1(copper), too minor to test at other platforms.
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Purpose:
bug fix for collective chunk IO
Description:
When using calloc to allocate an MPI datatype, one should use sizeof(MPI datatype) instead of sizeof(equalivent non-MPI datatype) to assign the value, this causes the problem at 64-bit platforms such IRIX65 and AIX 5.1.
Solution:
Correct it.
Platforms tested:
AIX 5.1 and IRIX6.5(at NCAR).
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Trim trailing whitespace, which is making 'diff'ing the two branches
difficult.
Solution:
Ran this script in each directory:
foreach f (*.[ch] *.cpp)
sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
end
Platforms tested:
FreeBSD 4.11 (sleipnir)
Too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Refactor, simplify and cleanup Kent's recent checking for collective
chunk I/O. There's a bug that I need to talk to Kent about and some more
cleanups still, but this is reasonable for an interim point.
Platforms tested:
FreeBSD 4.11 (sleipnir) w/parallel
Too minor for h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bug fix for collective chunk IO, phase 1
Optimization hasn't been done yet, the collective chunk IO bug should be fixed.
Description:
In chunking storage, memory space and file space will be remapped, So to check
whether file space and memory space are regular in order to use optimized MPI derived
datatype for collective call one has to check per-chunk wise instead of per hyperslab wise.
Even a regular memory space will be stored in span-tree and will be irregular before chunk IO.
Solution:
1. Check file space and memory space per chunk wise instead of per hyperslab wise.
2. For collective IO mode, number of chunks covered by hyperslab may be different. Since we are
handing per chunk per IO, for the extra chunk IO for some(not all) processors, collective mode will
cause program hanged. So for the extra chunk Io mode independent IO has to be used.
3. On some platforms, Complex MPI derived datatype is not working, so we have to use independent IO for collective IO mode if the selection is irregular. However, when the selection is regular, we do want to use collective IO since that will improve performance. Special cares have to be added for this case.
Platforms tested:
copper(AIX 5.1) Linux(heping mpich 1.2.6), Teragrid machine, Cobalt(altix), modi4
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Activating collective IO supports for irregular selction inside HDF5 dataset.
This support doesn't include to build the final complicated MPI derived datatype support for chunking storage.
Description:
Support collective chunk IO for both contiguous and chunking storage for irregular selection( using H5S_SELECT_OR for multiple hyperslab selection)
Solution:
Using MPI derived datatype to realize this feature.
Problems still need to be investigated:
Big size irregular hyperslab selection test may cause MPI hang or abnormalexit with MPICH family on various platforms. This is really hard to debug since sometimes it can work and sometimes it cannot work. We will continue investigating those cases. This may not be parallel HDF5 bugs since with the recent version of poe at IBM platforms, all tests passed.
Platforms tested:
1. Linux heping 2.4.21-27.0.1 with mpich
2. AIX 5.1 copper with mpcc_r
3. Altix cobalt SGI linux 2.4.21-sgi304rp05031014_10149 with icc -lmpi
4. Linux Cluster SDSC TG, intel 8-r2 with mpich 1.2.5
5. NCSA Linux Cluster Tungsten, MPICH-TCP-1.2.5.2, Intel 8.0 under lustre
6. NCSA Linux Cluster Tungsten, MPICH-LAM-INTEL711, sometimes not working
7. NCSA Linux CLuster Tungsten, champion-pro-1.2.0-3, not working for other collective IO tests, but work for irregular selection collective IO test.
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Description: Removed PABLO from the source
Solution:
Platforms tested: arabica with 64-bit, copper with parallel,
heping with GNU C and C++ and PGI fortran (but
I disabled hl, there is some weird problem only
on heping: F9XMODFLAG is not
propagated to the Makefile files
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug Fix/Code Cleanup/Doc Cleanup/Optimization/Branch Sync :-)
Description:
Generally speaking, this is the "signed->unsigned" change to selections.
However, in the process of merging code back, things got stickier and stickier
until I ended up doing a big "sync the two branches up" operation. So... I
brought back all the "infrastructure" fixes from the development branch to the
release branch (which I think were actually making some improvement in
performance) as well as fixed several bugs which had been fixed in one branch,
but not the other.
I've also tagged the repository before making this checkin with the label
"before_signed_unsigned_changes".
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel & fphdf5
FreeBSD 4.10 (sleipnir) w/threadsafe
FreeBSD 4.10 (sleipnir) w/backward compatibility
Solaris 2.7 (arabica) w/"purify options"
Solaris 2.8 (sol) w/FORTRAN & C++
AIX 5.x (copper) w/parallel & FORTRAN
IRIX64 6.5 (modi4) w/FORTRAN
Linux 2.4 (heping) w/FORTRAN & C++
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check in some new fixes for MPI derived datatype routines
Description:
MPI derived datatype algorithm seems working for a simple case;
however, there are still other problems need to be solved. So the
code cannot be used for the time being. Check-in only for debugging.
It won't affect other part of the library.
Solution:
Platforms tested:
Linux 2.4 (heping, serial and parallel)
(Since no new tests were added and changes are mostly restricted to
one fuction, no need to test three platforms).
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding codes for the general MPI derived datatype in order to
better incorporate new fixes of HDF5 library.
Description:
Note: These codes have not been tested for general use.
Don't call these functions in your developments of the HDF5 library.
Also these codes are stand-alone codes, they should not affect
other library codes.
Solution:
Platforms tested:
Heping(C and Parallel linux 2.4, mpich 1.2.6)
Arabica(C,C++,Fortran, Solaris 2.7)
Copper(C,c++,Fortran, AIX 5.1, NOTE: c++ FAILED, seems not due to the recent check-in)
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug fix & code cleanup
Description:
More dataset cleanups to get to a point where we can fix the chunked I/O
bug.
Also fix a couple of errors in the recent file object resurrection changes
which should hopefully address the recent daily test failres (H5T.c)
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Solaris 2.7 (arabica)
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug fix/code cleanup
Description:
Clean up raw data I/O code to bundle the I/O parameters (dataset, DXPL ID,
etc) into a single struct to pass around through the dataset I/O routines,
since they are always passed together, until very near the bottom of the I/O
stack.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Solaris 2.7 (arabica)
IRIX64 6.5 (modi4)
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug fix (Failures when dataset size >= 1 GB, reported by Bill Loewe.)
Description:
In the IBM AIX system using 32bit mode, if a dataset size was 1GB or
larger, when the "end" of the dataset was selected, MPI would complain
it could not keep the Upper bound of a datatype within the range of
MPI_Aint. This was because the old algorithm would derive the selection
with extent of each row first. After all dimensions were processed,
it then calculate the start position and just displace the whole
MPI derived type. So, the final MPI type was actually the start
position plus the whole dataset. Since the start can be as big as
the whole dataset, this made the final derived twice as big as 1GB.
That would hit the 2GB MPI_Aint range limit in the 32 bit mode.
Solution:
Use a different algorithm to include the start position in the
defining of MPI type for each dimension. When all dimensions
are processed, the MPI type represents the selection exactly.
Platforms tested:
h5committested
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Fix another batch of minor differences between the development and release
branches.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Clean up collective chunking code a bit.
Also, add '--enable-instrument' configure flag to have a mechanism for
determining that optimized operations happened correctly in the library (instead
of just the "normal" way) by allowing 'flag' properties to be set outside the
library and set when the "right" thing happens. This is mainly for debugging
and regression checks, so we make certain we don't break optimized I/O by
accident. It's enabled by default when --enable-debug is on (which is on by
default in the development branch and off by default in the release branch),
but can also be independently controlled with its own configure flag.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/parallel
IBM p690 (copper) w/parallel
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding the first round of patches about supporting collective chunk IO in HDF5
Description:
The current HDF5 library doesn't support collective MPIO with chunk storage. When users set collective option in their data transfer with chunk storage, the library silently converted the option to INDEPENDENT and that caused trememdous performance penalty. Some application like
WRF-parallel HDF5 IO module has to use contiguous storage for this reason. However, chunking storage has its own advantage(supporting compression filters and extensible dataset), so to make collective MPIO possible inside HDF5 with chunking storage is a very important task.
This check-in make collective chunk IO possible for some special cases. The condition is as follows(either case is fine with using collective chunk IO)
1. for each process, the hyperslab selection of the file data space of each dataset is regular and it is fit in one chunk.
2. for each process, the hyperslab selection of the file data space of each dataset is single and the number of chunks for the hyperslab selection should be equal.
Solution:
Lift up the contiguous storage requirement for collective IO.
Use H5D_isstore_get_addr to get the corresponding chunk address. Then the original library routines will take care of getting the correct address to make sure that MPI FILE TYPE is built correctly for collective IO>
Platforms tested:
arabica(sol), copper(AIX), eirene(Linux)
parallel test is checked at copper.
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup & minor optimization
Description:
Re-work the way interface initialization routines are specified in the
library to avoid the overhead of checking for them in routines where there is
no interface initialization routine. This cleans up warnings with gcc 3.4,
reduces the library binary size a bit (about 2-3%) and should speedup the
library's execution slightly.
Platforms tested:
FreeBSD 4.10 (sleipnir) w/gcc34
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup & optimization
Description:
Remove old structures that used a union to store information about the
dataspace extent and just store the information directly in the dataspace
extent itself.
Remove ifdef'd references to permutation ordering in dataspaces. We'll
definitely need more than this code if/when we implement this feature.
Change allocation of dataspace information from calloc() to malloc().
Platforms tested:
Solaris 2.7 (arabica)
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code optimization
Description:
Revised dataspace selections to use a more "object oriented" mechanism
to set the function pointers for each selection and selection iterator. This
reduces the amount and number of times that dataspace selection info has to
be copied.
Additionally, change hyperslab selection information to be dynamically
allocated instead of an inline struct.
Platforms tested:
Solaris 2.7 (arabica)
FreeBSD 4.10 (sleipnir) w/parallel
Too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactor code
Description:
Move chunk and contiguous cached raw data from file information to dataset
information. This simplifies a number of internal interfaces, aligns the
code with it's purpose better and should allow more optimizations to the
chunked data I/O performance.
Platforms tested:
Solaris 2.7 (arabica)
FreeBSD 4.10 (sleipnir)
h5committest
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code optimization & bug fix
Description:
When dimension information is being stored in the storage layout message
on disk, it is stored as 32-bit quantities, possibly truncating the dimension
information, if a dimension is greater than 32-bits in size.
Solution:
Fix the storage layout message problem by revising file format to not store
dimension information, since it is already available in the dataspace.
Also revise the storage layout data structures to be more compartmentalized
for the information for contiguous, chunked and compact storage.
Platforms tested:
FreeBSD 4.9 (sleipnir) w/parallel
Solaris 2.7 (arabica)
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code optimization
Description:
Expand the use of macros to inline trivial function pointer lookup and
calls to reduce the overall number of functions invoked during normal operation
of the library.
Platforms tested:
Solaris 2.7 (arabica)
FreeBSD 4.9 (sleipnir) w/parallel
Too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code optimization
Description:
Reduce the number of times the number of elements in a selection is
computed.
Platforms tested:
Solaris 2.7 (arabica)
FreeBSD 4.9 (sleipnir) w/parallel
too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Clean up lots of warnings based on those reported from the SGI compilers
as well as gcc.
Platforms tested:
SGI O3900, IRIX64 6.5 (Cheryl's SGI machine)
FreeBSD 4.9 (sleipnir) w/ & w/o parallel
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Update null dataspace changes to try to write older version of dataspace
information whenever possible.
Refactor common code to only one location.
Allow I/O operations to succeed on null dataspaces.
Platforms tested:
FreeBSD 4.9 (sleipnir)
h5committest
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup/optimization
Description:
Query property list values once, at the beginning of the I/O routines,
instead of querying the property list values multiple (lots!) of times in
lower level routines.
Solution:
Create "property list caches" for internal library queries of the property
list values.
Platforms tested:
IBM p690 (copper) w/parallel & fphdf5
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug fix/optimization
Description:
Address slowdown in MPI-I/O file metadata operations that was introduced
mid-stream. We now _require_ a POSIX compliant parallel file system for the
MPI-I/O file driver (as well as for the MPI-POSIX file driver).
Also optimized file open operation when the file is being created by
reducing the number of collective & syncronizing calls.
Additionally, refactor the MPI routines into a common place, eliminating
duplicated code.
Platforms tested:
FreeBSD 4.9 (sleipnir) w/parallel
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Clean up compiler warnings, especially the 'FUNC' variable not used which
comes out in production mode.
Solution:
Had to add a new FUNC_ENTER_NOAPI_NOINIT_NOFUNC macro for those non-API
functions which don't need the 'FUNC' variable defined. (This will be _so_
much easier when C99 is standard on all our supposed platforms, since it has a
__FUNC__ macro... )
Platforms tested:
FreeBSD 4.9 (sleipnir)
too minor for h5committest (although there were lots of files changed, the
change was minor in each one)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup & simplification
Description:
Replace [non-working] code in routine to build up an MPI derived type for
a hyperslab selection that was supposed to "flatten" out contigous blocks with
code that uses a selection iterator (which _does_ have working "flattening"
code).
Remove "contiguous hyperslab" code for MPI selections as it is unnessary
now.
Platforms tested:
FreeBSD 4.9 (sleipnir) w & w/o parallel
Linux 2.4 (verbena) w/FORTRAN & C++
Solaris 2.7 (arabica) w/64-bit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug fix & code cleanups
Description:
Change our use of MPI derived datatypes to not create datatypes with
"0-sized" lengths, which causes the LANL Q machine to hang.
Also, get rid of "prefer MPI derived datatypes" environment variable since
it has no advantage.
Platforms tested:
FreeBSD 4.9 (sleipnir) w & w/o parallel
h5committest
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
De-linted more code
Platforms tested:
FreeBSD 4.8 (sleipnir) w/parallel
too minor to require h5committest
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactoring code
Description:
Refactored "IS_H5FD_MPIO || IS_H5FD_MPIPOSIX || IS_H5FD_FPHDF5" combination
of macros in many places into single IS_H5FD_MPI macro, which has the same
definition, but should be easier to maintain.
Platforms tested:
FreeBSD 4.8 (sleipnir)
h5committest
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup
Description:
Clean up warnings exposed by compiling on O2K. Also, revert some of Bill
and my changes to the H5S_mpi_opt_types_g, etc. and settle them back into their
original location.
Platforms tested:
h5committested.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update
Description:
Updated copyright statement in files which hadn't been updated yet.
Platforms tested:
Linux (Only comment change)
Misc. update:
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code cleanup/new feature.
Description:
Split FUNC_LEAVE into API and non-API specific versions. This allows a
solution to compiling this branch with C++, as well as reducing the size
of the binaries produced.
Platforms tested:
FreeBSD 4.7 (sleipnir) w/serial, parallel (including MPE) & thread-safe
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lots of performance improvements & a couple new internal API interfaces.
Description:
Performance Improvements:
- Cached file offset & length sizes in shared file struct, to avoid
constantly looking them up in the FCPL.
- Generic property improvements:
- Added "revision" number to generic property classes to speed
up comparisons.
- Changed method of storing properties from using a hash-table
to the TBBT routines in the library.
- Share the propery names between classes and the lists derived
from them.
- Removed redundant 'def_value' buffer from each property.
- Switching code to use a "copy on write" strategy for
properties in each list, where the properties in each list
are shared with the properties in the class, until a
property's value is changed in a list.
- Fixed error in layout code which was allocating too many buffers.
- Redefined public macros of the form (H5open()/H5check, <variable>)
internally to only be (<variable>), avoiding innumerable useless
calls to H5open() and H5check_version().
- Reuse already zeroed buffers in H5F_contig_fill instead of
constantly re-zeroing them.
- Don't write fill values if writing entire dataset.
- Use gettimeofday() system call instead of time() system when
checking the modification time of a dataset.
- Added reference counted string API and use it for tracking the
names of objects opening in a file (for the ID->name code).
- Removed redundant H5P_get() calls in B-tree routines.
- Redefine H5T datatype macros internally to the library, to avoid
calling H5check redundantly.
- Keep dataspace information for dataset locally instead of reading
from disk each time. Added new module to track open objects
in a file, to allow this (which will be useful eventually for
some FPH5 metadata caching issues).
- Remove H5AC_find macro which was inlining metadata cache lookups,
and call function instead.
- Remove redundant memset() calls from H5G_namei() routine.
- Remove redundant checking of object type when locating objects
in metadata cache and rely on the address only.
- Create default dataset object to use when default dataset creation
property list is used to create datasets, bypassing querying
for all the property list values.
- Use default I/O vector size when performing raw data with the
default dataset transfer property list, instead of querying for
I/O vector size.
- Remove H5P_DEFAULT internally to the library, replacing it with
more specific default property list based on the type of
property list needed.
- Remove redundant memset() calls in object header message (H5O*)
routines.
- Remove redunant memset() calls in data I/O routines.
- Split free-list allocation routines into malloc() and calloc()-
like routines, instead of one combined routine.
- Remove lots of indirection in H5O*() routines.
- Simplify metadata cache entry comparison routine (used when
flushing entire cache out).
- Only enable metadata cache statistics when H5AC_DEBUG is turned
on, instead of always tracking them.
- Simplify address comparison macro (H5F_addr_eq).
- Remove redundant metadata cache entry protections during dataset
creation by protecting the object header once and making all
the modifications necessary for the dataset creation before
unprotecting it.
- Reduce # of "number of element in extent" computations performed
by computing and storing the value during dataspace creation.
- Simplify checking for group location's file information, when file
has not been involving in file-mounting operations.
- Use binary encoding for modification time, instead of ASCII.
- Hoist H5HL_peek calls (to get information in a local heap)
out of loops in many group routine.
- Use static variable for iterators of selections, instead of
dynamically allocation them each time.
- Lookup & insert new entries in one step, avoiding traversing
group's B-tree twice.
- Fixed memory leak in H5Gget_objname_idx() routine (tangential to
performance improvements, but fixed along the way).
- Use free-list for reference counted strings.
- Don't bother copying object names into cached group entries,
since they are re-created when an object is opened.
The benchmark I used to measure these results created several thousand
small (2K) datasets in a file and wrote out the data for them. This is
Elena's "regular.c" benchmark.
These changes resulted in approximately ~4.3x speedup of the
development branch when compared to the previous code in the
development branch and ~1.4x speedup compared to the release
branch.
Additionally, these changes reduce the total memory used (code and
data) by the development branch by ~800KB, bringing the development
branch back into the same ballpark as the release branch.
I'll send out a more detailed description of the benchmark results
as a followup note.
New internal API routines:
Added "reference counted strings" API for tracking strings that get
used by multiple owners without duplicating the strings.
Added "ternary search tree" API for text->object mappings.
Platforms tested:
Tested h5committest {arabica (fortran), eirene (fortran, C++)
modi4 (parallel, fortran)}
Other platforms/configurations tested?
FreeBSD 4.7 (sleipnir) serial & parallel
Solaris 2.6 (baldric) serial
|