path: root/src/H5Dmpio.c
Commit message | Author | Age | Files | Lines
* [svn-r12400] | MuQun Yang | 2006-06-03 | 1 | -3/+3
  Purpose: Some collective chunk IO macro names are confusing; change them to more meaningful names.
  Description: H5Pset_dxpl_mpio_chunk_opt sets a flag so that the library does either one linked IO or multi-chunk IO collectively in chunking storage directly. That is, the library won't do analyses to determine this. The flags of the enum type we used before are H5FD_MPIO_OPT_ONE_IO and H5FD_MPIO_OPT_MULTI_IO. They are not good names for the following two reasons: 1. They don't reflect chunking storage. 2. OPT is kind of redundant and misleading.
  Solution: We change the names to H5FD_MPIO_CHUNK_ONE_IO and H5FD_MPIO_CHUNK_MULTI_IO.
  Platforms tested: Since only macro names are changed, no need to test with h5committest. heping (mpich 1.2.6)
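  The effect of the forced mode described in this entry can be sketched without HDF5 or MPI. The enum and function names below are local stand-ins, not library identifiers, and the threshold comparison used for the default case is only an assumption about what the library's own analysis might look like.

```c
#include <assert.h>

/* Local stand-ins for the user-selectable modes. In the library the two
 * forced values are named H5FD_MPIO_CHUNK_ONE_IO and
 * H5FD_MPIO_CHUNK_MULTI_IO after this commit. */
typedef enum {
    SKETCH_CHUNK_DEFAULT,  /* let the library analyse the selection */
    SKETCH_CHUNK_ONE_IO,   /* force one linked IO across all chunks */
    SKETCH_CHUNK_MULTI_IO  /* force one IO per chunk */
} sketch_chunk_opt_t;

typedef enum { SKETCH_LINKED_IO, SKETCH_PER_CHUNK_IO } sketch_io_plan_t;

/* Hypothetical helper: a forced mode bypasses the analysis entirely.
 * Otherwise the chunks-per-process count is compared with a threshold
 * (an assumed rule, used here only for illustration). */
static sketch_io_plan_t
sketch_pick_plan(sketch_chunk_opt_t opt, unsigned chunks_per_proc,
                 unsigned threshold)
{
    if (opt == SKETCH_CHUNK_ONE_IO)
        return SKETCH_LINKED_IO;
    if (opt == SKETCH_CHUNK_MULTI_IO)
        return SKETCH_PER_CHUNK_IO;

    assert(opt == SKETCH_CHUNK_DEFAULT);
    return (chunks_per_proc >= threshold) ? SKETCH_LINKED_IO
                                          : SKETCH_PER_CHUNK_IO;
}
```

  With a forced mode the two numeric arguments are ignored, which mirrors the statement that the library "won't do analyses" in that case.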
* [svn-r12343] | MuQun Yang | 2006-05-11 | 1 | -16/+48
  Purpose: Bug fix for collective chunk IO.
  Description: Several options have been provided inside the HDF5 library for obtaining chunk addresses when doing collective chunk IO. One option is to obtain the chunk addresses on one process and broadcast them to the other processes. This option needs all processes to participate. When using link-chunked IO without any optimizations, this is sometimes not the case because of the random initial value of one variable with mpich 1.2.7. This is a bug inside the collective chunk IO code.
  Solution: 1. Initialize all the variables to safe values. 2. Avoid using MPI broadcast to obtain the chunk addresses where possible until more performance studies have been done. 3. It seems okay to obtain chunk addresses individually on each processor; this option may cover most cases.
  Platforms tested: h5committest (copper is not usable); NCSA teragrid (mpich 1.2.5); mir 64-bit linux (mpich 1.2.6)
* [svn-r12173] | MuQun Yang | 2006-03-29 | 1 | -1/+95
  Purpose: Adding parallel tests for optional collective chunk APIs.
  Description: Three new APIs, H5Pset_dxpl_mpio_chunk_opt_ratio, H5Pset_dxpl_mpio_chunk_opt_num and H5Pset_dxpl_mpio_chunk_opt, for optional optimization choices from users have been added to the library. This check-in adds six tests to verify the functionality and correctness of these APIs. These tests need to be verified with 3 or more processors and with the MPI-IO driver only.
  Solution: Using H5Pinsert, H5Pget and H5Pset to verify that the library indeed goes into the branch we hope for. Using the H5_HAVE_INSTRUMENT macro to isolate these changes so that they won't affect or be misused by the application.
  Platforms tested: h5committest (shanti still refused connections). Parallel tests on heping are somehow skipped, so they were run manually on heping. Have checked 1, 2, 3, 4 and 5 processes.
* [svn-r12123] | MuQun Yang | 2006-03-20 | 1 | -12/+14
  Purpose: Add more comments.
  Description: Add more comments to H5Dmpio.c that describe collective IO management in a little more detail.
  Platforms tested: Only tested on heping since only comments were added.
* [svn-r12117] | MuQun Yang | 2006-03-18 | 1 | -38/+25
  Purpose: Enhancing the optimization of collective IO per chunk.
  Description: When the user does one of the following two things: 1. does collective IO per chunk without using our optimization code, or 2. passes 0 as the percent of the number of processes per chunk when choosing to do collective IO per chunk, it is not necessary for the library to use MPI-IO collective calls to do any optimization.
  Solution: Modify the code so that no analyses involving MPI communication are done for the above cases. Chunk addresses are obtained globally and IO modes are always assigned to collective.
  Platforms tested: h5committest
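  A minimal sketch of the per-chunk decision this entry refers to, assuming the "percent of number of process per chunk" is compared against the share of processes that select a given chunk. The function and type names are hypothetical, and the real library code may structure the test differently.

```c
#include <assert.h>

typedef enum { SKETCH_INDEPENDENT, SKETCH_COLLECTIVE } sketch_xfer_t;

/* Hypothetical per-chunk decision. A threshold of 0 means collective IO is
 * wanted on every chunk, so no counting (and therefore no communication to
 * gather the counts) is needed. */
static sketch_xfer_t
sketch_chunk_xfer(unsigned procs_on_chunk, unsigned total_procs,
                  unsigned percent_threshold)
{
    assert(total_procs > 0 && procs_on_chunk <= total_procs);
    assert(percent_threshold <= 100);

    if (percent_threshold == 0)
        return SKETCH_COLLECTIVE;

    /* Integer form of: procs_on_chunk / total_procs >= threshold / 100 */
    return (procs_on_chunk * 100u >= percent_threshold * total_procs)
               ? SKETCH_COLLECTIVE
               : SKETCH_INDEPENDENT;
}
```

  The early return is the point of the sketch: with a threshold of 0 the answer never depends on `procs_on_chunk`, so the value does not have to be computed at all.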
* [svn-r12111] | MuQun Yang | 2006-03-17 | 1 | -11/+44
  Purpose: Minor change for collective code.
  Platforms tested: mir
* [svn-r12090] | MuQun Yang | 2006-03-14 | 1 | -24/+30
  Purpose: New APIs to add for collective chunk IO.
  Description: Three new APIs, H5Pset_dxpl_mpio_chunk_opt_ratio, H5Pset_dxpl_mpio_chunk_opt_num and H5Pset_dxpl_mpio_chunk_opt, for optional optimization choices from users.
  Solution: Tests haven't been added yet; the change won't affect other parts of the library. Tests will be added after urgent investigations of memory leak problems from the NASA Aura team.
  Platforms tested: heping (both parallel and sequential); shanti
* [svn-r11964] | MuQun Yang | 2006-02-23 | 1 | -6/+37
  Purpose: Bug fix.
  Description: ret_value is not set to SUCCEED in H5D_mpio_select_write. That makes the 64-bit Intel compiler unhappy: it yields a non-zero number and causes a spurious test failure. Another one was picked up by cmpi again. For one optimization case, another variable is not initialized properly and the compiler sets an unexpected value, causing the test to fail.
  Solution: Properly initialize those variables.
  Platforms tested: teragrid: parallel; mir: parallel; heping: parallel and sequential; tungsten: parallel
  Misc. update: h5committest doesn't finish due to no space left on device. Parallel tests still failed at tungsten with cmpi; it looks like a bug in cmpi.
* [svn-r11960] | MuQun Yang | 2006-02-21 | 1 | -17/+15
  Purpose: Code clean up.
  Description: Clean up some warnings in collective chunk IO code.
  Platforms tested: heping
* [svn-r11955] | MuQun Yang | 2006-02-17 | 1 | -1/+1
  Purpose: Erase one printf line accidentally inserted in the code.
  Platforms tested: No need to test.
* [svn-r11950] | MuQun Yang | 2006-02-16 | 1 | -129/+1517
  Purpose: Enhanced collective chunk IO support.
  Description:
  1. When using collective IO with chunking storage without any tuning, performance may become worse under some circumstances.
  2. Current HDF5 handles raw-data IO per chunk, so for many small chunks, many small IOs are passed to MPI-IO. That may cause bad performance.
  3. For the one-IO-per-chunk case, performance with collective IO is sometimes worse than with independent IO. An obvious case is when only one process is doing IO and all other processes are not: the collective IO only adds communication overhead. We want to avoid this case, so some management inside our library needs to be done.
  Solution: Added management of collective IO support for chunking storage inside parallel HDF5:
  1) Implemented one IO in collective mode for all chunks in the application by building one MPI derived datatype across all chunks.
  2) Implemented the decision-making support to do collective IO inside MPI-IO per chunk.
  3) Added the decision-making support to do one IO across all chunks or to do multiple IOs, one per chunk.
  4) Added support for the case where some processes won't do any IO in a collective call.
  5) Some MPI-IO packages (e.g. mpich 1.2.6 or lower) cannot handle collective IO correctly when some processes contribute nothing to the IO; a special macro is added to change collective IO mode to independent IO mode inside the HDF5 library.
  Platforms tested: Parallel: IBM AIX 5.2 (copper); Linux (heping) mpich-1.2.6; SDSC Teragrid mpich-1.2.5; Linux (Tungsten) mpich-1.2.6; Altix (NCSA cobalt). Seq: Linux (heping)
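  The fallback for MPI-IO packages that cannot handle ranks with no contribution could look roughly like the sketch below. The macro name, the function and the `bytes_per_rank` array are all hypothetical: the commit message does not name the macro, and in a real parallel program the per-rank sizes would have to be gathered by communication rather than passed in as an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical build-time switch: 0 means the MPI-IO package cannot run a
 * collective call correctly when some ranks contribute no data. */
#ifndef SKETCH_EMPTY_RANK_COLLECTIVE_WORKS
#define SKETCH_EMPTY_RANK_COLLECTIVE_WORKS 0
#endif

/* Return true if the transfer should stay collective, false if it should
 * be degraded to independent IO. */
static bool
sketch_keep_collective(bool collective_requested,
                       const size_t *bytes_per_rank, int nranks)
{
    assert(bytes_per_rank != NULL && nranks > 0);

    if (!collective_requested)
        return false;

#if !SKETCH_EMPTY_RANK_COLLECTIVE_WORKS
    /* Any rank with nothing to transfer forces independent mode. */
    for (int i = 0; i < nranks; i++)
        if (bytes_per_rank[i] == 0)
            return false;
#endif

    return true;
}
```

  Keeping the check behind a preprocessor switch means that builds against an MPI-IO package without the problem pay nothing for it.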
* [svn-r11712] | Quincey Koziol | 2005-11-15 | 1 | -5/+5
  Purpose: New feature.
  Description: Check in baseline for compact group revisions, which radically revises the source code for managing groups and object headers.
  WARNING!!!! This initiates the "unstable" phase of the 1.7.x branch, leading up to the 1.8.0 release. Please test this code, but do _NOT_ keep files created with it - the format will change again before the release and you will not be able to read your old files!!! WARNING!!!!
  Solution: There are too many changes to really describe them all, but some of them include:
  - Stop abusing the H5G_entry_t structure and split it into two separate structures for non-symbol table node use within the library: H5O_loc_t for object locations in a file and H5G_name_t to store the path to an opened object. H5G_entry_t is now only used for storing symbol table entries on disk.
  - Retire H5G_namei() in favor of a more general mechanism for traversing group paths and issuing callbacks on objects located. This gets us out of the business of hacking H5G_namei() for new features, generally.
  - Revised H5O* routines to take a H5O_loc_t instead of H5G_entry_t.
  - Lots more...
  Platforms tested: h5committested and maybe another dozen configurations.... :-)
* [svn-r11593] | Quincey Koziol | 2005-10-21 | 1 | -27/+27
  Purpose: Code cleanup.
  Description: Clean up & standardize a bit in preparation for coding standards discussion.
  Platforms tested: FreeBSD 4.11 (sleipnir). Too minor to require h5committest.
* [svn-r11245] | Quincey Koziol | 2005-08-13 | 1 | -2/+2
  Purpose: Code cleanup.
  Description: Trim trailing whitespace, which is making 'diff'ing the two branches difficult.
  Solution: Ran this script in each directory:
      foreach f (*.[ch] *.cpp)
          sed 's/[[:blank:]]*$//' $f > sed.out && mv sed.out $f
      end
  Platforms tested: FreeBSD 4.11 (sleipnir). Too minor to require h5committest.
* [svn-r11241] | Quincey Koziol | 2005-08-13 | 1 | -22/+60
  Purpose: Code cleanup.
  Description: Fix logic error in previous checkin and also finish refactoring I/O initialization, including simplifying all the collective & parallel cases into a more unified mechanism.
  Platforms tested: FreeBSD 4.11 (sleipnir) w/ & w/o parallel; Linux 2.4 (mir)
* [svn-r11235] | Quincey Koziol | 2005-08-12 | 1 | -17/+11
  Purpose: Code cleanup.
  Description: Refactor, simplify and clean up Kent's recent check-in for collective chunk I/O. There's a bug that I need to talk to Kent about and some more cleanups still, but this is reasonable for an interim point.
  Platforms tested: FreeBSD 4.11 (sleipnir) w/parallel. Too minor for h5committest.
* [svn-r11231] | MuQun Yang | 2005-08-11 | 1 | -359/+40
  Purpose: Bug fix for collective chunk IO, phase 1. Optimization hasn't been done yet, but the collective chunk IO bug should be fixed.
  Description: In chunking storage, the memory space and file space are remapped, so to check whether the file space and memory space are regular (in order to use an optimized MPI derived datatype for the collective call) one has to check per chunk instead of per hyperslab. Even a regular memory space will be stored in a span tree and will be irregular before chunk IO.
  Solution:
  1. Check file space and memory space per chunk instead of per hyperslab.
  2. In collective IO mode, the number of chunks covered by the hyperslab may differ between processes. Since we are handling one IO per chunk, the extra chunk IOs performed by some (not all) processors would cause the program to hang in collective mode, so independent IO has to be used for the extra chunk IOs.
  3. On some platforms complex MPI derived datatypes do not work, so we have to use independent IO in collective IO mode if the selection is irregular. However, when the selection is regular, we do want to use collective IO since that will improve performance. Special care has to be taken for this case.
  Platforms tested: copper (AIX 5.1), Linux (heping, mpich 1.2.6), Teragrid machine, Cobalt (altix), modi4
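  Point 2 of the solution above can be illustrated with a small, MPI-free sketch. It assumes the smallest chunk count over all processes is already known (a real program would obtain it with a reduction) and that the first `min_chunks` IOs on each process are done collectively; both helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: smallest per-process chunk count. The array stands
 * in for values that would normally be combined across ranks. */
static unsigned
sketch_min_chunks(const unsigned *chunks_per_proc, int nprocs)
{
    assert(chunks_per_proc != 0 && nprocs > 0);

    unsigned min = chunks_per_proc[0];
    for (int i = 1; i < nprocs; i++)
        if (chunks_per_proc[i] < min)
            min = chunks_per_proc[i];
    return min;
}

/* Split one process's chunk IOs into rounds every process can join
 * (collective) and leftover rounds that only some processes perform
 * (independent, so that nobody waits on an absent partner). */
static void
sketch_split_rounds(unsigned my_chunks, unsigned min_chunks,
                    unsigned *collective_rounds,
                    unsigned *independent_rounds)
{
    assert(collective_rounds != 0 && independent_rounds != 0);
    assert(min_chunks <= my_chunks);

    *collective_rounds  = min_chunks;
    *independent_rounds = my_chunks - min_chunks;
}
```

  Every process ends up with the same number of collective rounds, which is what keeps a collective call from hanging on a process that has already run out of chunks.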
* [svn-r10628] | Quincey Koziol | 2005-04-18 | 1 | -3/+1
  Purpose: Code cleanup.
  Description: Clean up various warnings reported by the Windows team.
  Platforms tested: FreeBSD 4.11 (sleipnir). Too minor to require h5committest.
* [svn-r10545] | MuQun Yang | 2005-04-05 | 1 | -73/+72
  Purpose: Activating collective IO support for irregular selection inside an HDF5 dataset. This support doesn't include building the final complicated MPI derived datatype for chunking storage.
  Description: Support collective chunk IO for both contiguous and chunking storage for irregular selection (using H5S_SELECT_OR for multiple hyperslab selection).
  Solution: Using MPI derived datatypes to realize this feature. Problems that still need to be investigated: the big-size irregular hyperslab selection test may cause an MPI hang or abnormal exit with the MPICH family on various platforms. This is really hard to debug since it sometimes works and sometimes doesn't. We will continue investigating those cases. These may not be parallel HDF5 bugs, since with the recent version of poe on IBM platforms all tests passed.
  Platforms tested:
  1. Linux heping 2.4.21-27.0.1 with mpich
  2. AIX 5.1 copper with mpcc_r
  3. Altix cobalt SGI linux 2.4.21-sgi304rp05031014_10149 with icc -lmpi
  4. Linux Cluster SDSC TG, intel 8-r2 with mpich 1.2.5
  5. NCSA Linux Cluster Tungsten, MPICH-TCP-1.2.5.2, Intel 8.0 under lustre
  6. NCSA Linux Cluster Tungsten, MPICH-LAM-INTEL711, sometimes not working
  7. NCSA Linux Cluster Tungsten, champion-pro-1.2.0-3, not working for other collective IO tests, but works for the irregular selection collective IO test
* [svn-r9857] | Elena Pourmal | 2005-01-22 | 1 | -3/+0
  Purpose: Maintenance.
  Description: Removed PABLO from the source.
  Platforms tested: arabica with 64-bit, copper with parallel, heping with GNU C and C++ and PGI fortran (but I disabled hl; there is some weird problem only on heping: F9XMODFLAG is not propagated to the Makefile files)
* [svn-r9727] | Quincey Koziol | 2004-12-29 | 1 | -1/+1
  Purpose: Bug Fix/Code Cleanup/Doc Cleanup/Optimization/Branch Sync :-)
  Description: Generally speaking, this is the "signed->unsigned" change to selections. However, in the process of merging code back, things got stickier and stickier until I ended up doing a big "sync the two branches up" operation. So... I brought back all the "infrastructure" fixes from the development branch to the release branch (which I think were actually making some improvement in performance) as well as fixed several bugs which had been fixed in one branch, but not the other. I've also tagged the repository before making this checkin with the label "before_signed_unsigned_changes".
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel & fphdf5; FreeBSD 4.10 (sleipnir) w/threadsafe; FreeBSD 4.10 (sleipnir) w/backward compatibility; Solaris 2.7 (arabica) w/"purify options"; Solaris 2.8 (sol) w/FORTRAN & C++; AIX 5.x (copper) w/parallel & FORTRAN; IRIX64 6.5 (modi4) w/FORTRAN; Linux 2.4 (heping) w/FORTRAN & C++
* [svn-r9574] | MuQun Yang | 2004-11-24 | 1 | -4/+27
  Purpose: Adding code for using MPI derived datatype to handle collective IO.
  Description: No testing yet; won't affect the library.
  Platforms tested: linux 2.4 + mpich 1.2.6; AIX 5.1 + mpcc_r
* [svn-r9529] | MuQun Yang | 2004-11-15 | 1 | -4/+7
  Purpose: Check in some new fixes for MPI derived datatype routines.
  Description: The MPI derived datatype algorithm seems to work for a simple case; however, there are still other problems that need to be solved, so the code cannot be used for the time being. Check-in only for debugging. It won't affect other parts of the library.
  Platforms tested: Linux 2.4 (heping, serial and parallel). (Since no new tests were added and changes are mostly restricted to one function, no need to test three platforms.)
* [svn-r9519] | MuQun Yang | 2004-11-11 | 1 | -0/+189
  Purpose: Adding code for the general MPI derived datatype in order to better incorporate new fixes of the HDF5 library.
  Description: Note: this code has not been tested for general use. Don't call these functions in your development of the HDF5 library. Also, this code is stand-alone and should not affect other library code.
  Platforms tested: Heping (C and Parallel, linux 2.4, mpich 1.2.6); Arabica (C, C++, Fortran, Solaris 2.7); Copper (C, C++, Fortran, AIX 5.1; NOTE: C++ FAILED, seemingly not due to the recent check-in)
* [svn-r9358] | Quincey Koziol | 2004-10-04 | 1 | -5/+10
  Purpose: Bug fix.
  Description: Relax restrictions on parallel I/O to allow compressed, chunked datasets to be read in parallel (collective access will be degraded to independent access, but will retrieve the information still).
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel; Solaris 2.7 (arabica); IRIX64 6.5 (modi4); h5committest
* [svn-r9354] | Quincey Koziol | 2004-10-01 | 1 | -2/+173
  Purpose: Bug fix & code cleanup.
  Description: More dataset cleanups to get to a point where we can fix the chunked I/O bug. Also fix a couple of errors in the recent file object resurrection changes which should hopefully address the recent daily test failures (H5T.c).
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel; Solaris 2.7 (arabica); h5committest
* [svn-r9342] | Quincey Koziol | 2004-09-30 | 1 | -28/+25
  Purpose: Bug fix/code cleanup.
  Description: Clean up raw data I/O code to bundle the I/O parameters (dataset, DXPL ID, etc.) into a single struct to pass around through the dataset I/O routines, since they are always passed together, until very near the bottom of the I/O stack.
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel; Solaris 2.7 (arabica); IRIX64 6.5 (modi4); h5committest
* [svn-r9329] | James Laird | 2004-09-28 | 1 | -3/+3
  Purpose: Feature.
  Description: Datatypes and groups now use H5FO "file object" code that was previously only used by datasets. These objects will hold a file open if the file is closed but they have not yet been closed. If these objects are unlinked then relinked, they will not be destroyed. If they are opened twice (even by two different names), both IDs will "see" changes made to the object using the other ID. When an object is opened using two different names (e.g., if a dataset was opened under one name, then mounted and opened under its new name), calling H5Iget_name() on a given hid_t will return the name used to open that hid_t, not the current name of the object (this is a feature, and a change from the previous behavior of datasets).
  Solution: Used H5FO code that was already in place for datasets. Broke H5D_t's, H5T_t's, and H5G_t's into a "shared" struct and a private struct. The shared structs (H5D_shared_t, etc.) hold the object's information and are used by all IDs that point to a given object in the file. The private structs are pointed to by the hid_t and contain the object's group entry information (including its name) and a pointer to the shared struct for that object. This changed the naming of structs throughout the library (e.g., datatype->size is now datatype->shared->size). I added an updated H5Tinit.c to windows.zip.
  Platforms tested: Visual Studio 7, sleipnir, arabica, verbena
* [svn-r8987] | Quincey Koziol | 2004-08-02 | 1 | -4/+3
  Purpose: Code cleanup.
  Description: Fix another batch of minor differences between the development and release branches.
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel. Too minor to require h5committest.
* [svn-r8981] | Quincey Koziol | 2004-08-02 | 1 | -1/+0
  Purpose: Code cleanup.
  Description: Various minor tweaks to clean code up and bring it into closer synchronization with the release branch.
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel; h5committested; IRIX64 6.5 (modi4)
* [svn-r8932] | Quincey Koziol | 2004-07-22 | 1 | -0/+270
  Purpose: Code cleanup.
  Description: Clean up collective chunking code a bit. Also, add '--enable-instrument' configure flag to have a mechanism for determining that optimized operations happened correctly in the library (instead of just the "normal" way) by allowing 'flag' properties to be set outside the library and set when the "right" thing happens. This is mainly for debugging and regression checks, so we make certain we don't break optimized I/O by accident. It's enabled by default when --enable-debug is on (which is on by default in the development branch and off by default in the release branch), but can also be independently controlled with its own configure flag.
  Platforms tested: FreeBSD 4.10 (sleipnir) w/parallel; IBM p690 (copper) w/parallel