Title:
    User Guide for SWMR Use Case Programs

Purpose:
    This is the User Guide for the SWMR Use Case programs. It describes
    the use case programs and explains how to run them.

Author and Dates:
    Version 1: By Albert Cheng (acheng@hdfgroup.org), 2013/06/01.

Use Case [1.7]:
    Appending a single chunk

Program name:
    use_append_chunk

Description:
    Appending a single chunk of raw data to a dataset along an unlimited
    dimension within a pre-created file and reading the new data back.

    It first creates one 3D dataset using chunked storage, where each
    chunk is a (1, chunksize, chunksize) plane.  The dataset dimensions
    are (unlimited, chunksize, chunksize) and the data type is a 2-byte
    integer.  The dataset starts out "empty", i.e., its first dimension
    is 0.

    The writer then appends planes, each of (1, chunksize, chunksize),
    to the dataset.  It fills each plane with its plane number, writes
    it as the nth plane, increments the plane number, and repeats until
    the dataset is chunksize planes long.  The end product is a
    chunksize^3 cube.
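
    The following is a minimal C sketch of how such an extendible,
    chunked dataset might be created and appended to with the HDF5 API.
    It assumes a chunk size of 256 and the file/dataset names
    "append_data.h5" and "dataset"; it is an illustration, not the
    actual use case program (error checking and SWMR flags omitted).

        #include "hdf5.h"
        #define CS 256                            /* assumed chunk size */

        int main(void)
        {
            hsize_t dims[3]    = {0, CS, CS};             /* start "empty" */
            hsize_t maxdims[3] = {H5S_UNLIMITED, CS, CS}; /* growable dim 0 */
            hsize_t chunk[3]   = {1, CS, CS};             /* one plane/chunk */
            static short plane[CS * CS];                  /* one plane of data */

            /* Create the skeleton file and the chunked, extendible dataset */
            hid_t file  = H5Fcreate("append_data.h5", H5F_ACC_TRUNC,
                                    H5P_DEFAULT, H5P_DEFAULT);
            hid_t space = H5Screate_simple(3, dims, maxdims);
            hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
            H5Pset_chunk(dcpl, 3, chunk);
            hid_t dset  = H5Dcreate2(file, "dataset", H5T_NATIVE_SHORT,
                                     space, H5P_DEFAULT, dcpl, H5P_DEFAULT);

            for (hsize_t n = 0; n < CS; n++) {
                /* Fill the plane with its plane number */
                for (hsize_t i = 0; i < CS * CS; i++)
                    plane[i] = (short)n;

                /* Extend dimension 0 by one plane, then write the new slab */
                hsize_t newdims[3] = {n + 1, CS, CS};
                H5Dset_extent(dset, newdims);
                hid_t fspace = H5Dget_space(dset);
                hsize_t start[3] = {n, 0, 0}, count[3] = {1, CS, CS};
                H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL,
                                    count, NULL);
                hid_t mspace = H5Screate_simple(3, count, NULL);
                H5Dwrite(dset, H5T_NATIVE_SHORT, mspace, fspace,
                         H5P_DEFAULT, plane);
                H5Sclose(mspace);
                H5Sclose(fspace);
            }

            H5Dclose(dset); H5Pclose(dcpl); H5Sclose(space); H5Fclose(file);
            return 0;
        }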

    The reader is a separate process, running in parallel with the
    writer.  It reads planes from the dataset and expects the dataset
    to be changing (growing).  It checks the unlimited dimension
    (dimension[0]); when it increases, the reader reads in the new
    planes, one by one, and verifies the data correctness.  (The nth
    plane should contain all "n".)  When the unlimited dimension grows
    to chunksize (the dataset becomes a cube), that is the expected end
    of the data and the reader exits.
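
    The reader's polling loop might look roughly like the following
    sketch, under the same assumptions as above.  It presumes "dset" is
    an open dataset handle and "plane" is a buffer of CS*CS shorts;
    H5Drefresh is available in HDF5 1.10 and later, and the file is
    assumed to have been opened with SWMR read access.

        /* Poll the unlimited dimension and read any new planes */
        hsize_t dims[3];
        hsize_t nread = 0;
        while (nread < CS) {
            H5Drefresh(dset);            /* see the writer's new planes */
            hid_t fspace = H5Dget_space(dset);
            H5Sget_simple_extent_dims(fspace, dims, NULL);
            for (; nread < dims[0]; nread++) {
                hsize_t start[3] = {nread, 0, 0}, count[3] = {1, CS, CS};
                H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL,
                                    count, NULL);
                hid_t mspace = H5Screate_simple(3, count, NULL);
                H5Dread(dset, H5T_NATIVE_SHORT, mspace, fspace,
                        H5P_DEFAULT, plane);
                /* verify: every element should equal (short)nread */
                H5Sclose(mspace);
            }
            H5Sclose(fspace);
        }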

How to run the program:
    The simplest way is
    $ use_append_chunk

    It creates a skeleton dataset (0,256,256) of shorts, then forks off
    a process, which becomes the reader process to read planes from the
    dataset, while the original process continues as the writer process
    to append planes onto the dataset.
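
    Schematically, this default mode does something like the following;
    create_skeleton_file(), run_reader(), and run_writer() are
    hypothetical stand-ins for the program's internal logic, not its
    real function names.

        #include <unistd.h>
        #include <sys/wait.h>

        /* Create the skeleton file first, so the reader can open it */
        create_skeleton_file();              /* hypothetical helper */

        pid_t pid = fork();
        if (pid == 0) {                      /* child: reader */
            run_reader();                    /* hypothetical helper */
            _exit(0);
        }
        run_writer();                        /* parent: writer */
        waitpid(pid, NULL, 0);               /* wait for reader to finish */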

    Other possible options:

    1. -z option: different chunksize. Default is 256.
    $ use_append_chunk -z 1024

    It uses (1,1024,1024) chunks to produce a 1024^3 cube, which is
    about 2GB in size.


    2. -f filename: different dataset file name
    $ use_append_chunk -f /gpfs/tmp/append_data.h5

    The data file is /gpfs/tmp/append_data.h5. This allows two independent
    processes on separate compute nodes to access the data file on the
    shared /gpfs file system.


    3. -l option: launch only the reader or writer process.
    $ use_append_chunk -f /gpfs/tmp/append_data.h5 -l w   # in node X
    $ use_append_chunk -f /gpfs/tmp/append_data.h5 -l r   # in node Y

    In node X, launch the writer process, which creates the data file
    and appends to it.  In node Y, launch the reader process to read
    the data file.  Note that the reader process must start AFTER the
    writer process has created the skeleton data file; otherwise, the
    reader will encounter errors such as the data file not being found.


    4. -s option: use SWMR file access mode or not. Default is yes.
    $ use_append_chunk -s 0

    It opens the HDF5 data file without the SWMR access mode (0 means
    off).  This will likely result in errors.  This option is provided
    so that users can see the effect of omitting the SWMR access mode
    needed for concurrent access.
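
    For reference, SWMR access corresponds to the following HDF5 open
    flags (a sketch; the file name is an assumption, and SWMR requires
    HDF5 1.10 or later with the latest file format):

        /* Writer side: the latest file format is required for SWMR */
        hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
        H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST);
        hid_t wfile = H5Fopen("append_data.h5",
                              H5F_ACC_RDWR | H5F_ACC_SWMR_WRITE, fapl);

        /* Reader side: open the same file for SWMR reading */
        hid_t rfile = H5Fopen("append_data.h5",
                              H5F_ACC_RDONLY | H5F_ACC_SWMR_READ, fapl);

        /* With -s 0, the plain H5F_ACC_RDWR / H5F_ACC_RDONLY flags are
         * used instead, and concurrent access is not safe. */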

Test Shell Script:
    The Use Case programs are installed in the test/ directory and are
    compiled as part of the make process. A test script (test_usecases.sh)
    is installed in the same directory to test the use case programs. The
    test script is rather basic and serves mainly to demonstrate how to
    use the programs.