MI - Fimex
As of version 0.56, the Fimex library can be used with forked processes. This requires a fork system call as provided by Unix/Linux environments. Fimex processes can be forked just before the data is fetched, which achieves very good scaling for reading data. An example of how to use getDataSlice with pre-forking can be found in share/doc/examples/parallelRead.cpp in Examples.
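The pre-forking pattern described above can be sketched roughly as follows. This is a minimal illustration using only POSIX calls (fork, pipe, waitpid), not the code from parallelRead.cpp. The function readSliceValue is a hypothetical stand-in for an expensive read such as a getDataSlice call, and forkedSliceValues is likewise an invented name.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for an expensive read such as
// reader->getDataSlice(varName, slice). NOT part of the Fimex API.
static double readSliceValue(std::size_t slice) {
    return static_cast<double>(slice) * 2.0;
}

// Fork one worker per slice just before the data is fetched. Each child
// reads its slice and sends a single double back through a pipe.
// Error handling is simplified: on failure, open descriptors and running
// children are not cleaned up.
std::vector<double> forkedSliceValues(std::size_t nSlices) {
    struct Worker { pid_t pid; int readFd; };
    std::vector<Worker> workers;

    for (std::size_t i = 0; i < nSlices; ++i) {
        int fd[2];
        if (pipe(fd) != 0) throw std::runtime_error("pipe failed");
        const pid_t pid = fork();
        if (pid < 0) throw std::runtime_error("fork failed");
        if (pid == 0) {
            // Child: drop descriptors it does not need, do the work, report.
            close(fd[0]);
            for (const Worker& w : workers) close(w.readFd);
            const double value = readSliceValue(i);
            const ssize_t n = write(fd[1], &value, sizeof value);
            close(fd[1]);
            _exit(n == static_cast<ssize_t>(sizeof value) ? 0 : 1);
        }
        // Parent: keep only the read end of this worker's pipe.
        close(fd[1]);
        workers.push_back({pid, fd[0]});
    }
    assert(workers.size() == nSlices);

    std::vector<double> results(nSlices, 0.0);
    for (std::size_t i = 0; i < workers.size(); ++i) {
        double value = 0.0;
        const ssize_t n = read(workers[i].readFd, &value, sizeof value);
        close(workers[i].readFd);
        int status = 0;
        waitpid(workers[i].pid, &status, 0);
        if (n != static_cast<ssize_t>(sizeof value) ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
            throw std::runtime_error("worker failed");
        }
        results[i] = value;
    }
    return results;
}
```

Because every child is a separate process with its own copy of the reader state, no locking is needed. In practice one would probably fork a small number of workers that each handle a range of slices, rather than one process per slice as in this sketch.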
The Fimex library can be used in threaded environments. Fimex objects are generally not thread-safe, so each object should only be used from a single thread. Several threads can, however, each create their own Fimex objects.
In addition, all CDMReader::get*Data*() operations are thread-safe, so the following code works:

  size_t unlimSlices = unLimDim->getLength();
  #pragma omp parallel for default(shared)
  for (size_t i = 0; i < unlimSlices; ++i) {
      try {
          doSomething(reader->getDataSlice(varName, i));
      } catch (...) {
          // exceptions must be handled inside the parallel loop body
      }
  }
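To make the pattern above easier to try without a Fimex installation, here is a self-contained sketch. MockReader and sumSlices are hypothetical names invented for this example; MockReader only imitates a reader whose getDataSlice may be called concurrently and is not the Fimex CDMReader. The sketch also shows two details that matter with OpenMP: each iteration writes only to its own element of the result vectors, and exceptions are caught inside the loop body, because an exception must not escape an OpenMP parallel region.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a reader with a thread-safe getDataSlice.
// NOT the Fimex CDMReader.
struct MockReader {
    std::size_t nSlices;

    std::vector<double> getDataSlice(std::size_t slice) const {
        if (slice >= nSlices) throw std::out_of_range("no such slice");
        return std::vector<double>(8, static_cast<double>(slice));
    }
};

// Sums each requested slice in parallel. failed[i] is set to 1 when
// reading slice i threw an exception; its sum then stays 0.
std::vector<double> sumSlices(const MockReader& reader,
                              std::size_t nRequested,
                              std::vector<char>& failed) {
    std::vector<double> sums(nRequested, 0.0);
    failed.assign(nRequested, 0);
    const long n = static_cast<long>(nRequested);

    // Without OpenMP support the pragma is ignored and the loop runs serially.
#pragma omp parallel for default(shared)
    for (long i = 0; i < n; ++i) {
        const std::size_t idx = static_cast<std::size_t>(i);
        try {
            const std::vector<double> data = reader.getDataSlice(idx);
            sums[idx] = std::accumulate(data.begin(), data.end(), 0.0);
        } catch (const std::exception&) {
            // Record the failure instead of letting the exception escape.
            failed[idx] = 1;
        }
    }
    return sums;
}
```

Recording failures per slice, rather than silently swallowing them with an empty catch block, makes it possible to report or retry the slices that could not be read.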
Fimex can be built with OpenMP parallelization support using the --enable-openmp flag of configure. The following code parts are currently (0.35) parallelized:
Often, the performance is limited by the IO-system.
On the fimex-commandline, the number of threads can be set using:
When using the library, one should use:
To get MPI to work, the following prerequisites have to be met:
Fimex can then be called with ''mpiexec -n 8 fimex'' and will use parallel MPI-IO to write the NetCDF files, with the following caveats:
Performance of reading an 11 GB compressed NetCDF-4 file on a 16-core, 32-thread 2.6 GHz machine connected to a Lustre parallel filesystem (factor is the speed-up relative to the previous row):
  nproc  time [s]  factor
      1     158.7
      2      79.2     2.0
      4      52.2     1.5
      8      29.0     1.8
     16      19.4     1.5
     32      21.5     0.9
Performance of reading the 11 GB compressed NetCDF-4 file and writing it out as an uncompressed 37 GB NetCDF-4 file:
  nproc  time [s]  factor
      1     232.0
      2     147.6     1.6
      4     116.9     1.3
      8      99.4     1.2
     16     104.0     0.9
     32     119.6     0.8
Adding other compute-intensive data manipulations to the processing will usually improve the scaling.
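The factor columns in the two tables above appear to be the speed-up relative to the previous row (for example 158.7 / 79.2 ≈ 2.0). For comparing runs it can also help to look at the overall speed-up relative to the first row and at the parallel efficiency. The following sketch computes all three; the type and function names (Timing, Scaling, analyseScaling) are invented for this example and are not part of Fimex.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One benchmark measurement: number of processes and wall-clock time.
struct Timing {
    int nproc;
    double seconds;
};

// Derived scaling figures for one measurement.
struct Scaling {
    int nproc;
    double stepFactor;  // time of previous row / time of this row
    double speedup;     // time of first row / time of this row
    double efficiency;  // speedup divided by the increase in processes
};

// Computes scaling figures for timings ordered by increasing nproc.
// The first row has no predecessor, so its stepFactor is set to 1.
std::vector<Scaling> analyseScaling(const std::vector<Timing>& timings) {
    std::vector<Scaling> rows;
    rows.reserve(timings.size());
    for (std::size_t i = 0; i < timings.size(); ++i) {
        assert(timings[i].nproc > 0 && timings[i].seconds > 0.0);
        const double step =
            (i == 0) ? 1.0 : timings[i - 1].seconds / timings[i].seconds;
        const double speedup = timings[0].seconds / timings[i].seconds;
        const double procRatio =
            static_cast<double>(timings[i].nproc) / timings[0].nproc;
        rows.push_back({timings[i].nproc, step, speedup, speedup / procRatio});
    }
    return rows;
}
```

Applied to the read-only benchmark, 16 processes give an overall speed-up of about 8.2 (158.7 / 19.4), which corresponds to a parallel efficiency of roughly 0.5. This matches the earlier remark that performance is often limited by the IO system.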