Compiling & Building a C++ Application with HDF5

Posted on April 6, 2019 by Giorgos Kourakos

As I found it surprisingly difficult and confusing to compile and build C++ applications with the HDF5 library, I decided to post a small guide on how I do it.

Building hdf5

HDF5 provides a number of download options. The one I used is the CMake version, CMake-hdf5-1.10.5.tar.gz.

After extracting the archive, there is a shell script named build-unix.sh in the main folder.
Under Ubuntu, running that one script was all the build process required.
It creates a build directory inside the main folder, and the build directory in turn contains several subfolders.

Using hdf5

To use the cmake version we need a CMakeLists.txt file.

A template of such a file is included here. This file and more information can also be found in the Building HDF5 with CMake guide.

Starting from that template, and after many hours of searching, I modified the CMake file as follows to make it work for C++:

cmake_minimum_required (VERSION 3.10.1)
project( myFirstHdf5 C CXX )

# Library type from the template: STATIC or SHARED
set (LIB_TYPE STATIC)
string(TOLOWER ${LIB_TYPE} SEARCH_TYPE)

# Alternative find_package calls, left commented out
#find_package (HDF5 NAMES hdf5 COMPONENTS C CXX ${SEARCH_TYPE})
#find_package(HDF5 COMPONENTS CXX HL REQUIRED)
# Request the C, C++ and high-level components
find_package(HDF5 COMPONENTS C CXX HL REQUIRED)

link_directories( ${HDF5_LIBRARY_DIRS} )

include_directories (${HDF5_INCLUDE_DIR})
# LINK_LIBS is set here but is not used by target_link_libraries below
set (LINK_LIBS ${LINK_LIBS} ${HDF5_C_${LIB_TYPE}_LIBRARY})

#set (example hdfcompile)

add_executable (myFirstHdf5 myFirstHdf5.cpp)

# Link the executable against the HDF5 C++ libraries
target_link_libraries (myFirstHdf5 ${HDF5_CXX_LIBRARIES})

Put the above CMakeLists.txt file in the same folder along with the *.cpp file, which in the example is named myFirstHdf5.cpp.

Next, to run cmake you need to pass at minimum the -G option, which is the easy part, and HDF5_DIR, which was quite hard to figure out. A small hint: HDF5_DIR should point to the directory that contains the file hdf5-config.cmake. In my case, I found this file in a few places, picked one, and luckily it worked.
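Since the hint is that HDF5_DIR must contain hdf5-config.cmake, one way to list the candidate directories is to search for that file. The following is only a sketch: it assumes a POSIX shell with find available, and the function name find_hdf5_dirs is made up for illustration.

```shell
# Print every directory under a root folder that contains hdf5-config.cmake.
# Usage: find_hdf5_dirs <root>   (defaults to the current directory)
find_hdf5_dirs() {
    # -exec dirname strips the file name, leaving the candidate HDF5_DIR
    find "${1:-.}" -type f -name 'hdf5-config.cmake' -exec dirname {} \; | sort
}

# Possible use after building (the path is the one from this post):
#   find_hdf5_dirs "$HOME/Downloads/CMake-hdf5-1.10.5"
```

Each printed directory is a candidate for -DHDF5_DIR. If several are listed, as happened to me, the sketch does not tell you which one is the right choice.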

Here is the full cmake command to build a debug version:

cmake -G "Unix Makefiles" \
-DCMAKE_BUILD_TYPE=Debug \
-DHDF5_DIR=${HOME}/Downloads/CMake-hdf5-1.10.5/build/_CPack_Packages/Linux/TGZ/HDF5-1.10.5-Linux/HDF_Group/HDF5/1.10.5/share/cmake/hdf5 .

I hope this will save you a bit of time if you ever stumble on this problem.

Programming Tips

Posted on April 11, 2016 by Giorgos Kourakos

Here you can find useful tips on writing any sort of code, from simple MATLAB scripts to advanced object-oriented C++ code.

Run C2Vsim on Cluster

Posted on October 24, 2015 (updated April 11, 2016) by Giorgos Kourakos

Here I want to show how to run C2Vsim on the cluster at UC Davis. The cluster runs Ubuntu and uses the SLURM manager for job submission.

To submit a job we have to create a shell script, for example the file Run_c2vsim_job.sbatch.

The first part of the script contains options for SLURM. Below is a small list of the options I use to run the jobs:

#!/bin/bash
#
# job name:
#SBATCH --job-name='C2VSIM'
#
# Number of tasks (cores):
#
#SBATCH --ntasks=1
#
#SBATCH --array=1-100
#
#SBATCH --output=outp%j.log
#SBATCH --error=outp%j.err
#

The first option, #SBATCH --job-name='C2VSIM', defines a name for the job to be submitted.

The second option, #SBATCH --ntasks=1, defines the number of cores we need for the job. IWFM does not support multi-core or multi-threaded simulations, so this option should normally be 1. However, if the simulation requires more memory than is allocated to one core, we may ask for more cores to increase the available memory.

The next option, #SBATCH --array=1-100, is used for array jobs, which run repeated simulations with different inputs. This is ideal for Monte Carlo simulations. With this setting the simulation will run 100 times.

The options #SBATCH --output=outp%j.log and #SBATCH --error=outp%j.err define the names of the log and error files. For each simulation there will be two files, outpXXXXXX.log and outpXXXXXX.err. The console output is printed to the *.log file and any errors are printed to the *.err file. The XXXXXX gets its value from the job and array IDs.
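Array jobs are meant for repeated runs with different inputs, so each task needs a way to pick its own input. One possible approach, sketched here under assumptions, is to keep one parameter set per line in a text file and let each task read the line that matches its array index. The function name input_for_task and the file name mc_parameters.txt are hypothetical and not part of the C2Vsim setup described in this post.

```shell
# Print the line of a parameter file that corresponds to an array task.
# Usage: input_for_task <task_id> <param_file>
input_for_task() {
    # sed -n "Np" prints only line N of the file (nothing if N is out of range)
    sed -n "${1}p" "$2"
}

# Inside a job script this might be used as (file name is hypothetical):
#   params=$(input_for_task "$SLURM_ARRAY_TASK_ID" mc_parameters.txt)
```

With --array=1-100 the file would need 100 lines, one for each task.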

# Load your modules
#
#
# Set up your environment
cd /home/giorgk/C2VSIM/Finegrid

Next we load any modules that are required. To run IWFM there is no need to load any module. Finally, we enter the working directory.

As mentioned in a previous post, under Linux we need to have all input and output files in the same directory. However, this causes a problem when we try to run the simulation multiple times simultaneously: each simulation has to run in a separate folder, and setting that up manually is quite tedious. To tackle this I use the following workflow: I have created a folder that contains all the original C2Vsim input directories and files, and I copy it to a folder that is unique for each simulation.

wrksp_folder=temp'_'$SLURM_ARRAY_JOB_ID'_'$SLURM_ARRAY_TASK_ID
#
echo "------Simulation started $(date)"
#
cp -r Clean_Input_files $wrksp_folder

The above snippet defines the variable wrksp_folder as temp_id1_id2, where temp can be any user-defined string, id1 is the job ID assigned by SLURM, and id2 runs from 1 to the number of array jobs. The last command copies the input files from the folder Clean_Input_files to the folder that is unique to each simulation.
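The naming scheme can be tried outside the cluster with a small sketch like the one below. The helper name unique_name and the fallback value 0 for the two IDs are assumptions added for illustration; under SLURM the real SLURM_ARRAY_JOB_ID and SLURM_ARRAY_TASK_ID values are used instead.

```shell
# Build a name of the form <prefix>_<job id>_<task id>.
# Usage: unique_name <prefix> <job_id> <task_id>
unique_name() {
    printf '%s_%s_%s\n' "$1" "$2" "$3"
}

# Fall back to placeholder IDs when the script is run outside SLURM.
job_id=${SLURM_ARRAY_JOB_ID:-0}
task_id=${SLURM_ARRAY_TASK_ID:-0}

wrksp_folder=$(unique_name temp "$job_id" "$task_id")
echo "workspace folder: $wrksp_folder"
```

Quoting "$wrksp_folder" in later commands such as cp and cd keeps the script safe if the prefix ever contains spaces.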

Next we enter this folder and run the simulation. (Refer to this post, where the content of the Run_C2VSim.sh file is explained.)

cd $wrksp_folder
./Run_C2VSim.sh

After the simulation has finished we have to copy the results to a safe place and clean up the temporary files.

res_folder=results'_'$SLURM_ARRAY_JOB_ID'_'$SLURM_ARRAY_TASK_ID
cp -r Results/ ../Results/$res_folder
cd ..
#
rm -r $wrksp_folder
echo "------Simulation Finished $(date)"

As with the working folder, we define a unique name for the results folder. Next we copy the content of the $wrksp_folder/Results folder to a location outside $wrksp_folder/. (It may seem confusing, but in the second line Results/ and ../Results/ are two entirely different directories.)

After the results are copied we delete the $wrksp_folder and print the time when the simulation is finished.
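The whole sequence (copy the clean inputs, run, save the results, delete the workspace) could also be wrapped in one function, so that the workspace is removed even when the simulation fails. This is only a sketch under assumptions: the function name run_isolated and its argument order are invented here, the workspace folder must not exist beforehand, and the run is assumed to write its output into a Results subfolder as in the script above.

```shell
# Run a command inside a private copy of a template folder.
# Usage: run_isolated <template_dir> <workspace_dir> <results_dir> <command...>
# The body runs in a subshell, so cd and exit do not affect the caller.
run_isolated() (
    template=$1; workspace=$2; results=$3
    shift 3

    # Create the results folder first and remember its absolute path
    mkdir -p "$results" || exit 1
    results=$(cd "$results" && pwd) || exit 1

    # Copy the clean inputs into the per-run workspace
    cp -r "$template" "$workspace" || exit 1
    workspace=$(cd "$workspace" && pwd) || exit 1

    # Run the simulation command inside the workspace, keeping its exit code
    status=0
    ( cd "$workspace" && "$@" ) || status=$?

    # Save whatever the run wrote into Results, then remove the workspace
    if [ -d "$workspace/Results" ]; then
        cp -r "$workspace/Results/." "$results/" || status=1
    fi
    rm -rf "$workspace"
    exit "$status"
)

# Possible use inside the job script (names as in this post):
#   run_isolated Clean_Input_files "$wrksp_folder" "Results/$res_folder" ./Run_C2VSim.sh
```

The function returns the exit code of the simulation command, so a failed run is still visible to the job script.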

To submit this job on the cluster, execute:

sbatch Run_c2vsim_job.sbatch

However, on the farm cluster at UC Davis this is not going to work. There you need to submit the job with the following option:

sbatch -p serial Run_c2vsim_job.sbatch
