Update the SLURM script for the new Integration cluster environment#67
Update the SLURM script for the new Integration cluster environment#67zhengtaocui wants to merge 9 commits intongwpc-candidatefrom
Conversation
There was a problem hiding this comment.
RAW_DOWNLOAD_DIR is not defined so this command fails.
There was a problem hiding this comment.
It is defined in the config file, schism_calib.cfg.
There was a problem hiding this comment.
I see. In the version that I had and was using to test this PR, it was not defined. Please add a comment before sourcing the config file about the expected vars to avoid such issues.
There was a problem hiding this comment.
That configure file is created by the SFINCS config and download script. It should not be here. It causes confusions. I deleted it from the repo.
There was a problem hiding this comment.
This fails because FCST_LENGTH_HRS is set as a float in the config file, and bash arithmetic only supports integers. You either need to use export FCST_LENGTH_HRS=3 in config file, or bc like so end_itime=$(echo "$start_itime + $FCST_LENGTH_HRS * 3600 + 3600" | bc | cut -d. -f1)
There was a problem hiding this comment.
It is not float. This is my config for example.
export STARTPDY=20120611
export STARTCYC=00
export FCST_LENGTH_HRS=3
export HOT_START_FILE=''
export USE_TPXO="NO"
export COASTAL_DOMAIN=hawaii
export METEO_SOURCE=NWM_RETRO
export COASTAL_WORK_DIR=/ngen-test/coastal/schism_config_test_hawaii_stofs_nwm_retro/schism_2012-06-11T00-00-00Z
export RAW_DOWNLOAD_DIR=/ngen-test/coastal/schism_config_test_hawaii_stofs_nwm_retro/data/raw
There was a problem hiding this comment.
In the version that I was testing this PR with, it was a float. So, to avoid this issue it's better to use bc.
There was a problem hiding this comment.
Even in the branch that you submitted this PR, the config file located here has the same issues (No raw directory and float hours). Please modify the config file so it can be a part of this PR. Also, COASTAL_WORK_DIR needs to be user specific to avoid write permission issue. If I run this, I will get a permission error. So, it needs to be something like this: export COASTAL_WORK_DIR=/ngen-test/coastal/${USER}/schism_config_test_hawaii_stofs_nwm_retro/schism_2012-06-11T00-00-00Z.
There was a problem hiding this comment.
STOFS is not available prior to 2020-12-30 as it's noted here. So, I don't think this config will work.
There was a problem hiding this comment.
Right. This file doesn't work. You need to run the SFINCS config and download script first to create all the download files and the config file. This config file doesn't belong here. I removed it.
There was a problem hiding this comment.
I would still need a config file that I can test the script and make sure it runs.
There was a problem hiding this comment.
Same issue as before with the bash arithmetic, this needs to use bc too: itime=$(echo "${LENGTH_HRS} * 3600 + $start_timestamp" | bc | cut -d. -f1)
| #SBATCH --ntasks-per-node=18 #numebr of cores per node | ||
| #SBATCH --exclusive | ||
|
|
||
| export NODES=2 #this must match the number of nodes defined above by slurm | ||
| export NCORES=18 #this must match the number of cores per node defined above by slurm | ||
| export NPROCS=$((NODES*NCORES)) | ||
|
|
||
| set -x | ||
| set -euox pipefail | ||
|
|
||
| #load the configuration file | ||
| . ./schism_calib.cfg |
There was a problem hiding this comment.
Please add this check for env vars to ensure all the required vars are set:
# Check string variables
for var in STARTPDY STARTCYC COASTAL_DOMAIN METEO_SOURCE COASTAL_WORK_DIR RAW_DOWNLOAD_DIR; do
if [[ -z "${!var}" ]]; then
echo "ERROR: $var is not defined in config file"
exit 1
fi
done
# Check numeric variables
for var in FCST_LENGTH_HRS; do
if [[ -z "${!var}" ]] || ! [[ "${!var}" =~ ^[0-9]+$ ]]; then
echo "ERROR: $var must be a positive integer"
exit 1
fi
done
# Check YES/NO variables
if [[ "${USE_TPXO}" != "YES" ]] && [[ "${USE_TPXO}" != "NO" ]]; then
echo "ERROR: USE_TPXO must be YES or NO"
exit 1
fi
# Check that HOT_START_FILE is defined (but allow empty string)
if [[ ! -v HOT_START_FILE ]]; then
echo "ERROR: HOT_START_FILE must be defined (can be empty string '')"
exit 1
fi
# Validate HOT_START_FILE exists if provided
if [[ -n "${HOT_START_FILE}" ]] && [[ ! -f "${HOT_START_FILE}" ]]; then
echo "ERROR: HOT_START_FILE specified but file does not exist: ${HOT_START_FILE}"
exit 1
fiThere was a problem hiding this comment.
Pushed the changes. Thanks for the comments.
|
Thanks for addressing my comments. This looks good to me. |
The SLURM script, sing_run.bash, is updated to adapt the new environment on the Integration cluster.