Dear ADA team,
I have noticed that ADA returns a checksum with additional spaces/blanks at the end. This may cause validation errors if they are not removed (see the bash code below). For viewing, the current format returned by ADA is very nice (filename followed by one or more checksums). For workflows, however, it would be ideal to have just the checksum returned, without the filename and blanks, so that the output can be captured and processed without string magic. Ideally, ADA would even have an option that lets users tell ADA whether they would like the adler32 (a32) or the md5 checksum returned.
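As a minimal illustration of the problem, the sketch below compares two checksum strings that differ only in trailing blanks. The values are made up for this example and are not actual ADA output.

```bash
#!/bin/bash
# Hypothetical values for illustration only; not real ADA output.
local_sum="3a5f01c2"
remote_sum="3a5f01c2  "   # same checksum, but with trailing blanks

# A direct string comparison fails because of the trailing blanks.
if [ "${local_sum}" = "${remote_sum}" ]; then
  raw_result="equal"
else
  raw_result="not equal"
fi

# Strip all blanks (spaces and tabs) before comparing.
remote_clean=$(printf '%s' "${remote_sum}" | tr -d '[:blank:]')
if [ "${local_sum}" = "${remote_clean}" ]; then
  clean_result="equal"
else
  clean_result="not equal"
fi

echo "raw comparison:     ${raw_result}"
echo "cleaned comparison: ${clean_result}"
```

The raw comparison reports a mismatch even though the checksums themselves are identical, which is why the script below strips blanks before every comparison.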
bash script
#!/bin/bash
# author: R. Oonk (SURF, FEB 2022)
# check directories
echo "check top level directories"
rclone --config=maca_caroline.conf lsd maca_caroline:
# task1: copy a local file on spider disk to external dcache disk
echo ""
echo "task1: send local file on spider disk to external dcache disk"
# calculate adler32 checksum for file stored locally on spider
echo " calculate adler32 checksum for local file on spider disk"
a32_local_temp=$(./adler32.py earth_disk.jpeg)
# remove blanks
a32_local=${a32_local_temp//[[:blank:]]/}
echo " adler32 checksum is: ${a32_local}"
# create target directory
echo " create target directory for disk data on external dcache disk"
rclone --config=maca_caroline.conf mkdir maca_caroline:/disk/data
# transfer file stored locally on spider to dcache disk
echo " transfer local file on spider disk to external dcache disk"
rclone --config=maca_caroline.conf copy earth_disk.jpeg maca_caroline:/disk/data
# retrieve the adler32 checksum for file stored on dcache disk
ada_disk=$(ada --tokenfile maca_caroline.conf --checksum /disk/data/earth_disk.jpeg)
# split the output on '='; setting IFS on the read command keeps the change local to it
IFS='=' read -r -a strarr <<< "${ada_disk}"
a32_dcache_temp=${strarr[1]}
# remove blanks
a32_dcache=${a32_dcache_temp//[[:blank:]]/}
echo " ada retrieved adler32 checksum for file on external dcache disk: ${a32_dcache}"
# manual checks on calculated and retrieved values for the checksums stored in the variables
#echo ${a32_local}
#echo ${a32_dcache}
#echo ${a32_local} |awk '{print length}'
#echo ${a32_dcache} |awk '{print length}'
#for (( i=0; i<${#a32_dcache}; i++ )); do
#  echo "${a32_dcache:$i:1}"
#done
# verify local checksum with external dcache disk stored checksum
echo " comparing local spider file checksum with the external dcache disk stored checksum"
# quote both sides so the right-hand side is compared literally, not as a pattern
if [[ "${a32_local}" == "${a32_dcache}" ]]; then
  echo " Checksums are equal for task 1."
else
  echo " Checksums are not equal for task 1."
fi
# task2: copy the external dcache file from task1 to local spider disk
echo ""
echo "task2: retrieve file from external dcache disk to local storage on spider disk"
echo " we use the same file as in task1 and hence already have the dcache stored checksum"
echo " ada retrieved adler32 checksum for file on external dcache disk: ${a32_dcache}"
# make a new local directory (-p avoids an error if it already exists)
echo " create new_data directory locally on spider"
mkdir -p ./new_data
# copy file from external dcache disk to local spider disk storage
echo " copy file from external dcache disk to local new_data folder"
# note: the trailing '/' is important when using copy with rclone
rclone --config=maca_caroline.conf copy maca_caroline:/disk/data/earth_disk.jpeg ./new_data/
# alternative (single files)
#rclone --config=maca_caroline.conf copyto maca_caroline:/disk/data/earth_disk.jpeg ./new_data/earth_disk.jpeg
# calculate adler32 checksum for file stored locally on spider
echo " calculate adler32 checksum for new local file in new_data on spider disk"
a32_local_new_temp=$(./adler32.py ./new_data/earth_disk.jpeg)
# remove blanks
a32_local_new=${a32_local_new_temp//[[:blank:]]/}
echo " local adler32 checksum is (new_data): ${a32_local_new}"
# verify local checksum with external dcache disk stored checksum
echo " comparing checksum for local spider file in new_data with the external dcache disk stored checksum"
# quote both sides so the right-hand side is compared literally, not as a pattern
if [[ "${a32_local_new}" == "${a32_dcache}" ]]; then
  echo " Checksums are equal for task 2."
else
  echo " Checksums are not equal for task 2."
fi
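
Until ADA offers such an option, the parsing step in the script above could be wrapped in a small helper. The following is only a sketch under stated assumptions: the function name `extract_checksum` is hypothetical, the sample lines are made up, and it assumes the checksum output contains entries of the form `KEY=<value>` (for example `ADLER32=<hex>`) that end at a comma, a blank, or the end of the line. The real ADA output format may differ.

```bash
#!/bin/bash
# extract_checksum KEY LINE
# Hypothetical helper: prints the value that follows "KEY=" in LINE,
# without surrounding blanks. Returns 1 if "KEY=" is not present.
# Assumes the value ends at the first comma, whitespace, or end of line.
extract_checksum() {
  local key line value
  key=$1
  line=$2
  case "${line}" in
    *"${key}="*) ;;          # key found, continue
    *) return 1 ;;           # key not present in this line
  esac
  value=${line#*"${key}="}         # drop everything up to and including "KEY="
  value=${value%%[,[:space:]]*}    # cut at the first comma or whitespace
  printf '%s\n' "${value}"
}

# Made-up sample lines (assumed format, not captured from ADA).
sample_one='/disk/data/earth_disk.jpeg  ADLER32=3a5f01c2  '
sample_two='/disk/data/earth_disk.jpeg  ADLER32=3a5f01c2,MD5=0123456789abcdef0123456789abcdef'

a32_dcache=$(extract_checksum ADLER32 "${sample_one}")
md5_dcache=$(extract_checksum MD5 "${sample_two}")
echo "adler32: '${a32_dcache}'"
echo "md5:     '${md5_dcache}'"
```

In the script above, a call such as `a32_dcache=$(extract_checksum ADLER32 "${ada_disk}")` could then replace the `IFS`/`read` split and the separate blank-removal step, provided the assumed key name matches what ADA actually prints.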