In the bash script below (which uses process substitution), there can be up to 3 files in a directory. The filename variable, read from the last line of /home/cmccabe/medex.logs/analysis.log, is set to where these files are located.
The code does execute. The problem is that if the output directory already contains a renamed file, that file's name gets repeated on the next run, while the newly renamed file is fine.
So, let's say that there is a file in the output directory that has been renamed to
00-0000_Last-First_fbn1_20xcoverage.txt
When the script executes the second time, the output directory looks like:
00-0000_Last-First_fbn1_20xcoverage_Last-First_fbn1_20xcoverage.txt
01-0101_LastN-FirstN_fbn1_20xcoverage.txt
The renamed file in the directory repeats, while the new file being renamed is OK.
However, if I execute the script a third time, the directory looks like:
00-0000_Last-First_fbn1_20xcoverage_Last-First_fbn1_20xcoverage.txt
01-0101_LastN-FirstN_fbn1_20xcoverage_fbn1_20xcoverage.txt
02-0202_La-Fi_fbn1_20xcoverage.txt
I am not sure why the filenames repeat for the previously renamed files, and I do not know how to fix it. Thank you :).
# declare associative array
declare -A mapArray
# Set variable
filename=$(awk 'END{print}' /home/cmccabe/medex.logs/analysis.log)
# Read the file from the 3rd line of the file and create a hash-map
while IFS= read -r line; do
    line="$line"
    mapArray["${line%_*}"]="$line"
done < <(tail -n +3 "/home/cmccabe/Desktop/NGS/API/$filename/analysis.txt")
# look up each text file's id in the hash map and rename the text file
for file in *.txt; do
    echo "$file" ${mapArray["${file%%_*}"]}"_${file#*_}"
    # mv "$file" ${mapArray["${file%%_*}"]}"_${file#*_}"
done
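To make the lookup in the loop above easier to follow, here is a minimal sketch of how the two parameter expansions split a file name. The new_name helper is hypothetical (it is not in the script), and the map entry and file names are copied from the examples in this post.

```bash
#!/usr/bin/env bash
# Illustration only: new_name is a hypothetical helper, not part of the script.
declare -A mapArray
mapArray["00-0000"]="00-0000_Last-First"

# Build the rename target the same way the for loop does:
#   ${file%%_*} keeps everything before the FIRST underscore (the id, used as key)
#   ${file#*_}  keeps everything after the FIRST underscore (the rest of the name)
new_name() {
    local file=$1
    local key=${file%%_*}
    printf '%s_%s\n' "${mapArray[$key]}" "${file#*_}"
}

new_name "00-0000_fbn1_20xcoverage.txt"
# prints 00-0000_Last-First_fbn1_20xcoverage.txt

new_name "00-0000_Last-First_fbn1_20xcoverage.txt"
# prints 00-0000_Last-First_Last-First_fbn1_20xcoverage.txt
```

The second call shows that a file which already carries the names still yields the key 00-0000, so the names are inserted a second time.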
Text files in /home/cmccabe/Desktop/percent (there could be a maximum of 3 files in this directory):
00-0000_fbn1_20xcoverage.txt
01-0101_fbn1_20xcoverage.txt
02-0202_fbn1_20xcoverage.txt
Contents of the text file /home/cmccabe/Desktop/analysis.txt:
status: complete
id names:
00-0000_Last-First
01-0101_LastN-FirstN
02-0202_La-Fi
Desired result in /home/cmccabe/Desktop/percent:
00-0000_Last-First_fbn1_20xcoverage.txt
01-0101_LastN-FirstN_fbn1_20xcoverage.txt
02-0202_La-Fi_fbn1_20xcoverage.txt
I think it may be the for file in *.txt; do line that is causing the repeats. If there is a renamed file in the directory and the script is run for a new one, the old file gets renamed again along with the new file.
So if there is already a renamed file in the directory,
00-0000_Last-First_fbn1_20xcoverage.txt
and a new file gets renamed to 01-0101_Last-First_fbn1_20xcoverage.txt, the name of the original file is repeated because of that for loop. I think that is the issue, but I am not sure how to fix it. Thank you :).
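If the for loop is indeed picking up files that were renamed on an earlier run, one possible guard is to skip any file whose name already starts with its mapped value. This is only a sketch under that assumption: needs_rename is a hypothetical helper, the map is filled by hand instead of from analysis.txt, and the loop iterates over literal names from this post rather than *.txt so that it runs anywhere.

```bash
#!/usr/bin/env bash
# Sketch only: needs_rename is a hypothetical helper, not part of the original script.
declare -A mapArray
mapArray["00-0000"]="00-0000_Last-First"
mapArray["01-0101"]="01-0101_LastN-FirstN"

# Succeeds only when the file has a map entry AND its name does not
# already start with "<id>_<names>_" (i.e. it has not been renamed yet).
needs_rename() {
    local file=$1
    local key=${file%%_*}
    local mapped=${mapArray[$key]:-}
    [[ -n $mapped && $file != "${mapped}_"* ]]
}

for file in "00-0000_Last-First_fbn1_20xcoverage.txt" "01-0101_fbn1_20xcoverage.txt"; do
    key=${file%%_*}
    if needs_rename "$file"; then
        echo "rename: $file -> ${mapArray[$key]}_${file#*_}"
    else
        echo "skip:   $file"
    fi
done
```

With these two names the first file is skipped and only the second is reported for renaming. In the real script the echo would be the mv, and the check relies on the names never happening to match the start of a file that has not been renamed.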
I added line="$line" under the while and, using set -x, can see that the original file as well as the new file are read into line, but I am not sure how to process only the new one.
+ read -r line
+ line=01-0101_LastN-FirstN --- not part of set -x but this is the original file that is already renamed
+ mapArray["${line%_*}"]=01-0101_LastN-FirstN
+ IFS=
+ read -r line
+ line=02-0202_La-Fi --- not part of set -x but this is the new file to be renamed
+ mapArray["${line%_*}"]=02-0202_La-Fi
+ IFS=
+ read -r line
+ for file in '*.txt'
+ mv 00-0000_Last-First_fbn1.txt 00-0000_Last-First_Last-First_fbn1.txt
+ for file in '*.txt'
+ mv 02-0202_fbn1.txt 02-0202_La-Fi_fbn1.txt