I am trying to create an output file, new, that contains only the S5-00580 lines from list that are not in analysis_log. My attempt is below. The new file would then be passed to the aria2c command to download only the new folders. The aria2c command already works for downloading all the files in list, but any lines that already appear in analysis_log can be skipped.
Also, all of the S5-00580-17-Medexome lines are currently kept even though that text is in analysis_log, and I cannot figure out how to ignore lines that contain a keyword such as test. Basically, I want to exclude every line that does not end with Medexome.tar.bz2. Thank you :).
diff -u list analysis_log | sed -nr 's/^+([^S5-].*)/\1/p' > new
Contents of list:
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-19-Medexome_122_059/plugin_out/FileExporter_out.137/R_2016_12_09_14_01_11_user_S5-00580-19-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-18-Medexome_121_057/plugin_out/FileExporter_out.134/R_2016_12_09_11_18_52_user_S5-00580-18-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_S5-00580-17-Medexome_5224_9680c70_120_056/plugin_out/FileExporter_out.125/R_2016_12_07_12_25_50_S5-00580-17-Medexome_5224_9680c70.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-17-Medexome_119_054/plugin_out/FileExporter_out.122/R_2016_12_05_13_30_48_user_S5-00580-17-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-16-Medexome_118_052/plugin_out/FileExporter_out.119/R_2016_12_05_10_45_37_user_S5-00580-16-Medexome.tar.bz2
Contents of analysis_log:
R_2016_11_18_10_45_10_user_S5-00580-17-Medexome
R_2016_11_18_13_19_32_user_S5-00580-16-Medexome
# verify that new contains at least one URL before downloading
line_no=$(awk 'END {print NR}' /home/cmccabe/s5_files/downloads/new) # count new files and store as variable
if [[ -s /home/cmccabe/s5_files/downloads/new ]]; then
    echo "starting download of $line_no new S5 sequencing run"
else
    echo " no new files to analyze, goodbye "
    exit 1
fi
# download every URL listed in new
while IFS= read -r new; do
    echo "$new"
    aria2c -x8 -l /home/cmccabe/log.txt -c -d /home/cmccabe/Desktop/NGS/API --use-head=true --http-user "xxxx" --http-passwd xxxx "$new"
done < /home/cmccabe/s5_files/downloads/new
# clean up the working files
rm /home/cmccabe/s5_files/downloads/list
rm /home/cmccabe/s5_files/downloads/new
Desired output in new: only these two lines are printed, because their run IDs (S5-00580-19 and S5-00580-18) are not in analysis_log.
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-19-Medexome_122_059/plugin_out/FileExporter_out.137/R_2016_12_09_14_01_11_user_S5-00580-19-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-18-Medexome_121_057/plugin_out/FileExporter_out.134/R_2016_12_09_11_18_52_user_S5-00580-18-Medexome.tar.bz2
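For reference, here is a minimal sketch of one way that output might be produced. It is not a tested solution for the real data. The diff attempt compares whole lines, and a full URL in list never equals a bare run name in analysis_log, so the sketch instead pulls a run ID out of each line and compares those. The ID pattern S5-<digits>-<digits>-Medexome is an assumption taken from the sample lines above, and the scratch directory with copies of the sample data is only there so the example runs on its own.

```shell
#!/bin/sh
# Sketch only: recreate the sample data from the post in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > list <<'EOF'
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-19-Medexome_122_059/plugin_out/FileExporter_out.137/R_2016_12_09_14_01_11_user_S5-00580-19-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-18-Medexome_121_057/plugin_out/FileExporter_out.134/R_2016_12_09_11_18_52_user_S5-00580-18-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_S5-00580-17-Medexome_5224_9680c70_120_056/plugin_out/FileExporter_out.125/R_2016_12_07_12_25_50_S5-00580-17-Medexome_5224_9680c70.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-17-Medexome_119_054/plugin_out/FileExporter_out.122/R_2016_12_05_13_30_48_user_S5-00580-17-Medexome.tar.bz2
http://xxx.xx.xxx.xxx/output/Home/Auto_user_S5-00580-16-Medexome_118_052/plugin_out/FileExporter_out.119/R_2016_12_05_10_45_37_user_S5-00580-16-Medexome.tar.bz2
EOF

cat > analysis_log <<'EOF'
R_2016_11_18_10_45_10_user_S5-00580-17-Medexome
R_2016_11_18_13_19_32_user_S5-00580-16-Medexome
EOF

awk '
    # First file (analysis_log): remember each run ID already analysed.
    # The ID pattern is an assumption based on the sample lines.
    FILENAME == ARGV[1] {
        if (match($0, /S5-[0-9]+-[0-9]+-Medexome/))
            seen[substr($0, RSTART, RLENGTH)] = 1
        next
    }
    # Second file (list): keep URLs that end in Medexome.tar.bz2
    # and whose run ID was not seen in analysis_log.
    /Medexome\.tar\.bz2$/ && match($0, /S5-[0-9]+-[0-9]+-Medexome/) {
        if (!(substr($0, RSTART, RLENGTH) in seen))
            print
    }
' analysis_log list > new

cat new
```

With the sample data, the S5-00580-17 and S5-00580-16 URLs are dropped because those IDs are in analysis_log, and the _5224_9680c70 line is dropped because it does not end with Medexome.tar.bz2. Testing FILENAME == ARGV[1] rather than the more common NR == FNR is a deliberate choice here: it should still behave sensibly when analysis_log is empty.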