I am trying to extract the number (with the leading zero removed) that follows Medexome_xx_ in each line of file,
and create an output line using that extracted number.
In the output, the only thing that will change is the number; the other text is static and will be the same each time. Thank you :).
file
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-6-Medexome_67_032/plugin_out/FileExporter_out.67/R_2016_09_20_10_12_41_user_S5-00580-6-Medexome.tar.bz2
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-4-Medexome_65_028/plugin_out/FileExporter_out.52/R_2016_09_01_10_24_52_user_S5-00580-4-Medexome.tar.bz2
desired output
http://xxx.xx.xxx.xx/report/latex/32.pdf
http://xxx.xx.xxx.xx/report/latex/28.pdf
awk
awk {
A[Q]=substr($0,RSTART,RLENGTH);
next
}
print "http://xxx.xx.xxx.xx/report/latex/"A[substr($0,RSTART,RLENGTH)]"$0".pdf";
delete A[substr($0,RSTART,RLENGTH)]
}' file
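As posted, the attempt never calls match(), so RSTART and RLENGTH are not set, and its quotes and braces are unbalanced. A minimal corrected sketch follows, assuming the number always sits between the last underscore and /plugin_out; the function name extract_report and the variable sample are hypothetical.

```shell
# extract_report: hypothetical helper; reads URLs from the given files
# (or stdin) and prints one report URL per matching line.
extract_report() {
    awk '
    match($0, /_[0-9]+\/plugin_out/) {
        # Skip the leading "_" and drop the 11-character "/plugin_out" tail.
        num = substr($0, RSTART + 1, RLENGTH - 12)
        # Adding 0 converts to a number, which removes leading zeros.
        print "http://xxx.xx.xxx.xx/report/latex/" (num + 0) ".pdf"
    }' "$@"
}

# One line taken from the sample input above.
sample='http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-6-Medexome_67_032/plugin_out/FileExporter_out.67/R_2016_09_20_10_12_41_user_S5-00580-6-Medexome.tar.bz2'
printf '%s\n' "$sample" | extract_report
```

With the input stored in file, the call would be: extract_report file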
Hello cmccabe,
If your Input_file always has exactly this format, then the following may help.
awk '{match($0,/.*\/output/);VAL=substr($0,RSTART,RLENGTH);match($0,/Auto.*_[0-9]+\//);VAL1=substr($0,RSTART,RLENGTH);gsub(/.*_0|.*_|\//,X,VAL1);print VAL"/report/latex/" VAL1".pdf"}' Input_file
The output will be as follows (note that it keeps the /output path component, unlike the desired output).
http://xxx.xx.xxx.xx/output/report/latex/32.pdf
http://xxx.xx.xxx.xx/output/report/latex/28.pdf
Thanks,
R. Singh
awk -F'/' '
{
n=split($6,a,"_")
pdf=a[n]+0
print $1"//"$3 "/report/latex/" pdf ".pdf"
}' myFile
A very short script:
$ awk -F'[/_]' -vOFS=/ '{$10=$10+0 ;print "http:","",$3,"report/latex",$10 ".pdf" }' urls.txt
http://xxx.xx.xxx.xx/report/latex/32.pdf
http://xxx.xx.xxx.xx/report/latex/28.pdf
$
With any POSIX-conforming shell, you can do this using only shell variable expansions, without needing to invoke awk:
while IFS= read -r url
do head=${url%%/output/*}/report/latex/
number=${url%%/plugin*}
number=${number##*_}
number=${number#0}
number=${number#0}
printf '%s%s.pdf\n' "$head" "$number"
done < file
which, if file contains:
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-6-Medexome_67_032/plugin_out/FileExporter_out.67/R_2016_09_20_10_12_41_user_S5-00580-6-Medexome.tar.bz2
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-4-Medexome_65_028/plugin_out/FileExporter_out.52/R_2016_09_01_10_24_52_user_S5-00580-4-Medexome.tar.bz2
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-4-Medexome_65_728/plugin_out/FileExporter_out.52/R_2016_09_01_10_24_52_user_S5-00580-4-Medexome.tar.bz2
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-4-Medexome_65_008/plugin_out/FileExporter_out.52/R_2016_09_01_10_24_52_user_S5-00580-4-Medexome.tar.bz2
http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-4-Medexome_65_000/plugin_out/FileExporter_out.52/R_2016_09_01_10_24_52_user_S5-00580-4-Medexome.tar.bz2
produces the output:
http://xxx.xx.xxx.xx/report/latex/32.pdf
http://xxx.xx.xxx.xx/report/latex/28.pdf
http://xxx.xx.xxx.xx/report/latex/728.pdf
http://xxx.xx.xxx.xx/report/latex/8.pdf
http://xxx.xx.xxx.xx/report/latex/0.pdf
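The two ${number#0} expansions above remove at most two leading zeros, which is enough for three-digit numbers. A minimal sketch for an arbitrary number of digits, still using only POSIX expansions; the function name strip_zeros is hypothetical.

```shell
# strip_zeros: hypothetical helper; prints its argument without leading zeros,
# keeping a single "0" when the argument consists only of zeros.
strip_zeros() {
    n=$1
    # Remove one leading zero per iteration while more than one character remains.
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    printf '%s\n' "$n"
}

strip_zeros 0032
strip_zeros 000
```

Inside the loop above, number=$(strip_zeros "$number") could stand in for the two ${number#0} lines, at the cost of one subshell per URL.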
Hi,
For fun, with sed (works with the example input):
If the destination URL uses the same host as the source URL:
sed -e 's/^\(\([^/]*\/\)\{3\}\).*_0*\([0-9]\+\)\/.*/\1report\/latex\/\3.pdf/' file
If the destination URL uses a different host from the source URL:
sed -e 's/^.*_0*\([0-9]\+\)\/.*/http:\/\/xxx.xx.xxx.xx\/report\/latex\/\1.pdf/' file
Regards.
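The \+ operator in these expressions is a GNU extension to basic regular expressions. Below is a sketch of the first command restricted to POSIX BRE syntax: [0-9][0-9]* replaces [0-9]\+, and | is used as the s delimiter so the slashes need no escaping. The function name to_report_url and the variable sample are hypothetical.

```shell
# to_report_url: hypothetical helper; reads URLs on stdin and rewrites each
# to <scheme>://<host>/report/latex/<number>.pdf, keeping the source host.
to_report_url() {
    # \1 = scheme and host (first three "/"-terminated pieces),
    # \3 = digits after the last "_" before a "/", minus leading zeros.
    sed -e 's|^\(\([^/]*/\)\{3\}\).*_0*\([0-9][0-9]*\)/.*|\1report/latex/\3.pdf|'
}

sample='http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-4-Medexome_65_028/plugin_out/FileExporter_out.52/R_2016_09_01_10_24_52_user_S5-00580-4-Medexome.tar.bz2'
printf '%s\n' "$sample" | to_report_url
```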
If the format is not quite so fixed, you could try something like this:
awk -F'/plug.*|/outp|_' '{print $1 "/report/latex/" $(NF-1)+0 ".pdf"}' file
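To see why $(NF-1) holds the number here, it can help to print the fields that this separator produces. A small sketch, with show_fields and sample as hypothetical names:

```shell
# show_fields: hypothetical helper; prints every field of each input line
# as index=[value], using the same field separator as the command above.
show_fields() {
    awk -F'/plug.*|/outp|_' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

sample='http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-6-Medexome_67_032/plugin_out/FileExporter_out.67/R_2016_09_20_10_12_41_user_S5-00580-6-Medexome.tar.bz2'
printf '%s\n' "$sample" | show_fields
```

Because /plug.* consumes the rest of the line, the last field is empty and the number ends up in the second-to-last field, whatever the count of underscores before it.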
--
blastit.fr:
A very short script:
$ awk -F'[/_]' -vOFS=/ '{$10=$10+0 ;print "http:","",$3,"report/latex",$10 ".pdf" }' urls.txt
http://xxx.xx.xxx.xx/report/latex/32.pdf
http://xxx.xx.xxx.xx/report/latex/28.pdf
$
Yet, it could be reduced a little bit further still ... :
awk -F'[/_]' '{print "http://" $3 "/report/latex", $10+0 ".pdf"}' file
cmccabe
October 29, 2016, 12:00pm
8
Thank you for the help and explanations.
scrutinizer:
Yet, it could be reduced a little bit further still ... :
awk -F'[/_]' '{print "http://" $3 "/report/latex", $10+0 ".pdf"}' file
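This does not work as posted: the comma makes print insert the output field separator (a space by default) before the number. Delete the comma and end the string with a slash, "/report/latex/".
Otherwise, it could be reduced a little further: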
awk -F'[/_]' '$0="http://"$3"/report/latex/"$10+0".pdf"' file
Regards.
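The effect of the comma can be checked directly. A small sketch that feeds one sample line to both variants; the names with_comma, with_concat and sample are hypothetical.

```shell
# with_comma: the quoted command; print joins its two arguments with OFS.
with_comma() {
    awk -F'[/_]' '{print "http://" $3 "/report/latex", $10+0 ".pdf"}'
}

# with_concat: comma removed and a trailing slash added to the string.
with_concat() {
    awk -F'[/_]' '{print "http://" $3 "/report/latex/" $10+0 ".pdf"}'
}

sample='http://xxx.xx.xxx.xx/output/Home/Auto_user_S5-00580-6-Medexome_67_032/plugin_out/FileExporter_out.67/R_2016_09_20_10_12_41_user_S5-00580-6-Medexome.tar.bz2'
printf '%s\n' "$sample" | with_comma    # contains a space before 32.pdf
printf '%s\n' "$sample" | with_concat   # a single, unbroken URL
```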