I have a file name:
file_name1=RYK11603_PLK5692601_RKYADAV.PDF
I am using the command below to convert this file name to RYK11603_5692601.pdf:
file_name=$(echo ${file_name1}| cut -d"#" -f2| sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/")
but with no success. Can somebody help with this?
I tried your command and it seems to work for your input:
$ file_name1=RYK11603_PLK5692601_RKYADAV.PDF
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2| sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_5692601.pdf
anbu23
March 20, 2013, 7:03am
3
$ echo ${file_name1}| cut -d"#" -f2| sed "s/\([^_]*\)_PLK\([0-9]*\).*PDF/\1_\2.pdf/"
RYK11603_5692601.pdf
panyam
March 20, 2013, 7:06am
4
Something like this:
echo "RYK11603_PLK5692601_RKYADAV.PDF" | sed 's/\([A-Z0-9].*\)_[A-Z]*\([0-9].*\)_.*/\1_\2.pdf/'
Scrutinizer: for me too, the code provided by "yadavricky" does not give the proper output.
I am on GNU/Linux.
$ file_name1=RYK11603_PLK5692601_RKYADAV.PDF
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2| sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_PLK5692601_RKYADAV.PDF
Sorry, I forgot to mention this in my post:
I will be using Solaris and ksh,
and the file name would look like this:
file_name1=pk#RYK11603_PLK5692601_RKYADAV.PDF
When I run the suggested code I get:
ryk11603_5692601.pdf
@panyam, with GNU sed I get:
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2| gsed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_5692601.pdf
Must be down to GNU sed versions then.
--edit---
Never mind, it is of course a range of characters...
What happens when you replace the ! with ^, which is the proper negation operator in a regex bracket expression?
printf "%s\n" "${file_name1}"| cut -d"#" -f2| sed "s/\([^-~]*\)_PLK\([0-9]*\)_\([^-~]*\).PDF/\1_\2.pdf/"
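For reference, here is a minimal sketch of what the two bracket expressions actually match. The sample strings are made up for illustration, and LC_ALL=C is set on the first command so that the range has its plain ASCII meaning:

```shell
# [!-~] is a range: every character from '!' up to '~'.
# In the C locale that is printable ASCII without the space.
range_demo=$(printf '%s\n' 'a b' | LC_ALL=C sed 's/[!-~]/X/g')

# [^-~] is a negated list: any character except '-' and '~'.
negated_demo=$(printf '%s\n' 'a-b~c' | sed 's/[^-~]/X/g')

printf '%s\n' "$range_demo" "$negated_demo"
```

So the two expressions are not equivalent: the first leaves the space alone, the second leaves only the - and ~ characters alone.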
@Scrutinizer Thanks, but it did not work in bash or ksh on Solaris.
It gives output like the following:
ryk11603_PLK5692601_RKYADAV.PDF
---------- Post updated at 06:39 AM ---------- Previous update was at 06:31 AM ----------
OK, sorry, one more condition.
Consider that the file name may also look like this:
ryk11603_PLK5692601_RKYADAV.PDF
What should the approach be now?
---------- Post updated at 06:46 AM ---------- Previous update was at 06:39 AM ----------
Dear Anbu,
what if the file name is as below
ryk11603_PLK5692601_RKYADAV.PDF
and we want
RYK11603_5692601.pdf
Please suggest.
anbu23
March 20, 2013, 7:57am
8
$ file_name1=ryk11603_PLK5692601_RKYADAV.PDF
$ echo "${file_name1}"| cut -d"#" -f2 | tr '[:lower:]' '[:upper:]' | sed "s/\([^_]*\)_PLK\([0-9]*\).*PDF/\1_\2.pdf/"
RYK11603_5692601.pdf
file_name1=RYK11603_PLK5692601_RKYADAV_RHGT_GHTH.PDF
What if we have multiple _ characters, as in the file name above,
and I just need
RYK11603_5692601.pdf
panyam
March 20, 2013, 2:29pm
10
Please post the possible sample inputs and the expected output.
Here is a suggestion:
$ echo $file_name1
RYK11603_PLK5692601_RKYADAV_RHGT_GHTH.PDF
$ echo $file_name1 | sed "s/\(.*\)_PLK\([0-9]*\)_.*/\1_\2.pdf/"
RYK11603_5692601.pdf
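As an alternative to the sed approach, the same result can be sketched with shell parameter expansion alone. This is not from the thread: the function name short_pdf_name is hypothetical, and it assumes the name always contains "_PLK" followed by digits and an underscore. These expansions are POSIX and should work in ksh and bash, though an old Bourne shell may not support them.

```shell
# Hypothetical helper (not from the thread): build the new name with
# parameter expansion only, so no regex bracket ranges are involved.
short_pdf_name() {
    name=${1#*"#"}        # drop an optional "pk#"-style prefix, up to the first '#'
    base=${name%%_PLK*}   # everything before the first "_PLK"
    rest=${name#*_PLK}    # everything after the first "_PLK"
    num=${rest%%_*}       # the digits, up to the next '_'
    printf '%s_%s.pdf\n' "$base" "$num"
}

short_pdf_name 'pk#RYK11603_PLK5692601_RKYADAV_RHGT_GHTH.PDF'
```

It does not change the case of the name, so a lower-case input would still need the tr step shown earlier.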
@yadavricky, I tried your original solution on Solaris 10 and it worked without a hitch.
As an alternative you could try:
printf "%s\n" "${file_name1}"| awk -F'[-~_][A-Z]*' '{sub(/.*#/,x); print $1=$1 "_" $2 tolower ($3)}'
On Solaris use /usr/xpg4/bin/awk
@Scrutinizer thanks for the reply.
Here is one more strange thing:
If we run the command below on HP-UX, it gives the output below, which is what I need.
export var1=myguidelines
export var1_upper=MYGUIDELINES
i=pk#RYK30213_MYGUIDELINES_PAPER.PDF
file1=$(echo ${i}| cut -d"#" -f2| sed "s/\([!-~]*\)_${var1_upper}_\([!-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1
Output: RYK30213_MYGUIDELINES.pdf
Output needed: RYK30213_MYGUIDELINES.pdf
If we run the same command on Linux, the output is not correct.
export var1=myguidelines
export var1_upper=MYGUIDELINES
i=pk#RYK30213_MYGUIDELINES_PAPER.PDF
file1=$(echo ${i}| cut -d"#" -f2| sed "s/\([!-~]*\)_${var1_upper}_\([!-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1
Output: RYK30213_MYGUIDELINES_PAPER.PDF
Output needed: RYK30213_MYGUIDELINES.pdf
It would be good learning if you can find the reason. Also, please help to correct the sed command.
---------- Post updated at 05:45 PM ---------- Previous update was at 05:25 PM ----------
export var1=myguidelines
export var1_upper=MYGUIDELINES
i=pk#RYK30213_MYGUIDELINES_PAPER.PDF
file1=$(echo ${i}| cut -d"#" -f2| sed "s/\([^-~]*\)_${var1_upper}_\([^-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1
output: RYK30213_MYGUIDELINES.pdf
i=pk#RYK30213_MYGUIDELINES_PAPER_RFR_PF.PDF
file1=$(echo ${i}| cut -d"#" -f2| sed "s/\([^-~]*\)_${var1_upper}_\([^-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1
Output: RYK30213_MYGUIDELINES.pdf
The output is correct after I replace the ! with ^.
Do we have any explanation why?
Thanks for all your help and input. The last answer I am looking for is why.
Try:
printf "%s\n" "${i}"| cut -d"#" -f2| sed "s/\([[:punct:][:alnum:]]*\)_${var1_upper}_\([[:punct:][:alnum:]]*\)\.PDF/\1_${var1_upper}.pdf/"
It finally dawned on me that [!-~] is of course the range of ASCII printable characters between ! and ~, and that this will probably give different results depending on locale settings.
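If the locale really is the cause, which this thread does not confirm, one way to test the idea is to force the C locale for the sed call so that the range is interpreted in plain ASCII order. This is a sketch under that assumption, using the sample name from earlier in the thread, with the dot before PDF escaped:

```shell
file_name1='pk#RYK11603_PLK5692601_RKYADAV.PDF'

# LC_ALL=C applies to this sed invocation only; in the C locale
# [!-~] covers the ASCII characters from '!' to '~'.
new_name=$(printf '%s\n' "$file_name1" | cut -d'#' -f2 |
    LC_ALL=C sed 's/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\)\.PDF/\1_\2.pdf/')

printf '%s\n' "$new_name"
```

Comparing this against the same command without LC_ALL=C on the Linux box would show whether the locale makes the difference.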