File name lower to upper in Shell

yadavricky · March 20, 2013, 6:56am

I have a file

file_name1=RYK11603_PLK5692601_RKYADAV.PDF

I am using the command below to convert this file name to RYK11603_5692601.pdf

file_name=$(echo ${file_name1}| cut -d"#" -f2|  sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/")

but without success. Can somebody help with this?

Scrutinizer · March 20, 2013, 7:02am

I tried your command and it seems to work for your input:

$ file_name1=RYK11603_PLK5692601_RKYADAV.PDF
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2|  sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_5692601.pdf

anbu23 · March 20, 2013, 7:03am

$ echo ${file_name1}| cut -d"#" -f2| sed "s/\([^_]*\)_PLK\([0-9]*\).*PDF/\1_\2.pdf/"
RYK11603_5692601.pdf

panyam · March 20, 2013, 7:06am

Something like this:

echo "RYK11603_PLK5692601_RKYADAV.PDF" | sed 's/\([A-Z0-9].*\)_[A-Z]*\([0-9].*\)_.*/\1_\2.pdf/'

Scrutinizer: Even for me, the code provided by "yadavricky" does not give the proper output.

I am on GNU/Linux.

$ file_name1=RYK11603_PLK5692601_RKYADAV.PDF
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2|  sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_PLK5692601_RKYADAV.PDF

yadavricky · March 20, 2013, 7:20am

panyam:

Something like this:
echo "RYK11603_PLK5692601_RKYADAV.PDF" | sed 's/\([A-Z0-9].*\)_[A-Z]*\([0-9].*\)_.*/\1_\2.pdf/'
Scrutinizer: Even for me, the code provided by "yadavricky" does not give the proper output.

I am on GNU/Linux.
$ file_name1=RYK11603_PLK5692601_RKYADAV.PDF
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2|  sed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_PLK5692601_RKYADAV.PDF

Sorry, I forgot to mention this in my post:
I will be using Solaris with ksh,

and the file name would look like
file_name1=pk#RYK11603_PLK5692601_RKYADAV.PDF

When I run the suggested code I get
ryk11603_5692601.pdf

Scrutinizer · March 20, 2013, 7:22am

@panyam, with GNU sed, I get:

$ printf "%s\n" "${file_name1}"| cut -d"#" -f2|  gsed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_5692601.pdf

It must be down to GNU sed versions, then.

--edit---
Never mind, it is of course a range of characters...

What happens when you replace the ! with ^, which is the proper negation operator in a regex bracket expression?

printf "%s\n" "${file_name1}"| cut -d"#" -f2|  sed "s/\([^-~]*\)_PLK\([0-9]*\)_\([^-~]*\).PDF/\1_\2.pdf/"
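
To see how the two bracket expressions differ, here is a minimal sketch. The sample strings are made up for illustration, and LC_ALL=C is an assumption added so that the range follows plain ASCII order:

```shell
# [^-~] is a negated set: it matches any character except '-' and '~'.
negated=$(printf '%s\n' 'a-b~c' | LC_ALL=C sed 's/[^-~]/X/g')
printf '%s\n' "$negated"    # X-X~X

# [!-~] is a range: every character from '!' up to '~'.
# In ASCII order that covers the printable characters but not the space.
ranged=$(printf '%s\n' 'a b' | LC_ALL=C sed 's/[!-~]/X/g')
printf '%s\n' "$ranged"     # X X
```

So both expressions happen to match the letters and digits in the file name, but for different reasons.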

yadavricky · March 20, 2013, 7:46am

scrutinizer:

@panyam, with GNU sed, I get:
$ printf "%s\n" "${file_name1}"| cut -d"#" -f2|  gsed "s/\([!-~]*\)_PLK\([0-9]*\)_\([!-~]*\).PDF/\1_\2.pdf/"
RYK11603_5692601.pdf
It must be down to GNU sed versions, then.

What happens when you replace the ! with ^ ?
printf "%s\n" "${file_name1}"| cut -d"#" -f2|  sed "s/\([^-~]*\)_PLK\([0-9]*\)_\([^-~]*\).PDF/\1_\2.pdf/"

@Scrutinizer Thanks, but it did not work in bash or ksh on Solaris.
It gives output like the following:
ryk11603_PLK5692601_RKYADAV.PDF

---------- Post updated at 06:39 AM ---------- Previous update was at 06:31 AM ----------

OK, sorry, there is another condition.
Consider that the file name may also look like this:

ryk11603_PLK5692601_RKYADAV.PDF

What should the approach be now?

---------- Post updated at 06:46 AM ---------- Previous update was at 06:39 AM ----------

Dear Anbu,
what if the file name is as below

ryk11603_PLK5692601_RKYADAV.PDF

and we want
RYK11603_5692601.pdf

Please suggest.

anbu23 · March 20, 2013, 7:57am

$ file_name1=ryk11603_PLK5692601_RKYADAV.PDF
$ echo ${file_name1}| cut -d"#" -f2 | tr '[:lower:]' '[:upper:]' | sed "s/\([^_]*\)_PLK\([0-9]*\).*PDF/\1_\2.pdf/"
RYK11603_5692601.pdf
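
As an alternative that avoids regular expressions for everything except the case conversion, here is a minimal sketch using shell parameter expansion, which ksh and bash should both support. The intermediate variable names (name, prefix, rest, number) are made up for illustration, and the sketch assumes the name always contains _PLK followed by digits and an underscore:

```shell
file_name1='pk#ryk11603_PLK5692601_RKYADAV.PDF'

name=${file_name1##*\#}    # drop everything up to the last '#'
prefix=${name%%_*}         # text before the first '_'  -> ryk11603
rest=${name#*_PLK}         # text after '_PLK'          -> 5692601_RKYADAV.PDF
number=${rest%%_*}         # digits before the next '_' -> 5692601

# Upper-case only the prefix; the extension is written in lower case directly.
prefix=$(printf '%s\n' "$prefix" | tr '[:lower:]' '[:upper:]')
file_name="${prefix}_${number}.pdf"
printf '%s\n' "$file_name"
```

Because no character range is involved, this variant does not depend on how the locale orders characters.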

yadavricky · March 20, 2013, 1:12pm

file_name1=RYK11603_PLK5692601_RKYADAV_RHGT_GHTH.PDF
What if we have multiple underscores, as in the file name above,
and I just need
RYK11603_5692601.pdf

panyam · March 20, 2013, 2:29pm

Please post the possible sample inputs and the expected output.

hanson44 · March 20, 2013, 2:41pm

Here is a suggestion:

$ echo $file_name1
RYK11603_PLK5692601_RKYADAV_RHGT_GHTH.PDF
$ echo $file_name1 | sed "s/\(.*\)_PLK\([0-9]*\)_.*/\1_\2.pdf/"
RYK11603_5692601.pdf
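
To check one command against all of the inputs mentioned so far, a small wrapper can help. This is only a sketch: the function name normalize_name is hypothetical, and it assumes every name contains _PLK followed by digits and ends in .PDF:

```shell
# normalize_name: strip an optional "xx#" prefix, upper-case the name,
# then keep only the first field and the digits that follow _PLK.
normalize_name() {
    printf '%s\n' "$1" |
        sed 's/^.*#//' |
        tr '[:lower:]' '[:upper:]' |
        sed 's/^\([^_]*\)_PLK\([0-9]*\)_.*\.PDF$/\1_\2.pdf/'
}

# Sample inputs taken from the posts above.
for f in 'RYK11603_PLK5692601_RKYADAV.PDF' \
         'pk#ryk11603_PLK5692601_RKYADAV.PDF' \
         'RYK11603_PLK5692601_RKYADAV_RHGT_GHTH.PDF'
do
    normalize_name "$f"
done
```

Each of the three inputs should come out as RYK11603_5692601.pdf. A name that does not fit the assumed pattern passes through upper-cased but otherwise unchanged.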

Scrutinizer · March 20, 2013, 3:10pm

@yadavricky, I tried your original solution on Solaris 10 and it worked without a hitch.
As an alternative you could try:

printf "%s\n" "${file_name1}"| awk -F'[-~_][A-Z]*' '{sub(/.*#/,x); print $1=$1 "_" $2 tolower ($3)}'

On Solaris use /usr/xpg4/bin/awk

yadavricky · March 20, 2013, 6:45pm

scrutinizer:

@yadavricky, I tried your original solution on Solaris 10 and it worked without a hitch.
As an alternative you could try:
printf "%s\n" "${file_name1}"| nawk -F'[-~_][A-Z]*' '{sub(/.*#/,x); print $1=$1 "_" $2 tolower ($3)}' 

@Scrutinizer, thanks for the reply.
Here is one more strange thing.

If we run the command below on HP-UX, it gives the output I need:

export var1=myguidelines
export var1_upper=MYGUIDELINES
i=pk#RYK30213_MYGUIDELINES_PAPER.PDF
file1=$(echo ${i}| cut -d"#" -f2|  sed "s/\([!-~]*\)_${var1_upper}_\([!-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1

Output: RYK30213_MYGUIDELINES.pdf
Output needed: RYK30213_MYGUIDELINES.pdf

If we run the same command on Linux, the output is not correct.

export var1=myguidelines
export var1_upper=MYGUIDELINES
i=pk#RYK30213_MYGUIDELINES_PAPER.PDF
file1=$(echo ${i}| cut -d"#" -f2|  sed "s/\([!-~]*\)_${var1_upper}_\([!-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1

Output: RYK30213_MYGUIDELINES_PAPER.PDF
Output needed: RYK30213_MYGUIDELINES.pdf

It would be a good learning experience if you could find the reason. Also, please help me correct the sed command.

---------- Post updated at 05:45 PM ---------- Previous update was at 05:25 PM ----------

export var1=myguidelines
export var1_upper=MYGUIDELINES
i=pk#RYK30213_MYGUIDELINES_PAPER.PDF
file1=$(echo ${i}| cut -d"#" -f2| sed "s/\([^-~]*\)_${var1_upper}_\([^-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1
output: RYK30213_MYGUIDELINES.pdf

i=pk#RYK30213_MYGUIDELINES_PAPER_RFR_PF.PDF
file1=$(echo ${i}| cut -d"#" -f2| sed "s/\([^-~]*\)_${var1_upper}_\([^-~]*\).PDF/\1_${var1_upper}.pdf/")
echo $file1
output: RYK30213_MYGUIDELINES.pdf

The output is correct after I replace the ! with ^.

Do we have any explanation why?

Thanks for all your help and input. The last answer I am looking for is why.

Scrutinizer · March 20, 2013, 6:55pm

Try:

printf "%s\n" "${i}"| cut -d"#" -f2| sed "s/\([[:punct:][:alnum:]]*\)_${var1_upper}_\([[:punct:][:alnum:]]*\)\.PDF/\1_${var1_upper}.pdf/"

It finally dawned on me that [!-~] is of course the range of ASCII printable characters between ! and ~, and that this will probably give different results depending on locale settings.
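
If the original [!-~] range should be kept, one way to sidestep the locale dependence is to run sed in the C locale, where ranges follow ASCII order. A minimal sketch, assuming LC_ALL=C is acceptable for this pipeline (the variable names follow the earlier posts):

```shell
var1_upper=MYGUIDELINES
i='pk#RYK30213_MYGUIDELINES_PAPER.PDF'

# LC_ALL=C applies to this sed invocation only, so the rest of the
# script keeps whatever locale it was started with.
file1=$(printf '%s\n' "$i" | cut -d'#' -f2 |
    LC_ALL=C sed "s/\([!-~]*\)_${var1_upper}_\([!-~]*\)\.PDF/\1_${var1_upper}.pdf/")
printf '%s\n' "$file1"
```

With the range pinned to ASCII order this should print RYK30213_MYGUIDELINES.pdf regardless of the locale the shell is running in.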