URL encoding

Hi All,

I want to do URL encoding in a shell script for my project. I decided that sed is the right tool for this, but I have been unable to achieve what I want with it. Kindly help me sort this out.

My requirement: there will be one URL containing all kinds of special characters, spaces, etc.

For example:

https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality

I want to do URL encoding only after the "?" mark. The final result should be:

https://www.xxxxxx.com/change&$ ^this to?<encoded 2nd part>

Thanks in advance

Regards
Vichu

There is nothing after the ? which requires encoding except the percent sign. Perhaps you should pick a more detailed example.

It's not going to be very elegant to do this in sed because it requires a loop, and loops are kind of tricky in sed. Basically, stash away the part you don't want to encode, loop over the remaining part, moving away everything you have already encoded by appending it to the stash.
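For what it's worth, the "stash away the part you don't want to encode" idea can be sketched in sed with the hold space. Note that this is a simplification, not the loop described above: it swaps the loop for a fixed (and deliberately short) list of substitutions. The function name encode_query is made up for the illustration.

```shell
#!/bin/sh
# encode_query is a hypothetical helper name, not something from this thread.
# It reads lines on stdin and percent-encodes only the text after the first "?".
encode_query() {
    sed '
# Lines without a "?" are printed unchanged.
/?/!b
# Keep a copy of the whole line in the hold space.
h
# Pattern space: only the tail after the first "?".
s/^[^?]*?//
# Encode "%" first so that later "%xx" output is not encoded again.
s/%/%25/g
s/ /%20/g
s/&/%26/g
s/+/%2b/g
s/@/%40/g
# Swap: pattern space = original line, hold space = encoded tail.
x
# Cut the original line back to the head plus "?".
s/?.*/?/
# Append the encoded tail (G inserts a newline), then join the two parts.
G
s/\n//
'
}

printf '%s\n' 'https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality' | encode_query
```

The character list here is only a sample; extend it with whatever substitutions you need, keeping the "%" rule first.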

Maybe something like this, instead?

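perl -ple 'next unless s/\?(.*)//;   # cut off the tail at the first "?"
my $tail = $1;
$tail =~ s/([%? +&!<>()])/sprintf "%%%02x", ord($1)/ge;   # percent-encode the listed characters
$_ .= "?$tail"'   # put the "?" and the encoded tail back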

The list of characters which require or might benefit from escaping is quite probably not complete. This assumes you have nothing after the URL which is not part of the URL, and that the first question mark separates the tail which requires encoding from the base URL.
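If an incomplete list is a worry, one way around it is to turn the logic around: keep only the characters that RFC 3986 calls "unreserved" (letters, digits, "-", ".", "_", "~") and percent-encode everything else. Below is a rough sketch in plain shell, assuming ASCII input only; the function name urlencode and its working variables are made up for the example and are not local to the function.

```shell
#!/bin/sh
# urlencode is a hypothetical helper; it assumes single-byte (ASCII) input.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;                      # unreserved: keep as is
            *) out=$out$(printf '%%%02x' "'$c") ;;              # anything else: %xx
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

urlencode '%checkthe@-functionality'
```

This walks the string one character at a time, so it is slow on long input, but it needs no table of substitutions.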

I can separate the URL into two parts (with ? as the delimiter: the first part is before the ?, the second part is after it) and encode the second part using awk. But I don't want to split the line. I just want to skip the first part of the URL and encode the second part.

Did you try the code I posted?

Hi era,

Thanks for your promptness, but I don't want to do it in Perl. My requirement is to do it in shell.

Use 2 files:
urlencode.sed

s/%/%25/g
s/ /%20/g
# \t is understood by GNU sed; with other seds, type a literal tab instead
s/\t/%09/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/'/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g

urlencode.sh

#!/bin/ksh

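# -f1 is the part before the first "?"; -f2- keeps everything after it, even if it contains more "?"
STR1=$(echo "https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality" | cut -d\? -f1)
STR2=$(echo "https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality" | cut -d\? -f2-)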

OUT2=$(echo "$STR2" | sed -f urlencode.sed)

echo "$STR1?$OUT2"

Result:

./urlencode.sh
https://www.xxxxxx.com/change&$ ^this to?%25checkthe%40%2dfunctionality
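A note on why s/%/%25/g has to be the first rule in the sed file: every other rule emits a "%", so if the "%" rule ran later it would encode those again. A small demonstration, using only two of the rules:

```shell
# Wrong order: the "%" produced by the space rule gets encoded a second time.
printf '%s\n' 'a b' | sed -e 's/ /%20/g' -e 's/%/%25/g'    # prints a%2520b

# Right order: "%" is handled before any rule that emits "%".
printf '%s\n' 'a b' | sed -e 's/%/%25/g' -e 's/ /%20/g'    # prints a%20b
```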

Does that work for you?

I proposed the same procedure as the one you have given here, using awk, but my team did not accept it. They don't want to use extra variables. They would like to do the encoding (of the second part) on the same variable.

My proposed steps

str1=`echo "https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality" | awk -F'?' '{print $1}'`

str2=`echo "https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality" | awk -F'?' '{print $2}'`

str3=`echo "$str2" | sed -f ./seq.sed`

echo "$str1?$str3"

Do you have any idea?
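One possibility, if the goal is to avoid the extra str1/str2/str3 variables, is to let the shell's own parameter expansion pick out the two halves and assign the result back to the same variable. This is only a sketch: the variable name url is an assumption, and the sed rules are a short sample standing in for the full list in seq.sed.

```shell
#!/bin/sh
url='https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality'

case $url in
*\?*)
    # ${url%%\?*} is the head before the first "?", ${url#*\?} is the tail after it.
    # The tail is piped through sed and the result is stored back into url.
    url="${url%%\?*}?$(printf '%s\n' "${url#*\?}" |
        sed -e 's/%/%25/g' -e 's/ /%20/g' -e 's/&/%26/g' -e 's/@/%40/g')"
    ;;
esac

printf '%s\n' "$url"
```

The case guard leaves URLs without a "?" untouched. To use the complete rule set, the inline -e expressions could be replaced by -f ./seq.sed.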

OK, in one file :

#!/bin/ksh

ORIG="https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality"
STR1=$(printf '%s\n' "$ORIG" | cut -d\? -f1)     # part before the first "?"
STR2=$(printf '%s\n' "$ORIG" | cut -d\? -f2-)    # everything after the first "?"

OUT=$STR2

# sed commands separated by "=" (so "=" itself cannot be encoded this way).
# Single quotes keep the shell away from the backslashes; the "%" rule must come first.
TAB=$(printf '\t')
FORMULA='s/%/%25/g=s/ /%20/g=s/!/%21/g=s/"/%22/g=s/#/%23/g=s/\$/%24/g=s/&/%26/g'
FORMULA="${FORMULA}=s/${TAB}/%09/g=s/'/%27/g"
FORMULA="${FORMULA}"'=s/(/%28/g=s/)/%29/g=s/\*/%2a/g=s/+/%2b/g=s/,/%2c/g=s/-/%2d/g'
FORMULA="${FORMULA}"'=s/\./%2e/g=s/\//%2f/g=s/:/%3a/g=s/;/%3b/g=s/</%3c/g=s/>/%3e/g=s/?/%3f/g=s/@/%40/g'
FORMULA="${FORMULA}"'=s/\[/%5b/g=s/\\/%5c/g=s/\]/%5d/g=s/\^/%5e/g=s/_/%5f/g=s/`/%60/g=s/{/%7b/g'
FORMULA="${FORMULA}"'=s/|/%7c/g=s/}/%7d/g=s/~/%7e/g'

i=1

while :
do
        # Pick the i-th sed command; an empty result means the list is exhausted.
        z=$(printf '%s\n' "$FORMULA" | cut -d= -f$i)

        if [ -n "$z" ]
        then
                OUT=$(printf '%s\n' "$OUT" | sed -e "$z")
        else
                break
        fi
        i=$((i+1))
done

echo "Original: $ORIG"
echo "URL encoded after ?: $STR1?$OUT"

Result

./urlencode.sh
Original: https://www.xxxxxx.com/change&$ ^this to?%checkthe@-functionality
URL encoded after ?: https://www.xxxxxx.com/change&$ ^this to?%25checkthe%40%2dfunctionality


To go the other way and decode (echo -e as in bash):

echo -e "$(echo %23 | sed -e 's/%\([0-9a-fA-F]\{2\}\)/\\x\1/g')"