Cleaning AWK code

Hi

I need some help cleaning up the code I use to get my city location.

wget -q -O - http://www.ip2location.com/ | grep chkRegionCity | awk 'END { print }' | awk -F"[,<]" '{print $4}'

It gives me the city, but with a leading space.
I am sure this could all be done by one single AWK command.

Also, if possible, I would like to change " OSLO" to "Oslo".

Example output from

wget -q -O - http://www.ip2location.com/ | grep chkRegionCity

Result

                                <td><input type="checkbox" name="region" id="chkRegionCity" onchange="pickProduct();"></td>
                                <td><label for="chkRegionCity">Region & City</label></td>
                                <td><label for="chkRegionCity">SOR-TRONDELAG, TRONDHEIM</label></td>

Here I would like to get

Trondheim

Try this awk (I don't like it too much; too much fumbling and fiddling and cheating):

awk -F"[<>,     ]*" '/label.*RegionCity.*, / {print substr($5,1,1)tolower(substr($5,2))}'

Pipe it from your wget cmd.

I used RudiC's last code to change the case :)

wget -q -O - http://www.ip2location.com/ | awk -F "[>,<]" '/chkRegionCity/{gsub(" ","",$6);s=$6}END{print substr(s,1,1)tolower(substr(s,2))}'

@RudiC

awk -F"[<>,     ]*" '/label.*RegionCity.*, / {print substr($5,1,1)tolower(substr($5,2))}'

Does not work correctly, since the region may contain spaces.
For example:

<td><label for="chkRegionCity">MORE OG ROMSDAL, KRISTIANSUND</label></td>

@pamu

wget -q -O - http://www.ip2location.com/ | awk -F "[>,<]" '/chkRegionCity/{gsub(" ","",$6);s=$6}END{print substr(s,1,1)tolower(substr(s,2))}'

Works fine

OK, remove the space & tab from the FS definition and use $6 instead of $5:

awk -F"[<>,]" '/label.*RegionCity.*, / {print substr($6,2,1)tolower(substr($6,3))}' file
Kristiansund

Now it works :)
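
For anyone who wants to test this without hitting the site, here is a minimal sketch that wraps the same idea in a shell function and feeds it the sample lines from this thread. The function name city_from_page is made up for illustration. It assumes the city is always the sixth field of the last matching label line, as in the examples above. Instead of skipping one character with substr, it trims blanks from both ends of the field, so interior spaces in a city name are kept. Only the first letter is capitalized.

```shell
# Hypothetical helper: reads the page HTML on stdin, prints the city.
city_from_page() {
  awk -F'[<>,]' '
    # Only the label line that holds "REGION, CITY" contains ", ".
    /label.*chkRegionCity.*, / {
      city = $6
      sub(/^[ \t]+/, "", city)   # drop the blank that follows the comma
      sub(/[ \t]+$/, "", city)   # drop any trailing blanks
    }
    END {
      # First letter upper case, the rest lower case.
      if (city != "")
        print toupper(substr(city, 1, 1)) tolower(substr(city, 2))
    }'
}

# Sample input copied from the posts above.
printf '%s\n' \
  '<td><label for="chkRegionCity">Region & City</label></td>' \
  '<td><label for="chkRegionCity">MORE OG ROMSDAL, KRISTIANSUND</label></td>' |
  city_from_page
```

With the sample input this prints Kristiansund. Against the live page it would presumably be used as wget -q -O - http://www.ip2location.com/ | city_from_page, provided the page still has the layout shown in this thread.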

I am not able to understand the field separator here :(

Can someone please explain what this means?

"[<>,]"

It means use any of <, > or , (comma) as a field separator.

# echo "a<b>c,d" | awk -F"[<>,]" '{print $4, $3, $2, $1}'
d c b a

Field separators can be defined in several ways.
"[<>,]" - Here three field separators are present: <, > and , (comma).
Normally, multiple field separators are listed inside [] brackets, each one a single character, as above.

You can also define strings of multiple characters as field separators, like below:

">>|<<|AB" - Here the three field separators are >>, << and AB.
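
A quick way to check the multi-character case is a small sketch like the one below. The function name split_demo and the input string are made up for illustration. When the -F value is longer than one character, awk treats it as a regular expression, so the | acts as "or".

```shell
# Hypothetical demo: split on the strings >>, << or AB.
split_demo() {
  awk -F'>>|<<|AB' '{ print NF ": " $1, $2, $3, $4 }'
}

echo 'one>>two<<threeABfour' | split_demo
```

This prints 4: one two three four, that is, four fields separated by three different multi-character separators.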