How to check whether a column in a file holds a numeric value?

Hi,

I want to know how to find out whether a column holds a numeric value or not.

For example, if we have a CSV file such as

ASDF,QWER,GHJK,123,FGHY,9876
GHTY,NVHR,WOPI,623,HFBS,5386

We need to find out whether the 4th and 6th columns hold numeric values or not.

Thanks in advance

Keerthan

This will print only the records with numeric fourth and sixth fields:

perl -MScalar::Util=looks_like_number -F, -lane '
    print if looks_like_number($F[3])
      and looks_like_number($F[5])
    ' infile

If you want all of them:

perl -MScalar::Util=looks_like_number -F, -lane '
    print $_, " -> ", (looks_like_number($F[3])
      and looks_like_number($F[5])) ? "valid" : "invalid"
    ' infile

#!/bin/bash
# ksh93 works too; dash does not, because the script uses arrays and ${var//pattern/}
cat <<EOF > "$0.txt"
ASDF,QWER,GHJK,123,FGHY,9876
GHTY,NVHR,WOPI,623,HFBS,5386
GHTY,NVHR,WOPI,A623,HFBS,5386
EOF

num()
{
   val="$1"
   # an empty value is not a number
   [ -n "$val" ] || return 1
   # delete every digit; whatever remains is a non-digit character
   otherchars="${val//[0-9]/}"
   [ "$otherchars" != "" ] && return 1 # not num
   return 0 # it's num
}
############################

oifs="$IFS"
cat "$0.txt" | while read -r line
do
        IFS=","
        flds=($line)
        IFS="$oifs"
        f4=${flds[3]}
        f6=${flds[5]}
        num "$f4" || echo "not num $f4"
        num "$f6" || echo "not num $f6"
done
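
The script above relies on arrays and on ${val//[0-9]/}, which are ksh93/bash features that dash does not provide. As a rough POSIX-only sketch of the same idea, a case pattern can do the digit test. The names is_num and check_file are made up for this illustration, and treating an empty field as "not a number" is an assumption about what is wanted.

```shell
# is_num: hypothetical helper; succeeds only for a non-empty string of digits.
is_num() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains at least one non-digit
        *)           return 0 ;;   # digits only
    esac
}

# check_file: hypothetical helper; reports non-numeric 4th/6th fields
# of the comma-separated file named in $1.
check_file() {
    while IFS=, read -r f1 f2 f3 f4 f5 f6 rest; do
        is_num "$f4" || echo "not num $f4"
        is_num "$f6" || echo "not num $f6"
    done < "$1"
}
```

Calling check_file with the name of the sample file should report only the A623 field. Splitting with IFS=, on the read itself avoids having to save and restore IFS inside the loop.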

What's the keyword to find out whether it's a string instead of a number?

Another approach:

awk -F, 'match($4,"[a-zA-Z]")||match($6,"[a-zA-Z]"){print $0 " -> invalid";next}1' file
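
Note that the match() test only looks for letters, so a field such as 12-3, or an empty field, would still pass as valid. A stricter sketch, assuming "numeric" means one or more digits and nothing else, labels every line. The wrapper name label_lines and the sample data are invented for this example.

```shell
# label_lines: hypothetical wrapper; reads CSV on stdin and appends a verdict.
label_lines() {
    awk -F, '{
        # valid only if fields 4 and 6 consist entirely of digits
        ok = ($4 ~ /^[0-9]+$/ && $6 ~ /^[0-9]+$/)
        print $0 " -> " (ok ? "valid" : "invalid")
    }'
}

printf '%s\n' 'ASDF,QWER,GHJK,123,FGHY,9876' 'GHTY,NVHR,WOPI,12-3,HFBS,5386' | label_lines
```

Here the first line is marked valid and the second invalid, because of the hyphen in its fourth field.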

Try this, which will show only the lines whose 4th and 6th columns are whole numbers:

awk -F, 'int($4)==$4 && int($6)==$6' file
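
The int() comparison accepts any integer-valued field, not only plain digit strings, so a signed value passes while a decimal or a value with a trailing letter does not. A small sketch to see this; int_fields is a hypothetical wrapper name and the sample lines are invented.

```shell
# int_fields: hypothetical wrapper around the one-liner above; reads CSV on stdin.
int_fields() {
    awk -F, 'int($4)==$4 && int($6)==$6'
}

# -7 is an integer and passes; 1.5 and 23A are filtered out.
printf '%s\n' 'a,b,c,-7,e,10' 'a,b,c,1.5,e,10' 'a,b,c,23A,e,10' | int_fields
```

If negative numbers should be rejected as well, a digits-only regular expression is probably the safer test.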

And yet another Perl script -

$ cat f0
ASDF,QWER,GHJK,123,FGHY,9876
GHTY,NVHR,WOPI,623,HFBS,5386
ABCD,EFGH,IJKL,23A,MNOP,9999
PQRS,TUVW,XYZA,999,BCDE,489X
$ perl -F, -lane 'print if $F[3]=~/^\d+$/ && $F[5]=~/^\d+$/' f0
ASDF,QWER,GHJK,123,FGHY,9876
GHTY,NVHR,WOPI,623,HFBS,5386

tyler_durden

sed -n '/.[^,]*,.[^,]*,.[^,]*,[0-9]\+,.[^,]*,[0-9]\+/p' file

Hi Kurumi,

What's the significance of using a . (dot) here?

sed -n '/.[^,]*,.[^,]*,.[^,]*,[0-9]\+,.[^,]*,[0-9]\+/p' file

Thanks in advance

Keerthan

kurumi obviously didn't want to count empty columns as columns and therefore made sure there is at least one character in each of them. The regexp will match "a,b,c,..." but not "a,,,b,...". This is what the full stop is for.
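
A quick way to see this is to feed the expression one line with every field filled and one with empty fields. In the sketch below, dot_fields is kurumi's expression with the POSIX \{1,\} standing in for the GNU-only \+. The second function, anchored_fields, is a hypothetical stricter variant and not part of the original post: it uses [^,] instead of the dot and anchors the match with ^ and $, on the assumption that the line has exactly six fields.

```shell
# dot_fields: the expression from the thread, with \{1,\} replacing \+ for portability.
dot_fields() {
    sed -n '/.[^,]*,.[^,]*,.[^,]*,[0-9]\{1,\},.[^,]*,[0-9]\{1,\}/p'
}

# anchored_fields: hypothetical variant; the whole line must be six non-empty
# fields, with fields 4 and 6 made of digits only.
anchored_fields() {
    sed -n '/^[^,]\{1,\},[^,]\{1,\},[^,]\{1,\},[0-9]\{1,\},[^,]\{1,\},[0-9]\{1,\}$/p'
}

printf '%s\n' 'a,b,c,12,e,34' 'a,,,12,e,34' 'a,b,c,12,e,34X' | dot_fields
```

The line with empty fields is dropped, as described. The unanchored expression still prints the 34X line, because the leading digits of the last field are enough for a match; anchored_fields rejects it.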

bakunin