I have a file called clientname_filename.csv
whose contents are like
col1|col2|col3|col4|
510|abc|xxx|450|
510|abc11|yyy|350
510|pqr99|zzz| 670
512|222|439|110
Here i have check the contents of column for data type.
i have a constraints that col1 always contain Numeric value column 2 can contain char type or alphanumeric ,column 3 should always char type,column 4 always numeric value
here i have a constarinfile called clientname_filename_constraint.csv
contents of this file is nothing but the datatype each column contain
col1|col2|col3|col4
Nu|char|char|Nu
Based on this file i want to check the main file and verify the data type of the column if any column violate the constraints it should show error.
here col 1 is right as it contain only numeric,col2 is invalid as it contain a numeric at the end, column 3 is also invalid for same reason. Column is right as it contain numeric only .
How will i check whether a column contain only numeric value or char value ??
please assist !!
try a shell script like this - you need to order the isnumeric and isalpha tests
per your requirements, I have them as col1 + col4 numeric col2 + col4 alpha:
#!/bin/ksh
isnumeric()
{
result=$(echo "$1" | tr -d '[[:digit:]]')
echo ${#result}
}
isalpha()
{
result=$(echo "$1" | tr -d '[[:alpha:]]')
echo ${#result}
}
col1=""
col2=""
col3=""
col4=""
let retval=1
while read record
do
echo "$record" | awk -F"|" '{print $1, $2, $3, $4 }' | read col1 col2 col3 col4
if [[ $(isnumeric "$col1") -eq 1 && $(isnumeric "$col4") -eq 1 ]]; then
retval=1
else
retval=0
break
fi
if [[ $(isalpha "$col2") -eq 1 && $(isalpha "$col3") -eq 1 ]]; then
retval=1
else
retval=0
break
fi
done < filename
if [[ $retval -eq 1 ]]; then
echo "test is okay"
else
echo "test failed for this row:"
echo "$col1 $col2 $col3 $col4"
fi
Anbu & Jim thanks a lot for the clue
Can i make this $1 or $2 as variable as for some file $1 can be numeric but for other file the first column constrint could be character
so depend upon the constrint file i want to check the main file.
>>> for lines in open("clientname_filename.csv"):
... lines = lines.strip().split("|")
... if not lines[0].isdigit() or not lines[3].isdigit():
... print "First or fourth col not numeric: %s" %(lines)
... elif not lines[1].isalnum():
... print "Second col not alphanumeric: %s" %(lines)
... elif not lines[2].isalpha():
... print "Third col not char type: %s" %(lines)
this line work fine for all the test value like
23232@@
ww232323
2322%.%
3434*
-----
but why 4343$$44 is showing as numeric value ???
also 343$34
show as numeric and showing 3434
This expression is working for all other but not when the value contain $
can any one help ??
also i tried with echo $var | sed -n '/^[0-9][0-9]*$/p' same thing happened
any thing consist of 0-9 should be considered as numeric , this sed expression working fine except for the value 445$$343 or 23$899 some thing like this which contain $ , it is considering that as numeric .
"^[0-9][0-9]*$" with regular expression to check for numeric is failed as it is accepting 4545$22 as numeric aslo 34343&232 is also as numeric
i want to accept 343.343 as numeric and 343343 as numeric other then that are non numeric.
I mean only decimal place and pure numeric form are considerd as numeric .
any special char inside is non numeric