Hi,
I have a requirement to read a file (5 fields, ~ delimited) and find the records whose 2nd field contains anything other than letters, digits, comma, space and dot, i.e. anything other than a-z, A-Z, 0-9, ".", "," and " ". Once I do that, I want the result to be in the form field1|<flag>
The flag can be Y or N.
N - if the 2nd field doesn't contain anything other than the above-mentioned characters.
Y - otherwise.
I am able to achieve this with the code below, which reads the file line by line. Please note that the second field is "address".
#!/bin/ksh
rm -f ca_sc_flag.txt
while read rec
do
    cust_id=`echo "$rec" | cut -d'~' -f1`
    addr=`echo "$rec" | cut -d'~' -f2`
    # Strip every allowed character; anything left over is a special character
    addr_rem=`echo "${addr}" | tr -d 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789,. '`
    if [ -z "${addr_rem}" ]; then
        sc='N'
    else
        sc='Y'
    fi
    echo "$cust_id|$sc" >> ca_sc_flag.txt
done < ca.txt
The issue is that it is very inefficient and takes almost 30 minutes for 100K records. Can I improve it with better logic, maybe by avoiding reading the file line by line?
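Much of the time in a loop like this likely goes to starting several processes (echo, cut, tr) for every record. One possible direction is to do the whole check in a single awk pass. The following is only a sketch under the assumptions stated in the post ('~' as delimiter, address in field 2); the function name flag_addresses, the file sample_ca.txt and its two records are made up for illustration.

```shell
#!/bin/sh
# Sketch: flag every record in one awk process instead of a per-line shell loop.
# Assumes '~' is the field separator and the address is field 2.
flag_addresses() {
    # $1 = input file; prints "field1|flag" for each record.
    # LC_ALL=C keeps the a-z / A-Z ranges limited to plain ASCII letters.
    LC_ALL=C awk -F'~' '{
        # Y when field 2 has any character outside letters, digits, comma, dot, space
        flag = ($2 ~ /[^a-zA-Z0-9,. ]/) ? "Y" : "N"
        print $1 "|" flag
    }' "$1"
}

# Hypothetical sample data, for illustration only
printf '%s\n' \
    '101~12 Main St, Apt. 4~x~y~z' \
    '102~45 Oak Rd #7~x~y~z' > sample_ca.txt

flag_addresses sample_ca.txt
```

For the real file this would be called as flag_addresses ca.txt > ca_sc_flag.txt. One difference to be aware of: the unquoted echo $rec in the original loop collapses tabs and repeated spaces before the check, while awk sees the field exactly as it is in the file, so an address containing a tab would be flagged Y here.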