Shell script for validating fields in a file

Hi,

I have not used Unix in a very long time and I am very rusty. I would appreciate any help I can get from the more experienced shell scripters here.

I am reading one file at a time from a folder. Each file is a flat file with no delimiters or carriage returns. Col1 through col6 are the header section and should be read once. Col7 through col14 are the body, which can have one or more records. I read the file and check that none of these fields are blank. If there is a blank field, I would like to write a message to a log file that can be viewed later. I haven't gotten to the log file yet. Just reading the file with an inner loop doesn't seem to work. Please help me.

 
#!/bin/ksh
 
#This script was created on 01/27/2010 to check outbound
# files for missing required fields
###########################################################
 
 
for file in /folder/file.1$.5$
 
do
 
i=1
 
exec< file
 
while read line
do
 
col1=`echo $line | cut -c1-4`
col2= `echo $line | cut -c5-5`
col3= `echo $line | cut -c6-7`
col4= `echo $line | cut -c8-16`
col5= `echo $line | cut -c17-23`
col6= `echo $line | cut -c51-56`
 
if [ -z "$col1" ]
then
echo "Line No. $i -- No String in position 1-4 "
else
echo "Line No. $i -- String in position 1-4 : $col1"
fi
 
if [ -z "$col2" ]
then
echo "Line No. $i -- No String in position 5 "
else
echo "Line No. $i -- String in position 5 : col2"
fi
 
if [ -z "$col3" ]
then
echo "Line No. $i -- No String in position 6-7 "
else
echo "Line No. $i -- String in position 6-7 : $col3"
fi
 
if [ -z "$col4" ]
then
echo "Line No. $i -- No String in position 8-16 "
else
echo "Line No. $i -- String in position 8-16 : $col4"
fi
 
if [ -z "$col5" ]
then
echo "Line No. $i -- No String in position 17-23 "
else
echo "Line No. $i -- String in position 17-23: $col5"
fi
 
if [ -z "$col6" ]
then
echo "Line No. $i -- No String in position 51-56 "
else
echo "Line No. $i -- String in position 51-56 : $col6"
fi
 
i=`expr $i + 1`
 
exec< file
 
 While read line
    do
 
        col7=`echo $line | cut -c66-72`
        col8= `echo $line | cut -c73-74`
        col9= `echo $line | cut -c75-83`
        col10= `echo $line | cut -c84-85`
        col11= `echo $line | cut -c86-88`
        col12= `echo $line | cut -c89-96`
        col13= `echo $line | cut -c97-100`
        col14= `echo $line | cut -c108-115`
 
 
        if [ -z "$col7"] or [ -z "$col8"] or [ -z "$col9"] or [ -z "$col10"]
          or[ -z "$col11" ] or [ -z "$col12" ] or [ -z "$col13"] or [ -z "$col14"]
        then
        echo "Line No. $i -- Missing String "
        else
        echo "Line No. $i -- String is valid"
        fi
 
        i=`expr $i + 1`
        done
done
 
done
 

Thank you all for your help.

Remove space between = and `

col2= `echo $line | cut -c5-5`
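To illustrate the point about the space, here is a minimal sketch using the sample record posted later in this thread. It assumes a POSIX-style shell (ksh, bash, sh); the switch to `$(...)` and `printf` is my own choice, not something the original script used. Quoting "$line" also matters for fixed-width data, because an unquoted `echo $line` collapses runs of blanks and shifts every later column.

```shell
#!/bin/sh
# Sketch: assigning fixed-width fields to variables.
# The sample record is the one posted in this thread.
line='1111B0216262626111111 999999 22222227823876489050456201001222010 20101031'

# Wrong: with a space after "=", the shell treats "col2=" as a temporary
# empty assignment and then tries to run the output of cut as a command.
#   col2= `echo $line | cut -c5-5`

# Right: no space around "=", and "$line" is quoted so that blanks
# inside the record are preserved.
col1=$(printf '%s\n' "$line" | cut -c1-4)
col2=$(printf '%s\n' "$line" | cut -c5-5)
col3=$(printf '%s\n' "$line" | cut -c6-7)

echo "col1=[$col1] col2=[$col2] col3=[$col3]"
```

With the sample record this should print col1=[1111] col2=[B] col3=[02].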

If you don't have data for a column, do you have blank spaces there instead?

Can you show us the input?

Thanks for responding. I appreciate it.

Yes, if there is no data, it will show up as blanks.

1111B0216262626111111 999999 22222227823876489050456201001222010 20101031

The data above represents one record. A second record would be:

1111B0216262626111111 245367 33333338860870499050456201001182010 20100930

Looking at the data again, I changed the code to the version below. Col1 through col14 are in every record in the file.

 
for file in /folder/file.1$.5$
 
do
 
i=1
 
exec< file
 
while read line
do
 
col1=`echo $line | cut -c1-4`
col2= `echo $line | cut -c5-5`
col3= `echo $line | cut -c6-7`
col4= `echo $line | cut -c8-16`
col5= `echo $line | cut -c17-23`
col6= `echo $line | cut -c51-56`
 col7=`echo $line | cut -c66-72`
 col8= `echo $line | cut -c73-74`
 col9= `echo $line | cut -c75-83`
 col10= `echo $line | cut -c84-85`
 col11= `echo $line | cut -c86-88`
 col12= `echo $line | cut -c89-96`
 col13= `echo $line | cut -c97-100`
 col14= `echo $line | cut -c108-115`
 
if [ -z "$col1" ]
then
echo "Line No. $i -- No String in position 1-4 "
else
echo "Line No. $i -- String in position 1-4 : $col1"
fi
 
if [ -z "$col2" ]
then
echo "Line No. $i -- No String in position 5 "
else
echo "Line No. $i -- String in position 5 : col2"
fi
 
if [ -z "$col3" ]
then
echo "Line No. $i -- No String in position 6-7 "
else
echo "Line No. $i -- String in position 6-7 : $col3"
fi
 
if [ -z "$col4" ]
then
echo "Line No. $i -- No String in position 8-16 "
else
echo "Line No. $i -- String in position 8-16 : $col4"
fi
 
if [ -z "$col5" ]
then
echo "Line No. $i -- No String in position 17-23 "
else
echo "Line No. $i -- String in position 17-23: $col5"
fi
 
if [ -z "$col6" ]
then
echo "Line No. $i -- No String in position 51-56 "
else
echo "Line No. $i -- String in position 51-56 : $col6"
fi

 if [ -z "$col7"] or [ -z "$col8"] or [ -z "$col9"] or [ -z "$col10"]
 or[ -z "$col11" ] or [ -z "$col12" ] or [ -z "$col13"] or [ -z "$col14"]
 then
 echo "Line No. $i -- Missing String "
 else
 echo "Line No. $i -- String is valid"
 fi

 
i=`expr $i + 1`


done
 
done

Also, instead of using all these if statements, can I use a case? If yes, how do I do it?

Thanks in advance for your help.
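One possible way to use case here, as a rough sketch rather than a definitive answer: a case pattern can test whether a value contains any non-space character, which a plain -z test cannot do (-z is only true for an empty string, and the blanks in this data are spaces). The function names is_blank and check_field are hypothetical helpers made up for this example.

```shell
#!/bin/sh
# is_blank VALUE: succeeds when VALUE is empty or contains only spaces.
is_blank() {
    case $1 in
        *[!\ ]*) return 1 ;;   # at least one non-space character
        *)       return 0 ;;   # empty, or spaces only
    esac
}

# check_field LINE_NUMBER POSITION VALUE: report the field if it is blank.
check_field() {
    if is_blank "$3"; then
        echo "Line No. $1 -- No String in position $2"
    fi
}

# Made-up record whose fifth character is a space.
line='1111 0216262626111111'
check_field 1 "1-4" "$(printf '%s\n' "$line" | cut -c1-4)"
check_field 1 "5"   "$(printf '%s\n' "$line" | cut -c5-5)"
```

Only the second call should print a message, because position 5 of the made-up record is a space. The same check_field call can be repeated for each of the 14 columns in place of the separate if blocks.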

If you run that code over dozens of big files it will take forever. Every one of those backtick lines creates a separate process.

awk (I used nawk, which is the same thing) was meant for stuff like this.
Create a file, pos.txt, that holds the start and end offset of each field:

1     4
5     5
6     7
8    16
17   23
51   56     
66   72             
73   74    
75   83    
84   85  
86   88  
89   96  
97  100 
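108 115

Then run this loop, which reads pos.txt first and then each data file:

for infile in /folder/file.1$.5$
do

  nawk -v infile="$infile"  ' {
      # First file: load the start position and length of each field
      if(FILENAME=="pos.txt" )
      {
           pos[FNR]=$1; len[FNR]=($2-$1)+1; vals++; next
      }
      # Data file: report every field that is empty or all spaces
      if(FILENAME==infile )
      {
           for(i=1; i<=vals; i++)
           {
             testval=substr($0, pos[i], len[i])
             gsub(/ /, "", testval)
             if(length(testval)==0) {print infile, "line:", FNR, " blank field:", i}
           }
           next
      }
  } '  pos.txt  "$infile"

done > report.txt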

Thanks Jim.

Now I am having problems opening the file

/folder/file.1$.5$

I have multiple files in this folder with the following names

file.10.5232004
file.12.2112003
.........

What do I do?

Thanks for your help again.
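A possible approach, assuming the $ signs in file.1$.5$ were meant as placeholders for "some digits": the shell does not treat $ as a wildcard, so the loop needs a real glob pattern. The pattern below is a guess based on the two file names shown above, and list_data_files is a hypothetical helper name.

```shell
#!/bin/sh
# list_data_files DIR: print every file in DIR named file.<digits...>.<digits...>
list_data_files() {
    for infile in "$1"/file.[0-9]*.[0-9]*
    do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -f "$infile" ] || continue
        echo "$infile"
    done
}
```

Calling list_data_files /folder would then list names such as /folder/file.10.5232004, and the same for loop header can replace the one in the script above.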

---------- Post updated at 03:40 PM ---------- Previous update was at 02:10 PM ----------

Thanks, I got the code to work.

The problem I am having now is that it repeats the same thing over and over in the report.txt file.

If I wanted to print only the file name, the number of records, and the length of each record for each file, how would I do this with the code below?

 
{print infile, "line:", FNR, " blank field:", i}

Thank you
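A minimal sketch of one way to get that summary, assuming one record per line and plain awk (nawk should behave the same). The function name summarize is made up for this example; it prints one line per record with its length, followed by a record count for the file.

```shell
#!/bin/sh
# summarize FILE: print the length of every record, then the record count.
summarize() {
    awk -v name="$1" '
        { print name, "line:", FNR, "length:", length($0) }
        END { print name, "records:", NR }
    ' "$1"
}
```

Inside the for loop this would be called as summarize "$infile", with the output redirected to report.txt as before.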

---------- Post updated 01-29-10 at 12:47 AM ---------- Previous update was 01-28-10 at 03:40 PM ----------

Thank you!! I was able to find a solution in the forum.

---------- Post updated at 03:26 PM ---------- Previous update was at 12:47 AM ----------

I am having a hard time extracting the file name from the above code. Instead of printing /folder/file.1$.5$, I would like it to print just the file name, file.1$.5$.

I have tried using basename, but it looks like nawk or awk does not recognise basename. Each time I type it in, it prints out the word basename.

{print basename infile, "line:", FNR, " blank field:", i}

Thank you for your help!!!
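awk has no built-in basename function; inside an awk program a bare word like basename is just a variable name. Two common workarounds are sketched below, using one of the file names from this thread as sample input: strip the directory in the shell before calling awk, or strip it inside awk with sub(). The variable names name, short and path are arbitrary.

```shell
#!/bin/sh
infile=/folder/file.10.5232004

# Option 1: strip everything up to the last "/" in the shell,
# then pass the result to awk with -v.
name=${infile##*/}
echo "$name"

# Option 2: strip the directory part inside awk.
short=$(awk -v path="$infile" 'BEGIN { name = path; sub(/.*\//, "", name); print name }')
echo "$short"
```

Both lines should print file.10.5232004. In the loop above, option 1 amounts to passing -v infile="${infile##*/}" under a different variable name and printing that instead of the full path.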

Hello everybody,

I have almost the same problem with my files (I am on AIX 5.3), and I am going to try this solution.
The main difference is that in my case I don't have a header, and I also can't use "pos.txt"
because I have offsets like this:

 
1     2
3     5
6     7
8    10

11   12
13   15
16   17     
18   20             
....
121   122    
123   125    
126   127  
128   130  
....

So is there another way that fits my purpose?

Thanks, Jim, for your great work, and thanks, asemota, for asking so I could find a solution.
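One possible approach, sketched under an assumption: the offsets above look like a repeating pattern of field widths 2, 3, 2, 3, and I am assuming that pattern continues unchanged through the elided rows. If so, the positions can be generated in awk's BEGIN block instead of being read from pos.txt. The function name check_repeating and its record-length argument are hypothetical.

```shell
#!/bin/sh
# check_repeating FILE RECLEN: report blank fields in FILE, where every
# record is split into fields of repeating widths 2, 3, 2, 3 up to RECLEN.
check_repeating() {
    awk -v widths="2 3 2 3" -v reclen="$2" '
        BEGIN {
            # Build the pos[] and len[] tables from the repeating widths.
            nw = split(widths, w, " ")
            start = 1
            while (start <= reclen) {
                for (k = 1; k <= nw && start <= reclen; k++) {
                    n++
                    pos[n] = start
                    len[n] = w[k]
                    start += w[k]
                }
            }
        }
        {
            # Same blank test as in the pos.txt version above.
            for (i = 1; i <= n; i++) {
                testval = substr($0, pos[i], len[i])
                gsub(/ /, "", testval)
                if (length(testval) == 0)
                    print FILENAME, "line:", FNR, " blank field:", i
            }
        }
    ' "$1"
}
```

For the layout shown, the call would be check_repeating somefile 130 if the last field really ends at column 130. If the widths change partway through the record, this sketch does not apply as written.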