grep on multiple words to match text template

rider29 · May 20, 2008, 11:58am

Hi,

I have a few text templates. As a simple example:
Template 1:

city Name:
zip code:
state Name:

Template 2:

employee Name:
Phone number:

I want to grep a given text file and make sure it matches one of these templates. Please share your ideas.

joeyg · May 20, 2008, 12:14pm

Please clarify... for example
option #1 (each record in own file)
file1
City: Brockton
Zip: 02330
State: MA

file2
City: Boston
Zip: 02109
State: MA

OR option #2 (all records in one file)
file1
City: Brockton
Zip: 02330
State: MA
City: Boston
Zip: 02109
State: MA

Also, please clarify whether variable names (like City:) precede all data.

rider29 · May 20, 2008, 12:32pm

Thank you for looking into the post.

Option 1 is correct. Each file will comply with one template only, and variables will always precede the data. Please share your ideas for a script that will read the text and match it against these templates.

joeyg · May 21, 2008, 2:39pm

As a test, I created three sample files. The first two have the proper layout, but the third is missing a field. The program is quite simple: it counts each required field it finds. If the count is three, it prints a message saying the file is good; otherwise it prints a message saying the file is bad.

> cat file1
City: Brockton
Zip: 02330
State: MA

> cat file2
City: Boston
Zip: 02109
State: MA

> cat file3
City: Boston
Zip: 02109
>

> cat ck_format 
#!/bin/bash
# Check file1..file$max: a file is good if it contains
# City:, Zip: and State: lines.
xf="file"
cnt=1
max=5

while [ "$cnt" -le "$max" ]
do
   zf="$xf$cnt"
   # verify file integrity: only check files that exist and are non-empty
   if [ -s "$zf" ]
   then
      flag=0
      # count each required label found at the start of a line
      for label in City Zip State
      do
         if grep -q "^$label:" "$zf"
         then
            flag=$((flag+1))
         fi
      done
      if [ "$flag" -eq 3 ]
      then
         echo "The file $zf is a good file"
      else
         echo "The file $zf is a bad file"
      fi
   fi
   cnt=$((cnt+1))
done

program execution is:
> ck_format
The file file1 is a good file
The file file2 is a good file
The file file3 is a bad file

rider29 · May 21, 2008, 3:17pm

I thank you for the help.

jim_mcnamara · May 21, 2008, 4:38pm

You can try awk:

#!/bin/ksh
# prints 1 if the file matches either template 1 or template 2,
# else prints 0

template_check()
{
	awk ' /^City/     {template1++}
	      /^Zip/      {template1++}
	      /^State/    {template1++}
	      /^Employee/ {template2++}
	      /^Phone/    {template2++}
	      END { if ((!template1 && template2==2) || (template1==3 && !template2))
	              {print 1}
	            else
	              {print 0}
	          }' "$1"
}

# test of template_check
echo "File1
City: Brockton
Zip: 02330
State: MA
" > file1

echo "template_check gives $(template_check file1)"

echo "file2
City: Boston
Zip: 02109
State: MA
" > file2

echo "template_check gives $(template_check file2)"

echo "file3
City: Boston
Zip: 02109
" > file3

echo "template_check gives $(template_check file3)"

echo "file4
Employee: John
Phone: 3456
" > file4

echo "template_check gives $(template_check file4)"

echo "file5
City: Boston
Phone: 3456
" > file5

echo "template_check gives $(template_check file5)"

output

template_check gives 1
template_check gives 1
template_check gives 0
template_check gives 1
template_check gives 0
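
One caveat with counting matching lines: a file that repeats a label (for example three City: lines and nothing else) also reaches a count of 3. A minimal sketch of a variant that records whether each label was seen at least once instead; the name template_check_strict is hypothetical, and the labels are assumed to be the ones used in the test files above:

```shell
#!/bin/sh
# Hypothetical variant of template_check: records which labels were seen
# rather than how many lines matched, so repeated labels are not miscounted.
template_check_strict()
{
    awk '
        /^City/     { city = 1 }
        /^Zip/      { zip = 1 }
        /^State/    { state = 1 }
        /^Employee/ { emp = 1 }
        /^Phone/    { phone = 1 }
        END {
            t1 = city + zip + state     # distinct labels seen from template 1
            t2 = emp + phone            # distinct labels seen from template 2
            if ((t1 == 3 && t2 == 0) || (t1 == 0 && t2 == 2))
                print 1
            else
                print 0
        }
    ' "$1"
}
```

It is called the same way, e.g. template_check_strict file1. Unset awk variables count as 0, so a label that never appears simply adds nothing to the total.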

rider29 · May 23, 2008, 11:21am

Thank you! I am new to Unix scripting, and your logic helped me a lot in building my script. I have a few more questions.

How can I ignore case while checking for the template? Also, is there a way to verify the order of the words "City", "Zip" and "State"? That is, can we report the file as bad if it contains text like

Zip:
City:
State:

Something like awk '/City/' that matches only on the first line and then increments template1?
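
Both points can be handled in awk. A rough sketch, assuming the three labels of template 1 and that lines not starting with one of them (such as a header line) should be ignored; the function name check_order is hypothetical. Case is folded with tolower(), and the order is checked by stepping through the expected labels one at a time. For the grep-based script, grep -i gives the same case-insensitive match.

```shell
#!/bin/sh
# Hypothetical helper: prints 1 if the labels City, Zip and State appear
# exactly once each, in that order, ignoring case; otherwise prints 0.
# Lines that do not start with one of the three labels are ignored.
check_order()
{
    awk '
        BEGIN { n = split("city zip state", want, " "); pos = 1; ok = 1 }
        {
            line = tolower($0)                  # fold case before matching
            if (line ~ /^(city|zip|state)/) {
                # the label on this line must be the next one expected
                if (pos <= n && index(line, want[pos]) == 1)
                    pos++
                else
                    ok = 0
            }
        }
        END { if (ok && pos == n + 1) print 1; else print 0 }
    ' "$1"
}
```

With this sketch, a file listing Zip: before City: gives 0, while "city name:", "ZIP code:" and "STATE name:" in that order give 1.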