complex requirement

tnvanathy23 · August 18, 2011, 8:24am

I have a requirement to search for pattern1. Once pattern1 is found, I have to go up and search for pattern2, and if pattern2 is found I have to search down for pattern3 and pattern4. Once this cycle is completed, I have to search again for pattern1 and then pattern2, 3 and 4.

I am able to do this by opening the file in vi. But pattern1 repeats many times, and I want to turn this into a shell script. As far as I am aware, I can move the search pointer up and down in vi but not with grep, sed or awk.

Is there any way to run vi commands in a shell script?

Sample text file:

pattern2
xxxxxxxxxx
pattern3
xxxxxxx
yyyyyyyyyyy
pattern1
zzzzzzzz
zzzzzzzzz
pattern4
cccccccccccc
vvvvvvvvvvvvv
pattern2
iiiiiiiiiiiiiiiiiiiiiiiiii
pattern3
ggggggggggggggggg
hhhhhhhhhhhhhhhhhh
pattern1
llllllllllllllllllllllllll
kkkkkkkkkkkkkk
pattern4

I also want to repeat the cycle for as long as pattern1 is found.

Can anyone help me with this?

michaelrozar17 · August 18, 2011, 8:35am

So what's the expected output from the above sample file? Can you post it?

tnvanathy23 · August 19, 2011, 6:59am

According to this requirement, pattern1, pattern2, pattern3 and pattern4 are fixed values.

pattern3 is the first key for searching. Only if pattern3 is found will I search for the rest of the patterns.

So when I find the first occurrence of pattern3, I search up for pattern1 and then down for pattern2. I want to fetch the word next to pattern2, write it to a new file, and then search for pattern4.

For better understanding, the scenario is like this: pattern1 and pattern4 mark the start and end of a job, the word next to pattern2 is the name of the job, and pattern3 is a keyword, for example a database name.

I will have a file that contains many jobs, and I want to find which jobs use that database. That is why I first search for the database name (pattern3). If it is found, I travel up to find the start of that job using pattern1 and then the first occurrence of pattern2 below it. (Many pattern2 lines will be found; I want the pattern2 directly below pattern1.) Once pattern2 is found, I capture the next word, i.e. the job name, and go down to search for the end of that job's code (pattern4).

Then I search for pattern3 again, in case it is present in any other job.

This is how I have to search that file.

michaelrozar17 · August 19, 2011, 10:16am

Check if the one below works. If not, you would need to post a few lines from the actual file.

 sed -n 'H;/pattern4/{x;/pattern3/s/.*pattern1[^\n]*\n[^n][^2]*pattern2 \([^\n]*\).*/\1/p}' inputfile

tnvanathy23 · August 23, 2011, 3:45am

Hi Michael,

Thanks for your help. The command executes, but where should I check the output?

The sample lines in the file to be searched are:

 
BEGIN TEST
   NAME "A0001_ext_PBIG_Override"
   DateModified "2010-09-23"
   TimeModified "09.15.32"
BEGIN LOOP
      NAME "V0"
      PrimaryType "CContainerView"
      Name "Job"
OR
B.DB_NAME='TEST_DB'
END LOOP
END TEST
BEGIN TEST
   NAME "B0001_ext_Dump"
   DateModified "2010-09-24"
   TimeModified "09.15.32"
BEGIN LOOP
      NAME "V0"
      PrimaryType "CContainerView"
      Name "Job"
OR
c.TEST_NAME='VALUE'
END LOOP
END TEST

In the above sample I want to search for the word "TEST_DB" (i.e. pattern3).
Once pattern3 is found, I have to go up and search for "BEGIN TEST" (i.e. pattern1). If pattern1 is found, I have to search for NAME (i.e. pattern2), which comes immediately after pattern1. If pattern2 is found, I want to capture the job name next to pattern2, i.e. "A0001_ext_PBIG_Override", in another file and finally search for "END TEST" (i.e. pattern4). Finding pattern4 completes the search for one job.

After END TEST, the next job starts with BEGIN TEST. In this way I have to find which jobs use pattern3.

michaelrozar17 · August 23, 2011, 5:10am

If you simply run the command in the terminal, the output is displayed in the terminal itself. If you want to redirect the output to a file, try the command below.

sed -n 'H;/END TEST/{x;/TEST_DB/s/.*BEGIN TEST\n *NAME \([^\n]*\).*/\1/p}' inputfile > outfile

tnvanathy23 · August 24, 2011, 3:51am

Hi Michael,

Thanks a lot for your help. The command is working fine. I would like to know how you built this command, so that I understand it better and can use it as a base if any enhancement is required.

michaelrozar17 · August 24, 2011, 5:06am

Until it finds the pattern END TEST, the sed command appends every line to the hold buffer (H). Once it matches a line containing END TEST, it swaps the buffered lines into the pattern space (x) and searches them for TEST_DB. If TEST_DB is found, it extracts the job name. This cycle repeats until the end of the file. To understand it better technically, you would need to go through a sed tutorial and read up on regular expressions.
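To make the explanation above concrete, here is a minimal sketch that runs the same sed command against a shortened copy of the posted sample. The file name sample_jobs.txt is made up for the demo, and the command assumes GNU sed, which treats \n inside a bracket expression as a newline.

```shell
# Build a two-job sample file (same layout as the sample posted earlier).
cat > sample_jobs.txt <<'EOF'
BEGIN TEST
   NAME "A0001_ext_PBIG_Override"
BEGIN LOOP
      NAME "V0"
OR
B.DB_NAME='TEST_DB'
END LOOP
END TEST
BEGIN TEST
   NAME "B0001_ext_Dump"
BEGIN LOOP
      NAME "V0"
OR
c.TEST_NAME='VALUE'
END LOOP
END TEST
EOF

# H            append every input line to the hold buffer
# /END TEST/   when a job ends ...
#   x          ... swap, so the collected job becomes the pattern space
#   /TEST_DB/  only if the job mentions the database ...
#   s/.../\1/p ... keep just the text after "BEGIN TEST<newline> NAME " and print it
sed -n 'H;/END TEST/{x;/TEST_DB/s/.*BEGIN TEST\n *NAME \([^\n]*\).*/\1/p;}' sample_jobs.txt
```

Only the first job contains TEST_DB, so only its name is printed, still in its double quotes. After each x the hold buffer is left holding just the END TEST line, so it never grows beyond one job.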

tnvanathy23 · August 24, 2011, 9:25am

Hi Michael,

The command works fine for the sample I gave in my previous post, but it does not work on the real file in which I need to find the job names. The original file is around 500 MB and contains more than 2000 jobs. I am very confused about why it is not finding the rows in that file, because the same command, tested with the sample file, picks up the job names correctly.

The command does not give any error either. It runs for some time, and in the end the output file is blank.

binlib · August 24, 2011, 5:13pm

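awk '/BEGIN TEST/ {
  getline                      # the NAME line directly follows BEGIN TEST
  n = $2                       # job name, still in double quotes
  while ((getline) > 0) {
    if (/END TEST/) break      # end of this job: stop scanning
    if (/TEST_DB/) {           # database name found inside this job
      print n
      break                    # keep going, so later jobs are checked too
    }
  }
}' inputfile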
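The thread does not establish why the real file produces no output. As a sketch only, the awk variant below guards against two guesses, both of them assumptions and not facts from the thread: Windows (CRLF) line endings, and a NAME line that does not sit directly under BEGIN TEST. The file name crlf_jobs.txt and its contents are made up for the demo.

```shell
# Hypothetical test file: two jobs with CRLF line endings, and an extra
# line between BEGIN TEST and NAME in the first job.
printf '%s\r\n' \
  'BEGIN TEST' \
  '   DateModified "2010-09-23"' \
  '   NAME "A0001_ext_PBIG_Override"' \
  'BEGIN LOOP' \
  '      NAME "V0"' \
  "B.DB_NAME='TEST_DB'" \
  'END LOOP' \
  'END TEST' \
  'BEGIN TEST' \
  '   NAME "B0001_ext_Dump"' \
  "c.TEST_NAME='VALUE'" \
  'END TEST' > crlf_jobs.txt

awk '
  { sub(/\r$/, "") }                                # drop a trailing carriage return, if any
  /^[ \t]*BEGIN TEST/ { name = ""; hit = 0; next }  # new job: reset state
  name == "" && $1 == "NAME" { name = $2 }          # first NAME line after BEGIN TEST
  /TEST_DB/ { hit = 1 }                             # database name seen in this job
  /^[ \t]*END TEST/ { if (hit) print name }         # report once, at the end of the job
' crlf_jobs.txt
```

This reads the file in a single pass and keeps only two variables per job, so memory use does not depend on the file size. If the real file turns out to have neither issue, the next step would be to post a few unmodified lines from it.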