I have the file format shown above, and I want to find a matching record in it. For example: match a number (7789) on a line starting with XYZ and, once matched, look for a second number (7345) in the lines below that start with 1, stopping at the line that starts with 9. Then retrieve the entire matching line. How can I accomplish this using a shell script, awk, sed, or any combination?
Sorry for not being clear; let me explain the data format and the requirement. A line starting with XYZ is an item record and contains the item number (7789). The lines below it that start with 1 are store records, with the store number immediately after the 1. A store number is 4 characters long (7345). The rest of a store line holds other details for that store, such as quantity, for the item number above it. Each item-store block of records ends at a line starting with 9, and after that a new item-store block begins.
I have tried to solve this on my own using
grep -A5 7789 filename
but I can get either the item record or the store record, not both at the same time, and my command also produces store records that I don't want to see. I only want to see the record for one specific item-store combination. Also, the number of store records (lines starting with 1) varies for each item, so I cannot predict at which line after XYZ the store record will be.
This might work, but consider this: 7789 (the item number) will not repeat anywhere else in the file, whereas the store number (7345) will definitely occur again for other item numbers. Your solution assumes that the store record (17345***) is within the 5 lines immediately after the matching record (XYZ*7789*). That may not be the case, because the store record I am looking for could be on any line, not only within the window of grep -A5. It could be 10, 20, or 50 lines later. There is no fixed number of store records after the XYZ line.
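Since the number of store lines varies, one option is to bound the search by the block itself rather than by a line count. Here is a minimal sketch using a sed address range, assuming the layout described above: the item line starts with XYZ, store lines start with 1 followed by the 4-character store number, and the block ends at a line starting with 9. The sample data and the temporary file are made up for illustration only.

```shell
# Hypothetical sample data; the real file layout may differ.
data=$(mktemp)
cat > "$data" <<'EOF'
XYZ*1111*
17345*0005
18000*0002
9
XYZ*7789*
16001*0003
17345*0012
18000*0007
9
XYZ*2222*
17345*0040
9
EOF

# Within the range from the XYZ line for item 7789 to the next line
# starting with 9, print only lines starting with 1 followed by 7345.
result=$(sed -n -e '/^XYZ.*7789/,/^9/{' -e '/^17345/p' -e '}' "$data")
printf '%s\n' "$result"

rm -f "$data"
```

This prints only the store line. Store 7345 under the other items is skipped because those lines fall outside the address range.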
Could you please go through the following and let me know if this helps you?
awk '
/^9/ {L = 0 #### A line starting with 9 ends the block, so reset the flag L to 0.
}
$0 ~ "^XYZ.*" PAT1 { #### If the line starts with XYZ and contains PAT1 (a variable whose value is set on the command line below):
L = 1 #### Set the flag L to 1.
print #### Print the line that matched the condition above.
}
$0 ~ "^1" PAT2 && L #### If the line starts with 1 immediately followed by PAT2 (also set on the command line below) and L is non-zero, print it. No action is given, so awk prints the line.
#### awk works in condition/action pairs: when a condition is TRUE and no action is given, awk performs the default action, which is to print the current line.
' PAT1="7789" PAT2="7345" file #### Set PAT1=7789 and PAT2=7345, and read the input file named file.
May I add that L is meant to be a logical (boolean) control variable that takes only the values 1 (i.e. TRUE) and 0 (i.e. FALSE). It indicates that PAT(tern)1 was found on the XYZ line of the current block, so a store line matching PAT2 should be printed until the next line starting with 9 resets it.
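The same flag idea can be sketched as a self-contained run. This variant is an illustration, not the code posted above: it passes the values with -v, compares the 4 characters after the leading 1 exactly, and stops reading at the end of the matching block, relying on the statement that the item number does not repeat in the file. The variable names (item, store, in_block) and the sample data are hypothetical.

```shell
# Hypothetical sample data; the real file layout may differ.
data=$(mktemp)
cat > "$data" <<'EOF'
XYZ*1111*
17345*0005
18000*0002
9
XYZ*7789*
16001*0003
17345*0012
18000*0007
9
XYZ*2222*
17345*0040
9
EOF

result=$(awk -v item="7789" -v store="7345" '
    # A line starting with 9 ends a block; if it was our block, stop reading.
    /^9/   { if (in_block) exit; next }
    # An item line sets the flag to 1 only when it contains the item number.
    /^XYZ/ { in_block = ($0 ~ item); next }
    # Inside the wanted block, print the store line whose 4 characters
    # after the leading 1 equal the store number (string comparison).
    in_block && /^1/ && substr($0, 2, 4) == (store "")
' "$data")
printf '%s\n' "$result"

rm -f "$data"
```

Unlike the version above, this prints only the store line and not the XYZ line. Adding a print to the /^XYZ/ rule when in_block is set would restore that.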