Select only the lines of a file starting with a field which matches a list. awk?

Hello
I have a large file1 which contains many events such as "2014010420", each followed by lines that start with text. It has this form:

2014010420 num --- ---  num   ....

NTE           num num --- num...      
EFA           num num ---  num  ...   
LASW         num num ---  num ... 
   
2014010450  num num num  ...                                    
   
STR           num num ---  num ..
KRP           num num --- num ...

2014012940  num num --- num  ... 
 
SARA       num num --- num ...
TREN        num num --- num.......

("-"=space)
and another file, file2, which lists the events that need to be selected. It has the form:

2014010420
2014012450... 

I need to use the list of events in file2 to select the matching events and their following lines (only until the next event) from file1. The format must remain the same. Thanks in advance for your help.

Please use code tags around any sample data to help keep the format untouched.

Is it true that blank lines appear before and after each event line? Are these the only blank lines that appear in the file and can these be used to distinguish the start/end of each event block?

Try this...

awk 'FNR==NR{m[$1];next};/^2/{x=($1 in m)}x' file2 file1

Can you explain your code, please? I don't understand why it detects the lines which begin with the number 2 (/^2/), and why x?

Sure.

The first part of the code

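awk 'FNR==NR{m[$1];next}

runs only while awk is reading the first file named on the command line (file2), because that is the only time FNR equals NR. It stores each $1 from file2 as a key in an array called m, so the keys will be all the 2014* values.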

The second part of the code

/^2/{x=($1 in m)}x'

is what is run for each line in file1. It says: for each line beginning with a 2 (I assumed from the data provided by the OP that each value in file2 would start with 2014, so the pattern could be tightened to /^2014/ or /^201/), check whether this 2014* value (i.e. $1) is in the array m. x is set to the result of the 'in' operator, which evaluates to 1 if the key exists and 0 if it does not. The x after the action block is a pattern with no action of its own, so awk performs the default action of printing the line whenever x is not 0.
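
For reference, the same one-liner can be laid out as a commented script. This is only a sketch: the sample files are made up to mirror the layout in the first post (the numeric columns are placeholders), and both files are written to a temporary directory so nothing real is touched.

```shell
# Build two small sample files (hypothetical data in the layout from the first post).
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
2014010420 1 2 3

NTE 1 2 3
EFA 4 5 6

2014010450 1 2 3

STR 7 8 9

2014012940 1 2 3

SARA 1 2 3
EOF
printf '%s\n' 2014010420 2014012940 > "$tmp/file2"

result=$(awk '
    # First file on the command line (file2): remember each event ID as an array key.
    FNR == NR { m[$1]; next }
    # Event line in file1: switch the flag on or off depending on the list.
    /^2/ { x = ($1 in m) }
    # A bare pattern with no action: print the current line while the flag is non-zero.
    x
' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data the 2014010450 block is dropped and the other two blocks are printed with their blank lines intact.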

Hope this helps, sorry if I have confused things more!

Thanks for your explanation, but I don't understand how the output includes the lines between the lines which begin with 2.

x is reassigned every time a line starting with "2" is read, and is set to 1 (="on") if that line's $1 is found in the m array, to 0 (="off") if not.
ALL lines are printed as long as x is 1.
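
To watch the flag change, a small trace can help. This is a sketch with made-up input lines, and the BEGIN block stands in for the list that would normally be read from file2. Every line is printed with the current value of x in front of it.

```shell
trace=$(printf '%s\n' '2014010420 a' 'NTE b' '2014010450 c' 'STR d' |
awk '
    BEGIN { m["2014010420"] }              # stand-in for the list read from file2
    /^2/  { x = ($1 in m) }                # the flag changes only on event lines
          { printf "%d: %s\n", x, $0 }     # show the flag next to every line
')
printf '%s\n' "$trace"
# 1: 2014010420 a
# 1: NTE b
# 0: 2014010450 c
# 0: STR d
```

The lines prefixed with 1 are the ones the original command would print; "NTE b" keeps the 1 it inherited from the event line above it.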

Thank you very much for the reply, pilnet101, and for the comments, protocomm, but it seems that something is missing. I don't get any output unless I add 1 at the last occurrence of x, and then I get the whole file as output. I would be grateful if you could help me a second time.

Works for me. Please show both your input files. Where do they come from? And, what OS & shell version do you use?

I am sorry, I just saw it was my fault: I forgot to add a column when making the list file. It works fine. Thank you all very much for your help, it is much appreciated.
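
One way to catch this kind of mismatch early is to check the list against the data before filtering. The following is a sketch only, assuming the event ID is always the first field in both files. The sample files are hypothetical, and the second list entry deliberately has a stray leading column.

```shell
tmp=$(mktemp -d)
printf '%s\n' '2014010420 1 2' '' 'NTE 1 2' '2014010450 3 4' > "$tmp/file1"
# Hypothetical broken list: the second entry has an extra leading column.
printf '%s\n' '2014010420' 'x 2014010450' > "$tmp/file2"

missing=$(awk '
    # file1 is read first here: collect the first field of every event line.
    FNR == NR { if (/^2/) seen[$1]; next }
    # file2: report non-empty entries whose first field is not a known event.
    NF && !($1 in seen) { print "line " FNR ": " $1 " is not an event ID in file1" }
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$missing"
rm -rf "$tmp"
```

Note that the file order is reversed compared with the filtering command, because here file1 supplies the lookup table. If nothing is printed, every entry in the list matches an event line.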