Move a block of lines to file if string found in the block.

grep_me · October 31, 2012, 9:49am

I have a "main" file which contains blocks of data for each user, delimited by the tags BEGIN and END.

BEGIN
ID_NUM:24879
USER:abc123 
HOW:47M
CMD1:xyz1
CMD2:arp2
STATE:active
PROCESS:id60
END
BEGIN
ID_NUM:24880
USER:def123 
HOW:4M
CMD1:xyz1
CMD2:xyz2
STATE:running
PROCESS:id64
END
BEGIN
ID_NUM:24881
USER:def123 
HOW:8M
CMD1:xyz1
CMD2:xyz3
STATE:inactive
PROCESS:id77
END
BEGIN
ID_NUM:24882
USER:abc123 
HOW:87M
CMD1:xyz1
CMD2:xyz4
CMD2:xyz3
STATE:running
PROCESS:id99
END

I have another file that contains just the user IDs to filter on.
For example, this file has 3 IDs:

abc123
def123
ghi123

I want to create separate files for each user with the blocks of data inside that main file.
For example, since the first two users are in the main file, I want to create 2 separate files with "blocks" of data related to just that specific user in each file.
Please note that the number of lines in a "block" (the lines between the BEGIN and END tags) varies for each user.

Any help or ideas in creating this script would be much appreciated.

Thanks

pamu · October 31, 2012, 10:12am

try

awk -F "[: ]" 'FNR==NR{if($0 ~ /^BEGIN/){b=$0;u=""}else{b=b"\n"$0;if($1=="USER"){u=$2}else if($0 ~ /^END/ && u!=""){Y[u]=(u in Y)?Y[u]"\n"b:b}};next}
($0 in Y){print Y[$0] > ($0".txt")}' main_file file2

It will create two files.

abc123.txt and def123.txt

elixir_sinari · October 31, 2012, 10:14am

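awk 'FNR==NR{a[$1];next}      # first file: remember each user id
/BEGIN/{s=1}                  # start collecting a block
s{t=t?t RS $0:$0}             # append the current line to the block buffer
/USER/ && s{
sub(/[ \t]*$/,"",$2)          # strip trailing blanks from the user field
if($2 in a) found=$2
else {s=0;t=""}               # user not in the lookup file: drop the block
}
/END/ && s && length(found){
f = found ".txt"
print t >> f                  # append the finished block to <user>.txt
close(f)
s=0;t=""}' lookup_file FS=: main_file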

grep_me · November 2, 2012, 3:41pm

Hi Pamu, your code did not work for me. I ran it from the command prompt, using the exact same files as above, and got this error:

awk: syntax error near line 1
awk: bailing out near line 1

And Elixir, how am I supposed to run your code? Should I put it in a wrapper script and run that, or run it from the command prompt?

Thanks Again for your time.

Yoda · November 2, 2012, 4:17pm

You can run it either way:

Put it in a file, give execute permission and run.
Copy & Paste it to your command prompt and run.

Note: make sure your files are named lookup_file and main_file, as in elixir_sinari's code, or change the names in the code to match yours.

grep_me · November 8, 2012, 5:33pm

Elixir,
Your method did not work either. I got the same error.
Here is my uname output, just in case:
SunOS c1dupep4 5.10 Generic_147440-25 sun4v sparc SUNW,SPARC-Enterprise-T5120

vgersh99 · November 8, 2012, 5:42pm

use 'nawk' instead of 'awk'
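
On Solaris, /usr/bin/awk is the old pre-POSIX awk, which is the likely reason for the syntax errors above; nawk and /usr/xpg4/bin/awk accept the newer syntax. If the same script has to run on more than one system, something like the following can pick an interpreter. It is only a sketch: pick_awk and the order of the candidates are my own choices, not anything standard, and it assumes a POSIX shell such as ksh or bash (not the old Solaris /bin/sh).

```shell
# Hypothetical helper: print the first awk candidate found on this system.
# Plain "awk" is tried last because on Solaris it is the legacy version.
pick_awk() {
    for candidate in nawk /usr/xpg4/bin/awk gawk mawk awk; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Store the chosen interpreter; stop if none was found.
AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }
```

After that, call "$AWK" wherever the earlier replies say awk.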

grep_me · November 9, 2012, 11:29am

Thanks. It worked using nawk.
But I have a scenario where the user ID/name field can contain spaces.
Is there any way to match on the whole name, including all of its words?
For example, the lookup file can have data like this:

abc123
John Smith
def123

Your code skips "John Smith" and instead outputs the block for "John Appeas" (the first John in the list), which is unwanted. It does not give me "John Smith" (one of the last Johns).

Thanks
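
One possible way to handle names with spaces, as a sketch rather than a tested fix for the real data: the earlier code stores only $1 of each lookup line, so "John Smith" is stored as "John". Storing the whole trimmed line as the key, and taking everything after "USER:" as the name, avoids that. The function name split_by_user, the trim helper, and the AWK variable are made up for illustration; the sketch also assumes every block has a line starting with USER: and that the output files go to the current directory.

```shell
# Hypothetical wrapper: split_by_user LOOKUP_FILE MAIN_FILE
# Appends each matching block to "<user name>.txt" in the current directory.
split_by_user() {
    # Set AWK=nawk on Solaris; plain awk is assumed to be fine elsewhere.
    "${AWK:-awk}" '
    # Remove leading and trailing blanks and tabs.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

    # First file: store the whole trimmed line, so "John Smith" stays one key.
    FNR == NR { key = trim($0); if (key != "") want[key]; next }

    /^BEGIN/ { buf = $0; user = ""; next }     # start a fresh block
    { buf = buf RS $0 }                        # collect every other block line
    /^USER:/ { user = trim(substr($0, 6)) }    # everything after "USER:"
    /^END/ {
        if (user in want) {
            out = user ".txt"
            print buf >> out                   # append the block to the user file
            close(out)
        }
        buf = ""; user = ""
    }' "$1" "$2"
}
```

Run it as: split_by_user lookup_file main_file. Because it appends with >>, remove old .txt files before re-running, otherwise the blocks are written twice.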