Get values block by block in same file

garvit184 · June 9, 2014, 1:40pm

I have a file say "SAMPLE.txt" with following content,

    P1
    10,9:6/123456
    P2
    blah blah
    P1
    10,9:5/98765
    P2
    blah
    blah
    P1
    blah blah
    P2

I want an output file, say "RESULT.txt", as follows:

    Value1:123456
    Value2:98765
    Value3:NULL

For each block between P1 and P2, I want to find the line matching 10,9*/ and save the value that follows the slash. If a P1-P2 block doesn't contain this value, I want to save "NULL" instead.

How can I code the above in shell/awk ?

I am very new to scripting. Thanks for help.

Akshay_Hegde · June 9, 2014, 1:47pm

Welcome to Forums, try

$ cat file
    P1
    10,9:6/123456
    P2
    blah blah
    P1
    10,9:5/98765
    P2
    blah
    blah
    P1
    blah blah
    P2

$ awk -F'/' 'function isnum(x){return(x==x+0)}/P1/,/P2/{if(!/P1|P2/)print "value"++i,":",isnum($NF)?$NF:"NULL"}' file
value1 : 123456
value2 : 98765
value3 : NULL

bartus11 · June 9, 2014, 1:52pm

Try:

awk -F"\/" '/P1/{p=1}/\//&&p{x=$2}/P2/&&p{print "Value"++i":"((x)?x:"NULL");p=0;x=""}' SAMPLE.txt

Aia · June 9, 2014, 2:02pm

awk -F/ '$1 ~ /P1/ {getline; print "Value" ++c ":", (NF==2?$2:"NULL")}' SAMPLE.txt
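
For readers new to awk, the one-liners above can be hard to follow. Here is a spelled-out sketch of the same idea for the first SAMPLE.txt layout. It assumes P1 and P2 sit alone on their lines (leading blanks allowed), that the wanted line starts with 10,9: and that the value is whatever follows the first slash. The names inblock, val and n are illustrative and do not come from any of the posts.

```shell
# Sample input from the first post, fed to awk through a here-document.
result=$(awk -F/ '
    { line = $0; sub(/^[ \t]+/, "", line) }          # trimmed copy; the sample is indented
    line == "P1" { inblock = 1; val = "NULL"; next } # block starts: assume nothing found yet
    line == "P2" && inblock {                        # block ends: report once per block
        n++
        print "Value" n ":" val
        inblock = 0
        next
    }
    inblock && line ~ /^10,9:/ { val = $2 }          # $2 is the text after the first "/"
' <<'EOF'
    P1
    10,9:6/123456
    P2
    blah blah
    P1
    10,9:5/98765
    P2
    blah
    blah
    P1
    blah blah
    P2
EOF
)
printf '%s\n' "$result"
```

Because the NULL default is set at P1 and printing happens only at P2, each block yields exactly one output line no matter how many other lines it contains.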

garvit184 · June 9, 2014, 2:15pm

Hello Akshay,

Thanks for quick response.
Your awk works fine but there are some tweaks that need to be done in order to fit my scenario.

First of all, your awk reads line by line and prints NULL for every line where it doesn't find a 10,9 value, which is not what I am looking for.

In my scenario, 10,9 can occur anywhere inside a P1-P2 block along with other data, or it may not occur at all.
But if it occurs, it occurs only once.

Here is a sample input that you can work with.

P1
10,9:11/18013013582
,10:1
,167:0
,487:5/E.164
11,9:15/310410532169026
,10:60
,167:0
12,18:15/013329002130500
P2
1,64:1
,70:H
,97:1
2,1:20140518031625
,2:18000
P3
42,3:1.
,4:3
,300:1.
43,3:1.

P1
11,9:15/310410645017141
,10:1
,167:0
,487:5/E.164
11,9:15/310410645017141
10,9:11/13233361170
P2
1,64:1
,70:H
,97:1

P1
11,9:15/310410645017141
,10:1
,167:0
P2
1,64:1
,70:H
,97:1
2,1:20140518031625
,2:18000
P3
42,3:1.
,4:3
,300:1.
43,3:1.

Thanks again for your help. Appreciated.

Akshay_Hegde · June 9, 2014, 2:19pm

So what would be the expected output for the given new input?

Is this what you are looking for?

$ awk -F'/' 'function isnum(x){return(x==x+0)}/P1/,/P2/{if(!/P1|P2/ && /10,9:.*/)print "value"++i,":",isnum($NF)?$NF:"NULL"}' file
value1 : 18013013582
value2 : 13233361170

garvit184 · June 9, 2014, 2:43pm

@Akshay, please see my previous post.

It's close to what I want; I think you just haven't included the condition to print "NULL" when the specified value isn't found.

Akshay_Hegde · June 9, 2014, 2:50pm

awk -F'/' '
    # Function to validate a number
    function isnum(x) { return (x == x + 0) }

    /P1/,/P2/ {
        # Start of block: increment i, reset variables, go to next line
        if (/P1/) {
            ++i
            s1 = s2 = ""
            next
        }

        # End of block: validate variables, print, go to next line
        if (/P2/) {
            printf "%s %s %d %s\n%s %s %d %s\n\n", "10,9", "Value :", i, isnum(s1) ? s1 : "NULL", \
                "11,9", "Value :", i, isnum(s2) ? s2 : "NULL"
            next
        }

        # Search for the first pattern
        if (!s1 && /10,9:.*/) {
            s1 = $NF
        }

        # Search for the second pattern
        if (!s2 && /11,9:.*/) {
            s2 = $NF
        }
    }
' file

Resulting

10,9 Value : 1 18013013582
11,9 Value : 1 310410532169026

10,9 Value : 2 13233361170
11,9 Value : 2 310410645017141

10,9 Value : 3 NULL
11,9 Value : 3 310410645017141

Input

$ cat file
P1
10,9:11/18013013582
,10:1
,167:0
,487:5/E.164
11,9:15/310410532169026
,10:60
,167:0
12,18:15/013329002130500
P2
1,64:1
,70:H
,97:1
2,1:20140518031625
,2:18000
P3
42,3:1.
,4:3
,300:1.
43,3:1.

P1
11,9:15/310410645017141
,10:1
,167:0
,487:5/E.164
11,9:15/310410645017141
10,9:11/13233361170
P2
1,64:1
,70:H
,97:1

P1
11,9:15/310410645017141
,10:1
,167:0
P2
1,64:1
,70:H
,97:1
2,1:20140518031625
,2:18000
P3
42,3:1.
,4:3
,300:1.
43,3:1.

garvit184 · June 9, 2014, 2:52pm

Actually, it's my mistake; I wasn't clear, and the example I quoted was not correct.
Here is the correct input.

P1
10,9:11/18013013582
,10:1
,167:0
,487:5/E.164
11,9:15/310410532169026
,10:60
,167:0
12,18:15/013329002130500
P2
1,64:1
,70:H
,97:1
2,1:20140518031625
,2:18000
P3
42,3:1.
,4:3
,300:1.
43,3:1.

P1
11,9:15/310410645017141
,10:1
,167:0
,487:5/E.164
10,9:11/13233361170
P2
1,64:1
,70:H
,97:1

P1
11,9:15/310410645017141
,10:1
,167:0
P2
1,64:1
,70:H
,97:1
2,1:20140518031625
,2:18000
P3
42,3:1.
,4:3
,300:1.
43,3:1.

UPDATE: I don't want to find only the 10,9 value. I need a script that can extract multiple values from each P1-P2 block.

Inside a P1-P2 block, each value occurs at most once.

For example, in the above input I want the values for 10,9 as well as 11,9.

Output Required

10,9 Value1:18013013582
11,9 Value1:310410532169026

10,9 Value2:13233361170
11,9 Value2:310410645017141

10,9 Value3:NULL
11,9 Value3:310410645017141

I hope I am clear about my problem. Thanks in advance.

Akshay_Hegde · June 9, 2014, 2:57pm

I can see you edited your previous post; I just posted an answer in post #8.

garvit184 · June 9, 2014, 2:58pm

That's perfect!
It helps a lot. I can now figure out the rest of the code. Thanks a lot.
I will surely update if I need any more help.

Aia · June 9, 2014, 2:58pm

@garvit184
Are you learning anything? Are you trying to modify what Akshay Hegde is posting, which he even commented for you to understand?

You keep changing your mind and input. It would be profitable for you to try on your own a bit and show some effort.

garvit184 · June 9, 2014, 3:16pm

@aia: I agree with you. Actually I am working on something, and the only part I needed help with is this reporting part. My whole project is mostly in basic shell, and AWK is like a black area for me.

Akshay's comments help a lot in understanding the logic used and will help me learn.

Don_Cragun · June 9, 2014, 7:23pm

You might also want to try this slightly simpler awk script that seems to more closely match your requested output format:

awk -F/ '
$1 == "P1" {
	s1 = s2 = "NULL"
	n++
}
/^10,9:/ {
	s1 = $2
}
/^11,9:/ {
	s2 = $2
}
$1 == "P2" {
	printf("%s10,9 Value%d:%s\n11,9 Value%d:%s\n", n == 1 ? "" : "\n",
		n, s1, n, s2)
}' file

which (when given the sample input shown in message #9 in this thread) produces:

10,9 Value1:18013013582
11,9 Value1:310410532169026

10,9 Value2:13233361170
11,9 Value2:310410645017141

10,9 Value3:NULL
11,9 Value3:310410645017141

If you understand Akshay's script, I assume you will also understand this script. Let me know if there is anything in this script that you don't understand.
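
If more keys than 10,9 and 11,9 are needed later, writing one rule per key gets repetitive. The following is a sketch of one possible generalization, not part of the script above: the keys are passed in a variable named keys (a hypothetical name), and it assumes each key starts its line, occurs at most once per block, and has its value after the first slash. The input is a shortened version of the sample in which the second block lacks 10,9.

```shell
result=$(awk -F/ -v keys="10,9 11,9" '
BEGIN { nkeys = split(keys, key, " ") }       # key[1] = "10,9", key[2] = "11,9"
$1 == "P1" {                                  # new block: reset every key to NULL
    n++
    inblock = 1
    for (k = 1; k <= nkeys; k++) val[key[k]] = "NULL"
    next
}
$1 == "P2" && inblock {                       # end of block: print all keys in order
    if (n > 1) print ""                       # blank line between blocks
    for (k = 1; k <= nkeys; k++)
        printf "%s Value%d:%s\n", key[k], n, val[key[k]]
    inblock = 0
    next
}
inblock && NF > 1 {                           # a line with a slash inside a block
    id = $1
    sub(/:.*/, "", id)                        # "10,9:11" -> "10,9"
    if (id in val) val[id] = $2               # keep it only if it is a wanted key
}' <<'EOF'
P1
10,9:11/18013013582
,10:1
,487:5/E.164
11,9:15/310410532169026
P2
1,64:1
P3
42,3:1.

P1
11,9:15/310410645017141
,10:1
P2
1,64:1
EOF
)
printf '%s\n' "$result"
```

Adding a third key then only means extending the keys string, for example keys="10,9 11,9 12,18".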

garvit184 · June 10, 2014, 4:56am

@akshay

How can I redirect the value of just one variable, say s1, into a file, say s1.txt? I tried:

printf "'%d',",s1 >> s1.txt

But I am getting an "Illegal Statement" error.

I have tried backticks but with no success. Can you help me with that?

Akshay_Hegde · June 10, 2014, 5:12am

printf "%d, ",s1 >>"s1.txt"

Don_Cragun · June 10, 2014, 2:18pm

Please give us some context. Are you giving this command to the shell? If that is what you're doing, you want something like:

printf "'%d'," "$s1" >> s1.txt

Are you using this command in an action clause in an awk script? If that is what you're doing, Akshay already showed you how to do that, although he removed the single quotes you seem to want displayed around the number and added a space after the comma. And if this is part of an awk script given on the command line rather than read from a file with awk -f awk_script, you can't print single quotes that way unless you play lots of games with the way you give the rest of the script to your shell.
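
For the awk case, one way to get the single quotes without fighting the shell's quoting is the octal escape \047 in the format string. This is only a sketch under a few assumptions: the input is a shortened version of the sample, the file name s1.txt is taken from the question, and %s is used instead of %d so that a NULL value would pass through unchanged.

```shell
rm -f s1.txt    # start clean, since >> appends to an existing file
awk -F/ '
/^10,9:/ { printf "\047%s\047,", $2 >> "s1.txt" }   # \047 is the single-quote character
END      { close("s1.txt") }
' <<'EOF'
P1
10,9:11/18013013582
P2
P1
10,9:11/13233361170
P2
EOF
cat s1.txt      # '18013013582','13233361170',
```

Note that the output file name is a quoted string inside awk; an unquoted s1.txt would be parsed as awk variables rather than a file name, which is the likely cause of the error in the question.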

garvit184 · June 17, 2014, 12:15pm

Sorry for the late reply; I was very busy.

This solves my problem. Thanks.

Do you have any idea how I can write a multi-line regular expression?

For example,

17,9:10/8013765024
,10:1
,11:1
23,9:1235455
,11:5

Now I need a regular expression that will help me extract the value of the first 11, i.e. 1, and not the value of the second 11, i.e. 5.

So I need a regular expression like,

17[,0-9]*[\n]*,11:

But this doesn't work.
I have tried tweaking it but was not successful.
Also, is there a better way to do this, such as using awk?

The basic idea is to find the value of 17..,11: which in the above case is 1 and replace it with another value say 2.
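
Since awk reads one line at a time, a multi-line regular expression is not the natural tool here. One alternative, sketched below, is to remember the group number from each line that starts with digits and treat the following lines that start with a comma as belonging to that group. The variable names grp, fld, new and cur are hypothetical, and the sketch assumes the field to change sits on a continuation line and that the whole value after the colon is to be replaced.

```shell
result=$(awk -v grp=17 -v fld=11 -v new=2 '
/^[0-9]+,/ { cur = $0; sub(/,.*/, "", cur) }   # "17,9:10/..." -> cur = "17"
cur == grp && $0 ~ ("^," fld ":") {           # continuation line ",11:..." inside group 17
    $0 = "," fld ":" new                      # rewrite the line with the new value
}
{ print }                                     # print every line, changed or not
' <<'EOF'
17,9:10/8013765024
,10:1
,11:1
23,9:1235455
,11:5
EOF
)
printf '%s\n' "$result"
```

With the sample above, only the ,11:1 line under 17 becomes ,11:2; the ,11:5 line under 23 is left alone.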