How to mark the row based on col value.?

ken6503 · February 9, 2015, 3:53pm

Hi Gurus,

I have requirement to identify the records based on one column value.
the sample file as below:
ID AMT, AMT1

for each ID, there 2 records, if any one record amt1 is 0, the in 4th col add "no", otherwise, 'yes'. the expected result as below:

 
ID, AMT, AMT1, ind
100,10, 2, yes
100,20, 3, yes
200,30, 0, no
200, 40, 0, no
300, 20, 2, yes
300, 50, 2, yes
400, 20, 1, no
400, 60, 0, no

can anybody give me some input for this?

thanks in advance.

Scrutinizer · February 9, 2015, 3:57pm

Try:

awk '{print $0, $3==0?"no":"yes"}' FS=, OFS=", " file

ken6503 · February 9, 2015, 4:17pm

Hi Scrutinizer,

thanks for your quick reply. the result doesn't look correct. please see below result:

 
100, 20, 1, yes
100, 30, 1, yes
200, 30, 2, yes
200, 30, 2, yes
300, 20, 0, no
300, 20, 0, no
400, 20, 0, no
400, 20, 1, yes

the business rule is "for each ID, there 2 records, if any one record amt1 is 0, the in 4th col add "no", for each ID, there 2 records, if any one record amt1 is 0, the in 4th col add "no""
I expected result is

 
100, 20, 1, yes
100, 30, 1, yes
200, 30, 2, yes
200, 30, 2, yes
300, 20, 0, no
300, 20, 0, no
400, 20, 0, no
400, 20, 1, no

disedorgue · February 9, 2015, 5:22pm

Hi,
If your input file has just data lines (no title):

awk 'NR % 2 {A=$1;B=$2;C=$3;next}{($3==0||C==0)?D="no":D="yes";print A,B,C,D;print $1,$2,$3,D}' FS=, OFS=", " file 
100, 10,  2, yes
100, 20,  3, yes
200, 30,  0, no
200,  40,  0, no
300,  20,  2, yes
300,  50,  2, yes
400,  20,  1, no
400,  60,  0, no

Regards.

RavinderSingh13 · February 10, 2015, 12:13am

Hello Ken6503,

Following may also help in same. This solution has been made according to input provided, like all row's (except first row) first column is coming continously 2 times and if any first column comes in odd number then it will only look for the last column and accordingly print .

1st: With your given example.

awk -F, '{if(NR==1){print $0 OFS "VALUE"} else {A=$NF;B=$0;$0="";getline;if($0 ~ /^$/){E=A!=0?B OFS "yes":"no";print E;E=""} else {Q=$NF!=0 && A!=0 ?B OFS "yes" ORS $0 OFS "yes":B OFS "no" ORS $0 OFS "no";print Q;Q=""}}}' OFS=,  Input_file

Output will be as follows.

ID AMT, AMT1,VALUE
100,10, 2,yes
100,20, 3,yes
200,30, 0,no
200, 40, 0,no
300, 20, 2,yes
300, 50, 2,yes
400, 20, 1,no
400, 60, 0,no

2nd: With my tried example, where I have added one extra row at last.

cat test5
ID AMT, AMT1
100,10, 2
100,20, 3
200,30, 0
200, 40, 0
300, 20, 2
300, 50, 2
400, 20, 1
400, 60, 0
700, 60, 3

Code is as follows.

 awk -F, '{if(NR==1){print $0 OFS "VALUE"} else {A=$NF;B=$0;$0="";getline;if($0 ~ /^$/){E=A!=0?B OFS "yes":"no";print E;E=""} else {Q=$NF!=0 && A!=0 ?B OFS "yes" ORS $0 OFS "yes":B OFS "no" ORS $0 OFS "no";print Q;Q=""}}}' OFS=, test5

Output is as follows.

ID AMT, AMT1,VALUE
100,10, 2,yes
100,20, 3,yes
200,30, 0,no
200, 40, 0,no
300, 20, 2,yes
300, 50, 2,yes
400, 20, 1,no
400, 60, 0,no
700, 60, 3,yes

Hope this helps.

Thanks,
R. Singh

Don_Cragun · February 10, 2015, 1:55am

From post #1 in this thread it isn't clear whether your input file contains a header line or not and it appears that you want a two line header in your output file (with the first header line consisting of a single space character).

Rather than working on pairs of lines and starting counting lines on line 1 or line 2, the following code produces the desired header and then groups all sets of one or more adjacent lines that have three fields and contain the same numeric value in the first field adding a "yes" to each line in the set on output if none of the 3rd fields in the set on input is 0 or an empty string, and adding a "no" to each line in the set on output if one or more of the 3rd fields in the set on input is 0 or an empty string:

#!/bin/ksh
awk '
BEGIN {	printf(" \nID, AMT, AMT1, ind\n")
	FS = "[ ,]+"
	OFS = ", "
}
function pd(	i) {
	for(i = 1; i <= cnt; i++)
		print line, ind
	ind = "yes"
	cnt = 0
}
NF == 3 && ($1 + 0) == $1 {
	if($1 != last)
		pd()
	last = $1
	line[++cnt] = $0
	if($3 == 0)
		ind = "no"
}
END {	pd()
}' sample.txt

If sample.txt contains the text you showed in post #1 in this thread optionally containing one or more blank lines and any number of either, both, or neither of the following header lines:

ID, AMT, AMT1

or:

ID AMT, AMT1

and any number of comment lines (starting with a # or an alphabetic character); it produces the output:

 
ID, AMT, AMT1, ind
100,10, 2, yes
100,20, 3, yes
200,30, 0, no
200, 40, 0, no
300, 20, 2, yes
300, 50, 2, yes
400, 20, 1, no
400, 60, 0, no

as you requested.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk , /usr/xpg6bin/awk , or nawk .

This was tested using the Korn shell, but will work with any shell that uses basic Bourne shell syntax.