awk check for equal - help

uwork72 · February 24, 2009, 7:24am

Input file:

x A 10
y A 10
z A 10
x B 10
y B 12
z B 10
x C 0
y C 0

Required output:

x B 10
y B 12
z B 10

That is, print only the section (grouped by the 2nd field) in which the third field ($3) is not the same on every line of that section. Please help.

vgersh99 · February 24, 2009, 7:31am

Hmm..... this is a bit confusing...
why do you have records in red then?

x B 10
y B 12
z B 10

Could you rephrase the description, please?

zaxxon · February 24, 2009, 7:34am

It's easy to produce the desired output, but I didn't understand your explanation.

So you want those having $2 == B and also the whole line being unique?
If so...

grep " B " infile| uniq

uwork72 · February 24, 2009, 7:45am

Thanks Zaxxon and vgersh99 for the reply.

I want to print only the section in which the 3rd field differs between lines (a section is defined by the second field; here A, B and C).

x A 10
y A 10
z A 10
x B 10
y B 12
z B 10
x C 0
y C 0

salil2012 · February 24, 2009, 7:58am

To: uwork72
Do the replies you have received really answer what you actually asked?

Because I feel you are not asking specifically about "B" in column 2, I mean $2.

What I understand is that the grouping depends on column 2, but you have to find the mismatches in column 3.
Am I right?

If yes, we can work it out and discuss.

vgersh99 · February 24, 2009, 8:13am

OK, so what is the expected output for the above?

uwork72 · February 24, 2009, 8:29am

x B 10
y B 12
z B 10

For A, all the third fields are the same (10), and likewise for C (0), but for B the third fields are not all the same, so I want to print the B section.
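
For what it's worth, the requirement as stated here can be sketched with awk alone by reading the input twice. This is only a sketch, assuming whitespace-separated fields with the group key in $2 and the value in $3; the function name print_mixed_groups is made up for illustration and is not from the thread.

```shell
# Hypothetical helper (name is an assumption): print every group, keyed on
# field 2, whose field 3 is not the same on all of its lines.
# The file is passed to awk twice: pass 1 finds the mixed groups,
# pass 2 prints their lines in the original order.
print_mixed_groups() {
    awk '
        NR == FNR {                        # pass 1: first read of the file
            if (!NF) next                  # ignore blank lines
            if (!($2 in first))
                first[$2] = $3             # remember the first $3 of the group
            else if (first[$2] != $3)
                mixed[$2] = 1              # a different $3 was seen
            next
        }
        $2 in mixed                        # pass 2: print lines of mixed groups
    ' "$1" "$1"
}
```

Called as print_mixed_groups file, it should print only the three B lines for the sample input above, since A and C each have a single value in the third field.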

salil2012 · February 24, 2009, 9:43am

OK, you got your answer.

uwork72 · February 24, 2009, 10:44am

Salil, do you have a solution to this? I am not sure why you are putting unnecessary comments on this post. A solution or some help from you would be much appreciated.

rrk001 · February 24, 2009, 1:12pm

Not a very elegant solution, but give it a try.

$ cat file
x a 10
y a 10
z a 11
x b 10
y b 12
z b 10
x c 0
y c 0
x d 1
y d 1
z d 1
w d 2
x e 1
y e 1
z e 1

a, b and d are the groups you want printed from "file".

Printing the required $2 values:

$ sort -u -k 2,3 file | nawk 'NF && $2 == prev && !seen[$2]++ {print $2} {prev = $2}'
a
b
d

and this is your solution....

$ for i in $(sort -u -k 2,3 file | nawk 'NF && $2 == prev && !seen[$2]++ {print $2} {prev = $2}'); do nawk -v var="$i" '$2 == var' file; done
x a 10
y a 10
z a 11
x b 10
y b 12
z b 10
x d 1
y d 1
z d 1
w d 2

cheers.
rrk001

rrk001 · February 24, 2009, 1:16pm

If you don't want to call nawk once for each $2 value that meets the criteria, you can print the required $2 values to a file and read them into an array using getline and split.
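
A rough sketch of that idea, under a few assumptions: the function name print_groups_from_keyfile and the scratch-file argument are invented here for illustration, awk stands in for nawk, and the keys are sorted with LC_ALL=C on fields 2 and 3 separately so that lines with the same $2 end up adjacent. Because the key file holds one key per line, getline alone is enough and split is not needed.

```shell
# Hypothetical helper (names are assumptions, not from the thread).
# $1 = data file, $2 = scratch file that receives the wanted keys.
print_groups_from_keyfile() {
    # Step 1: write each $2 that has more than one distinct $3, once.
    LC_ALL=C sort -u -k2,2 -k3,3 "$1" |
        awk 'NF && $2 == prev && !seen[$2]++ { print $2 } { prev = $2 }' > "$2"

    # Step 2: load the keys into an array with getline, then filter the
    # data file in a single awk run instead of one run per key.
    awk -v keyfile="$2" '
        BEGIN {
            while ((getline key < keyfile) > 0)
                want[key] = 1
            close(keyfile)
        }
        $2 in want
    ' "$1"
}
```

For example, print_groups_from_keyfile file keys.txt would leave the selected keys in keys.txt and print the matching lines of file in their original order.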

summer_cherry · February 26, 2009, 4:17am

Hi, it seems Perl is a little bit easier for handling this.

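#!/usr/bin/perl
use strict;
use warnings;
my (%hash, %h);
# The input file is named "a" here.
open my $fh, '<', 'a' or die "Cannot open a: $!";
while (<$fh>) {
  my @tmp = split(" ", $_);
  next unless @tmp >= 3;            # skip blank or short lines
  $hash{$tmp[1]} .= $_;             # all lines of the group, in input order
  $h{$tmp[1]}->{$tmp[2]} = 1;       # distinct third-field values per group
}
close $fh;
for my $key (sort keys %hash) {
  # Print the group only if it has at least two distinct third fields.
  print $hash{$key} if scalar(keys %{$h{$key}}) > 1;
}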

rrk001 · February 27, 2009, 5:12am

Yes, it really does. That's a compact piece of code. But if you still want to use a shell script, here you go.

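#! /bin/bash
gawk 'BEGIN {
  # Pass 1: record every distinct ($2, $3) pair together with its $2.
  while ((getline < "file") > 0) {
    twothree[$2, $3] = $2
  }
  close("file")
  # A $2 that turns up under a second distinct pair has differing $3 values.
  for (i in twothree) {
    if (twothree[i] in two) {
      filtered[twothree[i]]
    } else {
      two[twothree[i]]
    }
  }
}
# Pass 2: print the lines whose $2 was flagged.
{ if ($2 in filtered) { print $0 } }' file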

cheers!!