Hi All,
Let's say the input looks like:
C1,C2,C3,C4,C5,C6,C7,C8,C9,C10,C11
----------------------------------
1|0123452|C501|Z|Z|Z|E|E|E|E|E|E|E
1|0156123|C501|X|X|X|E|E|E|E|E|E|E
1|0178903|C501|Z|Z|Z|E|E|E|E|E|E|E
1|0127896|C501|Z|Z|Z|E|E|E|E|E|E|E
1|0981678|C501|X|X|X|E|E|E|E|E|E|E
The third column has a value like C501. When the third column has the value C501, I need to count the occurrences of each value ('Z', 'X', or 'Y') in columns C4 through C11, and then print how many times that particular value occurred in each column.
Output should be like this.
C3,C4,C5,C6,C7,C8,C9,C10,C11
----------------------------------
C501|Z=3|Z=3|Z=3|E=5|E=5|E=5|E=5|E=5|3=5
X=2 X=2 X=2
Kindly help me achieve this result.
RudiC
February 8, 2016, 4:38am
2
A bit strange that the header has comma separators but the data have pipes instead.
And, your desired output seems somewhat inconsistent...
Apologies! It was a typo.
RudiC
February 8, 2016, 4:39am
4
And what is 3=5 in the result?
This is my input:
C1|C2|C3|C4|C5|C6|C7|C8|C9|C10|C11
-------------------------------------------
1|0123452|C501|Z|Z|Z|E|E|E|E|E|E|E
1|0156123|C501|X|X|X|E|E|E|E|E|E|E
1|0178903|C501|Z|Z|Z|E|E|E|E|E|E|E
1|0127896|C501|Z|Z|Z|E|E|E|E|E|E|E
1|0981678|C501|X|X|X|E|E|E|E|E|E|E
Expected Output:
C3|C4|C5|C6|C7|C8|C9|C10|C11
----------------------------------
C501|Z=3|Z=3|Z=3|E=5|E=5|E=5|E=5|E=5|3=5
X=2 X=2 X=2
Try:
awk '
NR>2 {                    # skip the header and separator lines
  A[$3]                   # remember each distinct C3 value
  for(i=4; i<=NF; i++) {
    B[$i]                 # remember each distinct data value
    T[$3,$i,i]++          # count occurrences of value $i in column i per C3
  }
}
END {
  for(i in A)             # for every C3 value ...
    for(j in B) {         # ... and every data value, build one output line
      s=i
      for(k=4; k<=NF; k++)
        s=s OFS j "=" ((i,j,k) in T ? T[i,j,k] : 0)
      print s
    }
}
' FS=\| OFS=\| file
Output:
C501|E=0|E=0|E=0|E=5|E=5|E=5|E=5|E=5|E=5|E=5
C501|X=2|X=2|X=2|X=0|X=0|X=0|X=0|X=0|X=0|X=0
C501|Z=3|Z=3|Z=3|Z=0|Z=0|Z=0|Z=0|Z=0|Z=0|Z=0
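A side note on the invocation: FS=\| and OFS=\| appear among the file operands. awk applies such command-line assignments just before reading the operand that follows them, so here they take effect before file is read, equivalent to -F'|' -v OFS='|'. A minimal check (not from the thread; stdin, written as -, stands in for the file):

```shell
# The FS assignment sits among the operands, before '-' (stdin);
# awk applies it before reading that operand, so fields split on '|'.
printf 'a|b|c\n' | awk '{ print $2 }' FS=\| -
# prints: b
```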
--
Note: instead of s=s OFS j "=" ((i,j,k) in T ? T[i,j,k] : 0)
one could use s=s OFS j "=" T[i,j,k]+0
but that uses more memory...
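The extra memory comes from a general awk rule: merely referencing an array element in an expression creates it (with an empty value), whereas the `in` test does not. A minimal sketch (not from the thread) demonstrating why T[i,j,k]+0 grows the array while the `(i,j,k) in T` form does not:

```shell
# Referencing an absent awk array element instantiates it;
# testing membership with `in` leaves the array untouched.
awk 'BEGIN {
  n = A["k"] + 0                          # this reference creates A["k"]
  if ("k" in A) print "created"; else print "absent"
  m = ("x" in B)                          # the in test creates nothing
  if ("x" in B) print "created"; else print "absent"
}'
# prints: created
# prints: absent
```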
Thanks, Scrutinizer. I got the expected result. Thanks again for resolving the issue.