Input
NJ090237_0263_GRP,NJ090237_0263_VIEW,NJ090237_0263_PSGRP,NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE5
NJ090237_0264_GRP,NJ090237_0263_VIEW,NJ090237_0264_PSGRP,NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE5
NJ090233_0263_GRP,NJ090233_0263_VIEW,NJ090233_0263_PSGRP,NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE6
NJ090233_0263_GRP,NJ090233_0263_VIEW,NJ090233_0264_PSGRP,NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE6
Basically, when column 6 is the same on several lines of the input file, combine $1, $2, $3, $4 and $5 from those lines using ";" as the separator,
but if a value in any of columns 1-5 is repeated, keep only one copy of it.
gawk '
{
i=$6
p=(i in A)
}
NR==FNR {
A[i]=A[i] (p?";":x) $1
B[i]=B[i] (p?";":x) $2
C[i]=C[i] (p?";":x) $3
D[i]=D[i] (p?";":x) $4
E[i]=E[i] (p?";":x) $5
next
}
p {
$1=A[i]
$2=B[i]
$3=C[i]
$4=D[i]
$5=E[i]
delete A[i]
delete B[i]
delete C[i]
delete D[i]
delete E[i]
print
}
' FS=, OFS=, input1 input1
I am getting this output. It also joins values that are identical instead of keeping just one copy:
NJ090237_0263_GRP;NJ090237_0264_GRP,NJ090237_0263_VIEW;NJ090237_0263_VIEW,NJ090237_0263_PSGRP;NJ090237_0264_PSGRP,NJ090237_0263_GOLD_CSGRP;NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0;06E:0_08E:0_09E:0_11E:0,0CE5
NJ090233_0263_GRP;NJ090233_0263_GRP,NJ090233_0263_VIEW;NJ090233_0263_VIEW,NJ090233_0263_PSGRP;NJ090233_0264_PSGRP,NJ090233_0263_GOLD_CSGRP;NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0;06E:0_08E:0_09E:0_11E:0,0CE6
But the output needed is:
NJ090237_0263_GRP;NJ090237_0264_GRP,NJ090237_0263_VIEW,NJ090237_0263_PSGRP;NJ090237_0264_PSGRP,NJ090237_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE5
NJ090233_0263_GRP,NJ090233_0263_VIEW,NJ090233_0263_PSGRP;NJ090233_0264_PSGRP,NJ090233_0263_GOLD_CSGRP,06E:0_08E:0_09E:0_11E:0,0CE6
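One possible way to get the needed output, as a minimal sketch rather than a definitive fix: remember which values have already been recorded for each (column 6, column number) pair and append a value only the first time it appears. The function name merge_by_col6 and the array names known, order, seen and merged are made up for this illustration. The sketch assumes every line has exactly six comma-separated fields, reads the file once, and uses only plain POSIX awk features, so gawk should run it unchanged.

```shell
# Hypothetical helper: group lines by column 6 and join the distinct
# values of columns 1-5 with ";". Reads the named files, or stdin.
merge_by_col6() {
    awk -F, -v OFS=, '
    {
        key = $6
        if (!(key in known)) {          # remember keys in order of first appearance
            known[key] = 1
            order[++n] = key
        }
        for (c = 1; c <= 5; c++) {
            if (seen[key, c, $c]++)     # this value is already recorded for this key and column
                continue
            if ((key, c) in merged)
                merged[key, c] = merged[key, c] ";" $c
            else
                merged[key, c] = $c
        }
    }
    END {
        for (k = 1; k <= n; k++) {
            key = order[k]
            line = ""
            for (c = 1; c <= 5; c++)
                line = line merged[key, c] OFS
            print line key              # column 6 goes last, unchanged
        }
    }' "$@"
}
```

It would be called as merge_by_col6 input1. Because the output is printed in the END block, lines come out in the order in which each column 6 value first appears, and lines with the same column 6 value do not have to be adjacent in the input.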