For the given input, Yoda's script produces the desired output. However, when it combines F4 and F5 with the current line's $4 and $5, respectively, it produces duplicate entries in the output list unless the newly added entries are adjacent to the entry with the same value in the previous line. It also treats the space after a comma as part of a name. The specification doesn't make clear whether that is intended, but the separator in the lists in the sample input appears to be a comma followed by a space rather than just a comma.
As an example, if the last line in the input file is changed from:
far;1;5;[shekhar];[venk, raj]
to:
far;1;5;[shekhar];[raj, venk]
the last line of the output will be:
far;1;8;[shekhar];[venk,raj, venk]
instead of:
far;1;8;[shekhar];[raj, venk]
And, if the following lines are in the input file:
plus;1;2;[u1, u2];[g1,g2]
plus;1;2;[u2];[g1]
it produces:
plus;2;4;[u1, u2,u2];[g1,g2,g1]
while I would have thought the desired output was:
plus;2;4;[u1, u2];[g1, g2]
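The effect of the separator can be seen directly with awk's split(). The following is only a small sketch (split_demo is a made-up helper name, not part of either script): it splits the same list once on a bare comma and once on a comma followed by any number of spaces, and prints the last element each way.

```shell
# split_demo is a hypothetical helper name used only for this illustration.
split_demo() {
    printf '%s\n' "$1" | awk '{
        n = split($0, bare, ",")       # separator: a comma only
        m = split($0, loose, /, */)    # separator: a comma plus any following spaces
        printf "comma only: [%s]\n", bare[n]
        printf "comma and spaces: [%s]\n", loose[m]
    }'
}

split_demo 'g1,g2, g1'
```

With the bare comma, the last element keeps its leading space, so it does not compare equal to the first "g1"; with the regular expression separator it does.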
Yoda's code also assumes that all lines that need to be combined will be adjacent in the input file. That is true in the sample input, but the specification doesn't say that it will always be the case.
Here is an alternative awk script that you may want to consider:
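awk '
# Return the list in ins with duplicates removed, keeping first-seen order.
# LOCAL is an unused marker; the parameters after it act as local variables.
function combine(ins, LOCAL, a, i, j, n, os) {
    n = split(ins, a, /, */)
    os = a[1]
    for(i = 2; i <= n; i++) {
        for(j = 1; j < i; j++)
            if(a[i] == a[j]) break
        if(j >= i) os = os ", " a[i]
    }
    return os
}
BEGIN { FS = OFS = ";" }
{   # Look up the output slot for this name, or assign the next one.
    if($1 in order) i = order[$1]
    else F1[i = order[$1] = ++oc] = $1
    F2[i] += $2
    F3[i] += $3
    # Strip the square brackets, then merge the lists for this name.
    gsub(/[][]/, "", $4)
    if(F4[i] == "") F4[i] = $4
    else if($4 != "") F4[i] = combine(F4[i] "," $4)
    gsub(/[][]/, "", $5)
    if(F5[i] == "") F5[i] = $5
    else if($5 != "") F5[i] = combine(F5[i] "," $5)
}
END {   # Print one line per name, in order of first appearance.
    for(i = 1; i <= oc; i++)
        print F1[i], F2[i], F3[i], "[" F4[i] "]", "[" F5[i] "]"
}' file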
With the input file:
admin;2;0;[yrral];[]
admission;8;0;[timlu];[]
aman;1;0;[ev];[]
caroline;0;4;[];[luis, asethi]
cook;0;4;[];[shekhar, raj]
cook;2;0;[lew];[]
far;0;3;[];[venk]
far;1;5;[shekhar];[raj, venk]
plus;1;2;[u1, u2];[g1,g2]
plus1;1;1;[u1];[]
plus2;0;3;[u2];[raj, venk, g3]
plus2;1;5;[shekhar];[venk, raj]
plus1;1;1;[u2];[g1, g2, g3]
plus1;1;1;[u1];[g2, g4]
plus2;0;3;[u1];[g1, g2, g3]
plus;1;2;[u2];[g1]
this script produces:
admin;2;0;[yrral];[]
admission;8;0;[timlu];[]
aman;1;0;[ev];[]
caroline;0;4;[];[luis, asethi]
cook;2;4;[lew];[shekhar, raj]
far;1;8;[shekhar];[venk, raj]
plus;2;4;[u1, u2];[g1, g2]
plus1;3;3;[u1, u2];[g1, g2, g3, g4]
plus2;1;11;[u2, shekhar, u1];[raj, venk, g3, g1, g2]
With this same input, Yoda's script produces:
admin;2;0;[yrral];[]
admission;8;0;[timlu];[]
aman;1;0;[ev];[]
caroline;0;4;[];[luis, asethi]
cook;2;4;[lew];[shekhar, raj]
far;1;8;[shekhar];[venk,raj, venk]
plus;1;2;[u1, u2];[g1,g2]
plus1;1;1;[u1];[]
plus2;1;8;[u2,shekhar];[raj, venk, g3,venk, raj]
plus1;2;2;[u2,u1];[g1, g2, g3,g2, g4]
plus2;0;3;[u1];[g1, g2, g3]
plus;1;2;[u2];[g1]
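The duplicate entries in Yoda's output above (for example "venk,raj, venk") are what the combine() function in the alternative script is meant to remove. As a rough standalone check, the sketch below wraps just that function in a shell helper (dedupe_list is a made-up name for this illustration) and feeds it two of the merged lists from the examples.

```shell
# dedupe_list is a hypothetical wrapper name used only for this illustration.
dedupe_list() {
    printf '%s\n' "$1" | awk '
    # Return the comma-separated list in ins with later duplicates dropped,
    # keeping first-seen order and joining the survivors with ", ".
    function combine(ins,    a, i, j, n, os) {
        n = split(ins, a, /, */)
        os = a[1]
        for (i = 2; i <= n; i++) {
            for (j = 1; j < i; j++)
                if (a[i] == a[j]) break
            if (j >= i) os = os ", " a[i]   # no earlier match: keep a[i]
        }
        return os
    }
    { print combine($0) }'
}

dedupe_list 'venk,raj, venk'       # prints: venk, raj
dedupe_list 'g1, g2, g3,g2, g4'    # prints: g1, g2, g3, g4
```

Each entry is compared against all earlier entries, which is quadratic in the list length. That should be fine for short lists like these, but it is an assumption about the data rather than something the specification states.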