Concatenate all duplicate lines in a file

If you have DOS carriage returns in the file, that could perhaps also explain the breaks.

TIMTOWTDI...

awk -F\| '{
   if (k[$1])
       k[$1] = sprintf("%s|%s|%s", k[$1], $2, $3)
   else
       k[$1] = sprintf("%s|%s|%s", $1, $2, $3)
} END {
   for (i in k)
       print k[i]
}' file

I am very new to shell scripting, and I am facing a big problem parsing a flat file.

My input file format is as below:

794051400123|COM|21|0|BD|R|99.98
794051413727|COM|11|0|BD|R|28.99
794051415622|COM|22|0|BD|R|28.99
883929004676|COM|33|0|BD|R|28.99
794051400123|MOM|24|0|BD|R|99.98
794051413727|MOM|11|0|BD|R|28.99
794051415622|MOM|23|0|BD|R|28.99
883929004676|MOM|01|0|BD|R|28.99
794051400123|RNO|50|0|BD|R|99.98

Currently the file contains duplicate values in the first field.
What I want is for the first field to be unique on each line, with
the other fields from the duplicate lines merged into that line.

My desired output file format is as below:

794051400123,BD,R,99.98,COM=21,MOM=24,RNO=50

I am using the following piece of code for the purpose

awk -F\| '{
if (k[$1])
k[$1] = sprintf("%s,%s=%s", k[$1],$2,$3)
else
k[$1] = sprintf("%s,%s=%s", $1,$2,$3)
} END {
for (i in k)
print k[i]
}' input.txt > out.txt

exit 0

but it is not working for me. Please help me.

The script only captures the first, second, and third fields. You need to copy all the fields you want in the output.

awk -F '|' '{ k[$1] = (k[$1] ? k[$1] : $1 "," $4 "," $5 "," $6 "," $7) "," $2 "=" $3 }
END { for (i in k) print k[i] }' file

era's code works too though it has one extra field in it.

awk -F\| '{
   if (k[$1])
       k[$1] = sprintf("%s,%s=%s",k[$1],$2,$3)
   else
       k[$1] = sprintf("%s,%s,%s,%s,%s=%s",$1,$5,$6,$7,$2,$3)
} END {
   for (i in k) print k[i]
}' file
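
One more thing worth knowing: the order in which `for (i in k)` visits the keys is unspecified in awk, so the output lines may not come out in input order. If the order matters to you, a rough sketch along these lines should keep first-seen order. The names merge_by_key, order and n are made up for illustration, and the choice of $5, $6, $7 is taken from the desired output posted above.

```shell
# Sketch only: merge lines that share the first field, keeping first-seen order.
# merge_by_key is a hypothetical helper; it reads pipe-delimited records on stdin.
merge_by_key() {
    awk -F'|' '
    {
        if (!($1 in k)) {
            order[++n] = $1                      # remember the order keys first appear in
            k[$1] = $1 "," $5 "," $6 "," $7      # key plus the shared fields, once per key
        }
        k[$1] = k[$1] "," $2 "=" $3              # append TYPE=VALUE from every line
    }
    END {
        for (j = 1; j <= n; j++)
            print k[order[j]]
    }'
}
```

Usage would be something like `merge_by_key < input.txt > out.txt`.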

I have changed my code

awk -F\| '{
if (k[$1])
k[$1] = sprintf("%s,%s,%s,%s,%s=%s",k[$1],$5,$6,$7,$2,$3)
else
k[$1] = sprintf("%s,%s,%s,%s,%s=%s", $1,$5,$6,$7,$2,$3)
} END {
for (i in k)
print k[i]
}'

I got this output:

794051400123,BD,R,99.98^M,COM=21,BD,R,99.98^M,MOM=24

Also, some junk character appears after 14.99.

I told you before you should check for DOS carriage returns. Those are the reason the earlier script didn't work correctly.

Shamrock's code was correct for the k[$1] case; you added too many fields there. You should only be adding $2 and $3 from the duplicate lines.

Are you going back and forth between UNIX and DOS? The Ctrl-M (^M) characters in your output are indicative of mixing UNIX and DOS.
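
If you are not sure whether the input file has them, here is a quick check, as a sketch. count_cr_lines is just a name I picked, and it assumes your awk understands `\r` inside a regex (POSIX awk should). Running `od -c input.txt | head` is another way to look: the carriage returns show up as `\r` right before each `\n`.

```shell
# Sketch only: print how many lines end in a carriage return (Ctrl-M).
# count_cr_lines is a hypothetical helper; it reads the named files, or stdin if none.
count_cr_lines() {
    awk '/\r$/ { n++ } END { print n + 0 }' "$@"
}
```

A result of 0 from `count_cr_lines input.txt` would mean there are no DOS line endings; anything else means the last field on those lines carries a trailing ^M.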

Can you give me any idea how I can remove the Ctrl-M (^M) characters from my output file?

Search these forums, the question gets asked every day. A good search term is dos2unix -- you might even have a command with that name installed on your system. If not, searching for it will bring up threads by others who didn't have it either and wanted a different solution.
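
In case dos2unix is not installed, one common alternative is plain `tr`. This is only a minimal sketch, and strip_cr is a name invented here. Note that it deletes every carriage return in the stream, not just the ones at line ends, which I am assuming is fine for a flat data file like yours.

```shell
# Sketch only: remove all carriage returns (Ctrl-M) from stdin.
# strip_cr is a hypothetical helper name.
strip_cr() {
    tr -d '\r'
}

# Example use (file names are placeholders):
#   strip_cr < input.txt > input.unix.txt
```

It is probably better to clean the input file first and then run the awk script on the cleaned copy, rather than cleaning the output afterwards.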

Thanks!!!
I have finally solved my problem.