prvnrk
July 10, 2008, 10:45am
1
Hi,
I have a text file with 2000 rows and up to 2000 columns (the number of columns may vary from row to row), with a comma as the delimiter.
Each row may contain a few duplicate values. We need to remove those duplicates and shift the subsequent values left.
Example:
111 222 111 555
444 999 666 999 777
The output must look like this:
111 222 555
444 999 666 777
TIA
Prvn
Use nawk or /usr/xpg4/bin/awk on Solaris:
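awk -F, '{
  n = 0
  for (f = 1; f <= NF; f++)
    if (!_[$f]++)                          # first time this value is seen in the row
      printf "%s%s", (n++ ? FS : ""), $f   # separator only between kept fields
  print ""                                 # end the row even if the last field was a duplicate
  split("", _)                             # clear the array before the next row
}' input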
With GNU Awk you can use delete _ instead of split.
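For reference, a minimal sketch of the delete variant mentioned above, wrapped in a shell function so it can be piped into. The function name dedupe_fields and the array name seen are made up for illustration and are not from the thread. It assumes an awk that accepts delete on a whole array (GNU Awk does; on an older awk, fall back to split("", seen)).

```shell
# dedupe_fields is a hypothetical helper name, not something from the thread.
# Reads comma-separated rows on stdin, drops repeated values within each row,
# and keeps the first occurrence of each value in its original position.
dedupe_fields() {
  awk -F, '{
    n = 0
    for (f = 1; f <= NF; f++)
      if (!seen[$f]++)                       # true only the first time a value appears in this row
        printf "%s%s", (n++ ? FS : ""), $f   # separator before every kept field except the first
    print ""                                 # terminate the row
    delete seen                              # reset the lookup table for the next row
  }'
}

# Sample rows from the question, written with the real (comma) delimiter.
printf '111,222,111,555\n444,999,666,999,777\n' | dedupe_fields
```

Because the separator is written before each kept field rather than after it, a row whose last field is a duplicate still ends cleanly, with no trailing comma.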
nawk -f prv.awk myFile.txt
prv.awk:
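{
  n = 0
  for (i = 1; i <= NF; i++) {
    if ($i in arr) continue               # value already printed for this row
    printf("%s%s", (n++ ? OFS : ""), $i)  # no trailing separator at end of row
    arr[$i]                               # referencing the element marks it as seen
  }
  printf "%s", ORS
  split("", arr)                          # clear the array before the next row
}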
prvnrk
July 10, 2008, 11:19am
4
Thanks for your replies.
Vgersh - Your solution worked for the example, since it uses a space as the delimiter. I'm sorry that I did not use commas in the example. The actual delimiter is a comma, as mentioned in the post.
Please advise.
Prvn
With Perl:
perl -F, -lane'$, = ",";
print grep !$_{$_}++, @F;
undef %_' input
prvnrk
July 10, 2008, 1:06pm
7
Thanks radoulov,
Your awk solution worked great!
Prvn