I would like to delete duplicated chunks of strings that appear within the same row.
Each chunk consists of four lines:
path name
starting point
ending point
voltage number
Within a row, a chunk counts as a duplicate if its "ending point" matches that of an earlier chunk in the same row.
For example, in the first row the ending points of the first and second chunks are the same, so I would like to keep only the first chunk and remove the second.
In the second row, the ending points of the first and third chunks are the same, so the first chunk is kept and the third is removed.
I have posted the same question on another website, and somebody replied, but the suggested solutions did not work correctly. Any help is appreciated.
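To make the rule concrete, here is a minimal sketch on made-up data (the path names and values below are invented for illustration and are not the original sample input). It assumes one row is laid out as four lines, with column N of each line belonging to chunk N, and it handles only a single row. The function name dedupe_row is hypothetical.

```shell
# Hypothetical sample: one row of three chunks, laid out as four lines
# (path name, starting point, ending point, voltage number).
sample='pathA pathB pathC
s1 s2 s3
e1 e1 e2
1.0 1.1 1.2'

# Keep a column only the first time its ending point (line 3) is seen.
dedupe_row() {
    awk '
    { for (i = 1; i <= NF; i++) f[NR, i] = $i; if (NR == 3) n = NF }
    END {
        for (i = 1; i <= n; i++)
            if (!(f[3, i] in seen)) { seen[f[3, i]]; keep[++k] = i }
        for (r = 1; r <= 4; r++) {
            line = ""
            for (i = 1; i <= k; i++)
                line = line (i > 1 ? " " : "") f[r, keep[i]]
            print line
        }
    }'
}

printf '%s\n' "$sample" | dedupe_row
```

With this sample, the second chunk shares the ending point e1 with the first, so only the columns for pathA and pathC should remain.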
Instead of trying to get multiple websites to act as your unpaid programming staff, why don't you show us how you have tried to solve this problem on your own? If you can show us what you have tried, maybe we can help you fix it.
We have helped you with 8 other awk scripts in the last six months. Can't you use the examples provided by those scripts to get a good start on what you need here?
Save as chunks.pl
Run as perl chunks.pl chunks.data
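#!/usr/bin/perl
#
# Each group of four input lines is one row of chunks; column N of the four
# lines makes up chunk N. Keep only the first chunk for each ending point.
use strict;
use warnings;

my @chunks;     # the four lines of the current row, each split into fields
my $lines = 0;  # number of lines read so far for the current row

while (<>) {
    my @parts = split;
    push @{$chunks[$lines++]}, @parts;
    if ($lines == 4) {
        my %seen;
        my $count = 0;
        my @keep;   # column indexes whose ending point is seen for the first time
        for my $i (@{$chunks[2]}) {     # the third line holds the ending points
            !$seen{$i}++ and push @keep, $count;
            ++$count;
        }
        for my $i (@chunks) {
            my @returns;
            for my $j (@keep) {
                push @returns, $i->[$j];
            }
            print "@returns\n";
        }
        clean();
    }
}

# Reset the state before reading the next row.
sub clean {
    @chunks = ();
    $lines = 0;
}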
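awk '
{	# Store every field of the current line; line 4 of a group lands in row 0.
	for(i = 1; i <= NF; i++) {
		f[NR % 4, i] = $i
	}
}
!(NR % 4) {
	# Record the columns whose ending point (row 3) is seen for the first time.
	ocnt = 0
	for(i = 1; i <= NF; i++)
		if(!(f[3, i] in of)) {
			of[f[3, i]]
			spot[++ocnt] = i
		}
	# Build the four output lines from the kept columns only.
	for(i = 1; i <= ocnt; i++)
		for(j = 1; j <= 4; j++) {
			ol[j] = ol[j] f[j % 4, spot[i]] ((i == ocnt) ? "" : "\t")
		}
	for(i = 1; i <= 4; i++) {
		print ol[i]
		delete ol[i]
	}
	# Forget the ending points before the next group of four lines.
	for(i in of)
		delete of[i]
}' input.txt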
Something like the awk script above might work better for you. It uses a single tab character as the output field separator instead of a seemingly random number of spaces (but you can easily change it to a fixed number of spaces if you want to).
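If fixed-width columns are preferred over tabs, one possible approach is to post-process the tab-separated output. This is only a sketch: the function name pad_fields and the width of 12 are arbitrary choices for illustration, and it assumes no field is longer than the chosen width.

```shell
# Hypothetical helper: left-justify every field in a fixed-width column.
# The width (12) is an arbitrary choice; adjust it to fit the longest field.
pad_fields() {
    awk -v w=12 '{
        line = ""
        for (i = 1; i <= NF; i++)
            line = line sprintf("%-" w "s", $i)
        sub(/ +$/, "", line)    # drop the padding after the last field
        print line
    }'
}

# Example: pad two tab-separated lines.
printf 'pathA\tpathC\ns1\ts3\n' | pad_fields
```

The output of the earlier awk script could be piped through pad_fields in the same way.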
From the sed command you're using, I assume that you're not running this on a Solaris system, but if someone else wants to try the above code on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.