How to replicate data using uniq or awk

Hi,

I have a scenario with two classes: apple and orange.

1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange

Basically, I have 3 entries for apple in the file and 2 entries for orange. I'm trying to edit the file and find a way to replicate the orange data so that it also has 3 entries.

Desired output:
1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange
1,1,1,1,1,1,orange

This would balance the number of lines containing apple and orange.
I have tried using uniq but can't figure out how to go further from there.

Please advise. Thanks.

How do you decide which "orange" line to duplicate? Is it always the first one?

Will it always be 3 and 2, or do those quantities vary? Is there other data in the file as well, or is that everything in the file?

Do you mean you want to replicate line 4 as line 6?
If so, use:
head -4 filename | tail -1 >> filename
This appends a copy of line 4 to the end of your file, where it becomes line 6.
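
If the line number is not known in advance, a rough sketch along the same lines is to look the line up by its class instead. This assumes the class label is always the last comma-separated field; dup_first is a made-up helper name, not a standard command.

```shell
# dup_first FILE CLASS (hypothetical helper)
# Appends a copy of the first line whose last field equals CLASS to FILE.
dup_first() {
    file=$1
    class=$2
    # Capture the line first so the file is not read and appended to at the same time.
    line=$(awk -F, -v c="$class" '$NF == c { print; exit }' "$file")
    if [ -n "$line" ]; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

For example, dup_first filename orange appends one copy of the first orange line; calling it again appends that same line once more.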

Hi,

How do you decide which "orange" line to duplicate? Is it always the first one?
> It always takes from the first one.
E.g. if the data is:

1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange

Then the orange data set is repeated, starting from the first occurrence of orange, until orange has the same number of entries as apple:
1,2,3,4,5,6,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,0,4,2,3,apple
1,3,3,3,3,4,apple
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange
1,1,1,1,1,1,orange
1,2,3,1,1,1,orange
1,1,1,1,1,1,orange

Will it always be 3 and 2, or do those quantities vary? Is there other data in the file as well, or is that everything in the file?
>The number can be 0, 1, ..., 100: any integer, but no negative numbers.
The real data contains more than 6 comma-separated numbers per line; there can be up to hundreds. But I think it can be handled the same way as this small example?

Thanks.

Try this:

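awk -F, '
        $NF == "apple" { applecount++ }                   # class label is the last field
        $NF == "orange" { orangedata[++orangecount]=$0 }  # remember every orange line
        1 # print the line
        END {
                # cycle through the orange lines from the first one until the counts match
                for (i=orangecount; orangecount>0 && i<applecount; i++) {
                        print orangedata[(i%orangecount)+1]
                }
        }
' filename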
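
In case there are more than two classes, or you don't know beforehand which class is the smaller one, the same cycling idea can be stretched a bit. This is only a sketch under a few assumptions: the class label is the last comma-separated field, blank lines can be ignored, and it is fine for the output to be grouped by class in first-seen order. The function name balance is made up for the example.

```shell
# balance FILE (hypothetical helper)
# Prints every class padded to the size of the largest class by
# repeating that class's lines in order, starting from its first line.
balance() {
    awk -F, '
        NF == 0 { next }                                   # skip blank lines
        {
            if (!($NF in count)) order[++nclass] = $NF     # remember first-seen class order
            data[$NF, ++count[$NF]] = $0                   # store the line under its class
            if (count[$NF] > max) max = count[$NF]         # track the largest class
        }
        END {
            for (c = 1; c <= nclass; c++) {
                k = order[c]
                for (i = 0; i < max; i++)
                    print data[k, (i % count[k]) + 1]      # wrap around when a class runs out
            }
        }
    ' "$1"
}
```

Run it as balance filename > newfile. Unlike the script above, it holds the whole file in memory and regroups the lines by class, so it only suits input where the original interleaving does not matter.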

Try the Perl script below:

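use strict;
use warnings;

# Return a copy of the array extended to $num elements by cycling from the start.
sub RepeatArray {
	my ($ref, $num) = @_;
	my @arr = @$ref;
	my $len = @arr;
	for (my $i = $len; $i < $num; $i++) {
		$arr[$i] = $arr[$i % $len];
	}
	return \@arr;
}

my $file = shift or die "usage: $0 file\n";
open(my $fh, '<', $file) or die "cannot open $file: $!\n";
my (%lines, @order);
while (my $line = <$fh>) {
	chomp $line;
	next if $line eq '';
	my $class = (split /,/, $line)[-1];          # class label is the last field
	push @order, $class unless $lines{$class};   # remember first-seen class order
	push @{ $lines{$class} }, $line;
}
close($fh);

# The largest class sets the target size for every class.
my ($max) = sort { $b <=> $a } map { scalar @$_ } values %lines;
for my $class (@order) {
	print "$_\n" for @{ RepeatArray($lines{$class}, $max) };
}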