Hi, my dilemma is this:
For example, I have a file, fruit.txt, which contains:
Apple 6
Apple_new 7
old_orange 9
orange 10
Is there any way for me to have an output of
Apple 13
Orange 19
using a shell script?
If the names were exact duplicates it would be straightforward, but I'm lost as to what I can do in this scenario.
Any help is greatly appreciated.
Is this a homework question?
What is your "real world problem"?
It's related to work. I have a system-generated file that contains entries similar to those above. I need to summarize it into something like the output I showed.
My algorithm so far is to grab the first entry and store it in a file, then grab the second entry and search the file for it. If the entry already exists, just grab the number and add it to the running total for that entry. If the entry doesn't exist, store it in the file. And so on and so forth.
The issue is that the entries are not identical, and the pattern is not consistent.
Any ideas?
Cheers.
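The store-and-look-up approach described above can be sketched with an awk associative array instead of a temporary file. This is only a rough sketch: it assumes whitespace-separated "name number" lines and handles exact duplicate names only, and sum_by_name is a made-up helper name.

```shell
#!/bin/sh
# Sum the second column for each exact name in the first column.
# sum_by_name is a hypothetical helper; it reads the named files, or stdin if none are given.
sum_by_name() {
    awk 'NF >= 2 { total[$1] += $2 }   # add the number to the running total for this name
         END { for (name in total) print name, total[name] }' "$@" | sort
}

# Example input with exact duplicate names
printf '%s\n' 'Apple 6' 'Apple 7' 'orange 9' 'orange 10' | sum_by_name
```

The array plays the role of the lookup file: a name seen for the first time starts at zero, and later occurrences add to the same slot.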
It's going to be difficult to find a balanced solution; your data is not consistent...
Regards
Hmm.
OK, I went through the generated file again and there seems to be a pattern: the first 60% of each name is the same, e.g.
Apple_new_101 15
Apple_newMandarin 6
OrangeMango_new 6
OrangeMango_old 5
Algorithm:
1. Grab the first 60% of the name and store it in temp_name.
2. Create a file to hold the filtered list, e.g. filtered.txt.
3. Search filtered.txt to see whether temp_name already exists in the file.
4. If it does, grab the number and add it to the existing total for that name.
5. If it doesn't, store the name and number, and proceed with the next entry.
Does this sound feasible? It's just that I'm not fluent with shell scripts.
Thanks.
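Taken literally, the steps above could look something like the sketch below. It assumes "60% of the name" means 60% of each name's own length, rounded down, and it uses an awk array in place of filtered.txt; sum_by_prefix is a made-up helper name.

```shell
#!/bin/sh
# Group entries by the first 60% of each name (rounded down) and sum the numbers.
# sum_by_prefix is a hypothetical helper; the 60% figure comes from the post above.
sum_by_prefix() {
    awk 'NF >= 2 {
             keylen = int(length($1) * 6 / 10)   # 60% of the name length, rounded down
             key = substr($1, 1, keylen)         # this is the temp_name from step 1
             total[key] += $2                    # steps 3 to 5: add to or create the entry
         }
         END { for (key in total) print key, total[key] }' "$@" | sort
}

printf '%s\n' 'Apple_new_101 15' 'Apple_newMandarin 6' \
              'OrangeMango_new 6' 'OrangeMango_old 5' | sum_by_prefix
```

With this reading the two OrangeMango entries merge under the key OrangeMan with a total of 11, but the two Apple entries end up under different keys (Apple_n and Apple_newM) because the names differ in length. A fixed prefix length or an explicit delimiter would probably group more predictably.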
---------- Post updated at 10:36 AM ---------- Previous update was at 09:15 AM ----------
Actually, that's fine, leave it for now.
I think it will be easier to get the data grouped properly first and work from there.
Thanks for the help.
Something like this?
awk -F'[_ ]' 'NF{a[$1]+=$NF;next}END{for(i in a)print i,a[i]}' file
I assume your post shows only sample data, so it is really up to you to decide the criteria for grouping entries into one. If it is simply a matter of 'old' and 'new' markers, then maybe the Perl script below can help:
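use strict;
use warnings;

my %hash;
while (<DATA>) {
    chomp;
    my @tmp = split;
    next unless @tmp >= 2;            # skip blank or malformed lines
    $tmp[0] =~ s/_?(old|new)_?//;     # drop the first old/new marker and any adjoining underscore
    $hash{$tmp[0]} += $tmp[1];        # accumulate the total per cleaned-up name
}
foreach my $key (sort keys %hash) {
    print $key, " ", $hash{$key}, "\n";
}
__DATA__
Apple 6
Apple_new 7
old_orange 9
orange 10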
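Since the original question asked for a shell script, here is a rough awk-based sketch of the same idea as the Perl script above. It assumes the marker only ever appears as a leading old_ or new_, or as a trailing _old or _new, which is stricter than the Perl regex (that one removes the first "old" or "new" anywhere in the name). strip_and_sum is a made-up helper name.

```shell
#!/bin/sh
# Strip a leading old_/new_ or trailing _old/_new marker from each name, then sum per name.
# strip_and_sum is a hypothetical helper; it reads the named files, or stdin if none are given.
strip_and_sum() {
    awk 'NF >= 2 {
             name = $1
             sub(/^(old|new)_/, "", name)   # leading marker, e.g. old_orange
             sub(/_(old|new)$/, "", name)   # trailing marker, e.g. Apple_new
             total[name] += $2
         }
         END { for (name in total) print name, total[name] }' "$@" | sort
}

printf '%s\n' 'Apple 6' 'Apple_new 7' 'old_orange 9' 'orange 10' | strip_and_sum
```

Names are compared case-sensitively and printed as they appear in the input, so the sample data gives Apple 13 and orange 19 (lower-case "orange"). Against the real file it would be called as strip_and_sum fruit.txt.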