[Solved] I need help with a text file.

johankor · October 20, 2011, 5:55am

Hello everyone, I need to write a shell script for a file consisting of 3 columns: the first column is frequency, the second is power, and the last is the number of occurrences. I basically need to get the power and the frequency corresponding to the highest number of occurrences. Below is part of my file.

 0.5000  -142   8 0.5000  -143   19 0.5000  -144   14 0.5000  -145   102 0.5000  -146   122 0.5000  -147   106 0.5452  -96     13 0.5452  -97     14 0.5452  -105   3 0.5452  -110   3 .... .... 16.000 -162   87 16.000 -163   115 16.000 -164   151 16.000 -165  77 .....

So at the end I would like to get something like this:

 0.5000 -146 122 0.5452  -97  14 .... 16.000 -164 151 .....

Can you help me out with this problem?

---------- Post updated at 12:55 PM ---------- Previous update was at 12:46 PM ----------

Sorry, somehow the text was not aligned correctly.
Here is the sample file:

0.5000  -142   8 
0.5000  -143   19 
0.5000  -144   14 
0.5000  -145   102 
0.5000  -146   122 
0.5000  -147   106 
0.5452  -96     13 
0.5452  -97     14 
0.5452  -105   3 
0.5452  -110   3 
.... .... 
16.000 -162   87 
16.000 -163   115 
16.000 -164   151 
16.000 -165  77 
.....

And the file I want is:

0.5000 -146 122 
0.5452  -97  14 
.... 
16.000 -164 151 
.....

munkeHoller · October 20, 2011, 6:10am

Please restate your requirements; they are unclear.
Does your file look like the one below, or is it a single continuous stream unbroken by newlines, as in your example?
if it looks like the below, does your statement

which column is power, frequency,occurrence ...

....
0.5000 -142 8
0.5000 -143 19
0.5000 -144 14
0.5000 -145 102
0.5000 -146 122
0.5000 -147 106
0.5452 -96 13
0.5452 -97 14
0.5452 -105 3
0.5452 -110 3
.... ....
16.000 -162 87
16.000 -163 115
16.000 -164 151
16.000 -165 77
.....

---------- Post updated at 11:10 AM ---------- Previous update was at 11:05 AM ----------

hmmm

given your request and the following dataset

i would expect

0.5000 -147 122
0.5452 -110 14
16.000 -165 151

not

so, please restate to avoid ambiguity
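
For anyone following the ambiguity here: one reading that reproduces the three lines munkeHoller expected is to take, for each frequency, the lowest power and the highest count independently of each other. That is only a guess at the interpretation, not something stated in the post. A minimal sketch under that assumption (the function name `column_extremes` is made up for illustration, and plain `awk` is used instead of `nawk`):

```shell
#!/bin/sh
# Sketch: per frequency (column 1), report the lowest power (column 2)
# and the highest count (column 3), each taken on its own.
# column_extremes is a hypothetical helper name.
column_extremes() {
    awk '
        NF == 3 && $3 ~ /^[0-9]+$/ {            # skip placeholder lines such as "...."
            if (!($1 in cnt)) {                 # first time this frequency is seen
                order[++n] = $1
                pw[$1]  = $2 + 0
                cnt[$1] = $3 + 0
            }
            if ($2 + 0 < pw[$1])  pw[$1]  = $2 + 0   # lowest power so far
            if ($3 + 0 > cnt[$1]) cnt[$1] = $3 + 0   # highest count so far
        }
        END { for (i = 1; i <= n; i++) print order[i], pw[order[i]], cnt[order[i]] }
    ' "$1"
}
# usage: column_extremes infile
```

The power and count printed on one line may come from different input rows, which is why this result differs from the one johankor asked for.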

johankor · October 20, 2011, 6:42am

Thank you for your reply. It is my first post in this forum, so I made some mistakes writing out the text file. The file is not a single continuous stream; it is broken by newlines and has 3 columns: the first column is the frequency, the second is the power, and the third is the number of times that power value is repeated. For each frequency, I would like to obtain the power value with the most repeats. So in my file, for frequency 0.5000 the most-repeated power is -146, since it is repeated 122 times (third column).

So my new file should look like this:

0.5000 -146 122
0.5452 -97 14
16.000 -164 151

I hope this explains better.

jayan_jay · October 20, 2011, 6:47am

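# For each distinct frequency (column 1), keep the line with the highest count (column 3).
for i in `nawk '{print $1|"sort -u"}' infile`
do
     # grep treats "$i" as a regular expression anchored at the start of the line
     grep "^$i" infile | nawk '$3 > max {max=$3; maxline=$0}; END{ print maxline}'
done

Output:

0.5000  -146   122
0.5452  -97     14
16.000 -164   151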
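
The loop above rereads infile once per frequency. The same per-frequency maximum can probably be had in a single pass by remembering the best line seen so far for each value of column 1. This is a sketch of that alternative rather than jayan_jay's method; the function name `max_per_freq` is invented for illustration, plain `awk` stands in for `nawk`, and the output is normalized to single spaces.

```shell
#!/bin/sh
# Sketch: for each frequency (column 1), print the row whose count (column 3)
# is largest. On ties the first such row wins.
# max_per_freq is a hypothetical helper name.
max_per_freq() {
    awk '
        NF == 3 && $3 ~ /^[0-9]+$/ {                 # skip placeholder lines such as "...."
            if (!($1 in max) || $3 + 0 > max[$1]) {
                if (!($1 in max)) order[++n] = $1    # remember first-seen order
                max[$1]  = $3 + 0
                line[$1] = $1 " " $2 " " $3
            }
        }
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$1"
}
# usage: max_per_freq infile
```

Comparing `$1` as an array key avoids the regular-expression matching that `grep "^$i"` does, so a frequency such as 0.5000 is only ever grouped with lines whose first field is exactly that text.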

johankor · October 20, 2011, 7:53am

Thank you very much jayan_jay. It worked great

gvj · October 20, 2011, 8:09am

I have to do the opposite of what you want: convert rows to columns.

zaxxon · October 20, 2011, 8:11am

@gvj
No need to advertise in other threads, closing thread.