Extract data based on a match against one column from a long list of data

patrick87 · November 16, 2009, 4:55am

My input file:
data_5 Ali 422 2.00E-45 102/253 140/253 24
data_3 Abu 202 60.00E-45 12/23 140/23 28
data_1 Ahmad 256 7.00E-45 120/235 140/235 22
data_4 Aman 365 8.00E-45 15/65 140/65 20
data_10 Jones 869 9.00E-45 65/253 140/253 18
data_50 Pink 785 6.00E-45 12/253 140/253 16
data_6 Twins 133 4.00E-45 192/253 140/253 14
data_9 Orange 165 3.00E-45 150/253 140/253 12
data_7 King 258 1.00E-45 184/253 140/253 80

My output file:
data_50 Pink 785 6.00E-45 12/253 140/253 16
data_6 Twins 133 4.00E-45 192/253 140/253 14
data_9 Orange 165 3.00E-45 150/253 140/253 12
data_7 King 258 1.00E-45 184/253 140/253 80

I would like to extract everything from the line where column 4 is 6.00E-45 onward, or alternatively everything after the line where column 4 is 9.00E-45.
Does anybody have a good suggestion for how to get this desired output? I think awk should be able to do this.
Thanks a lot for sharing.

thegeek · November 16, 2009, 5:01am

sed -rn '/^([^ \t]+[ \t]+){3}6\.00E-45([ \t]|$)/,$p' t1

t1 is the file name.

patrick87 · November 16, 2009, 5:16am

Hi, I just tried the code that you suggested.
It did not work.

sarwan · November 16, 2009, 5:23am

Hi,
Try this ...
grep -E '6\.00E-45|9\.00E-45' hi.txt

hi.txt should contain your input.

Thanks
Sarwan

patrick87 · November 16, 2009, 5:25am

Thanks for your suggestion.
It doesn't work either.

sarwan · November 16, 2009, 5:31am

How is it not working for you? See the following output.

grep -E '6\.00E-45|9\.00E-45' hi.txt
data_10 Jones 869 9.00E-45 65/253 140/253 18
data_50 Pink 785 6.00E-45 12/253 140/253 16

Franklin52 · November 16, 2009, 5:33am

Assuming you want the content of the file from the line where the 4th field matches the pattern:

awk '$4=="6.00E-45"{p=1}p' file
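For reference, a small sketch of how that one-liner behaves: p starts out unset (false), is set to 1 on the first line whose 4th field equals the string, and the bare pattern p then prints every line while it is true. The file name sample.txt is only a placeholder, and its contents are a cut-down copy of the input from the first post. The second command is a variant, assuming "after" is meant to exclude the matching line itself.

```shell
# Placeholder sample file (four lines of the original input).
cat > sample.txt <<'EOF'
data_10 Jones 869 9.00E-45 65/253 140/253 18
data_50 Pink 785 6.00E-45 12/253 140/253 16
data_6 Twins 133 4.00E-45 192/253 140/253 14
data_7 King 258 1.00E-45 184/253 140/253 80
EOF

# Print from the first line whose 4th field is exactly "6.00E-45".
# Comparing against a string constant is a string comparison,
# so a field such as 60.00E-45 does not match.
from_match=$(awk '$4 == "6.00E-45" { p = 1 } p' sample.txt)
echo "$from_match"

# Variant: test p before setting it, so printing starts on the
# line after the one whose 4th field is "9.00E-45".
after_match=$(awk 'p; $4 == "9.00E-45" { p = 1 }' sample.txt)
echo "$after_match"
```

With this sample both commands print the data_50, data_6 and data_7 lines.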

thegeek · November 16, 2009, 7:45am

It would help to give a little more detail, such as:

What is the output when you execute it?
Is there no output, or unexpected output? Just paste the output.
What have you tried to troubleshoot that code, and what did you get?

patrick87 · November 16, 2009, 11:38pm

Thanks for helping me again, Franklin52 ^^
By the way, do you have any idea how to sort the data on column 4, and then extract only the data from where column 4 is 6.00E-45?
I tried this code:

sort -r +3 -4 file |  awk '$4=="6.00E-45"{p=1}p'

But it does not work.
Thanks for your advice.

skmdu · November 17, 2009, 1:07am

Is this what you are expecting?

$ cat t
data_5 Ali 422 2.00E-45 102/253 140/253 24
data_3 Abu 202 60.00E-45 12/23 140/23 28
data_1 Ahmad 256 7.00E-45 120/235 140/235 22
data_4 Aman 365 8.00E-45 15/65 140/65 20
data_10 Jones 869 9.00E-45 65/253 140/253 18
data_50 Pink 785 6.00E-45 12/253 140/253 16
data_6 Twins 133 4.00E-45 192/253 140/253 14
data_9 Orange 165 3.00E-45 150/253 140/253 12
data_7 King 258 1.00E-45 184/253 140/253 80

$ sort -n  -k 4 t | sed -n '/6\.00E\-45/,$p'

data_50 Pink 785 6.00E-45 12/253 140/253 16
data_1 Ahmad 256 7.00E-45 120/235 140/235 22
data_4 Aman 365 8.00E-45 15/65 140/65 20
data_10 Jones 869 9.00E-45 65/253 140/253 18
data_3 Abu 202 60.00E-45 12/23 140/23 28

patrick87 · November 17, 2009, 2:56am

Hi, skmdu.
Sad to say, it doesn't work for sorting the data.
It runs into problems when the data is long.
Do you have a better idea or suggestion?
Thanks a lot.

Franklin52 · November 17, 2009, 3:03am

patrick87:

Thanks for helping me again, Franklin52 ^^
By the way, do you have any idea how to sort the data on column 4, and then extract only the data from where column 4 is 6.00E-45?
I tried this code:
sort -r +3 -4 file |  awk '$4=="6.00E-45"{p=1}p'
But it does not work.
Thanks for your advice.

If you want to sort the data after the selection you can do something like this:

awk '$4=="6.00E-45"{p=1}p' ah | sort -n -k4
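One thing to watch, offered as a hedged aside: sort -n reads only a plain decimal number and stops at the E, so values with different exponents end up compared by their mantissa alone. GNU sort has -g (general numeric), which parses exponent notation. It is a GNU extension, so this sketch assumes GNU coreutils. The file name vals.txt and its contents are made up for illustration.

```shell
# Made-up sample with mixed exponents in column 4.
cat > vals.txt <<'EOF'
a x 1 6.00E-45 p q 1
b x 2 1e-163 p q 2
c x 3 1.01 p q 3
d x 4 1e-45 p q 4
EOF

# Select from the matching line on, then sort column 4 as a general number.
# -k4,4 limits the sort key to field 4; -g (GNU sort) understands exponents.
sorted=$(awk '$4=="6.00E-45"{p=1}p' vals.txt | LC_ALL=C sort -g -k4,4)
echo "$sorted"
```

Here the rows come out in the order b, d, a, c, that is 1e-163, 1e-45, 6.00E-45, 1.01.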

patrick87 · November 17, 2009, 3:27am

Hi,

Do you have any idea how to do it if I want to sort first and only then extract?
My column 4 has numbers like 1e-163, 1e-45, 1.01, etc.
Thanks for the advice.