Hello,
I have a file like this:
2 168611167 STK39 STK39 --- 27347 "serine threonine kinase 39 (STE20/SPS1 homolog, yeast)" YES SNP_A-2086192 rs16854601 0.001558882
6 13670256 SIRT5 /// RPS4X SIRT5 --- 23408 /// 6191 "sirtuin (silent mating type information regulation 2 homolog) 5 (S. cerevisiae) /// ribosomal protein S4, X-linked" YES SNP_A-8405097 rs16874223 0.00156082
2 105439878 NCK2 /// FHL2 FHL2 /// NCK2 --- 8440 /// 2274 NCK adaptor protein 2 /// four and a half LIM domains 2 --- SNP_A-2034891 rs41322544 0.001562043
12 80373503 PPFIA2 PPFIA2 --- 8499 "protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 2" YES SNP_A-8542673 rs17008588 0.001565901
15 41547066 TP53BP1 /// TP53BP1 /// TP53BP1 TP53BP1 --- 7158 /// 7158 /// 7158 tumor protein p53 binding protein 1 /// tumor protein p53 binding protein 1 /// tumor protein p53 binding protein 1 YES SNP_A-1782700 rs1814538 0.001573326
I need to sort this file in ascending order on the last column.
Then I need an output with two columns.
The first column should contain only the field that starts with SNP_A, and the second column should contain the field immediately to the right of the SNP_A field.
Example output:
SNP_A-2086192 rs16854601
SNP_A-8405097 rs16874223
Can you show me how to do this with awk?
Thanks for reading
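
One possible approach, offered as a minimal sketch rather than a definitive answer: let awk pull out the last column, the SNP_A field and its right-hand neighbour, sort on the last column, then drop the sort key. It assumes that splitting on whitespace is acceptable (the SNP_A ID, the rs ID and the last column contain no spaces), that the last column is a plain decimal number, and that every row of interest has an SNP_A field. The function name snp_sorted and the temporary sample file are hypothetical, made up for this illustration; the three sample rows are taken from the post and deliberately put out of order.

```shell
#!/bin/sh
# snp_sorted is a hypothetical helper name, not something from the original post.
# Pipeline stages:
#   1) awk: find the first field starting with SNP_A and print
#      "<last column> <SNP_A field> <field to its right>"
#   2) sort: numeric, ascending, on the value taken from the last column
#   3) awk: drop the sort key, keeping only the two requested columns
snp_sorted() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i ~ /^SNP_A/) { print $NF, $i, $(i + 1); break }
    }' "$1" |
    LC_ALL=C sort -k1,1n |
    awk '{ print $2, $3 }'
}

# Small sample built from rows in the post, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
12 80373503 PPFIA2 PPFIA2 --- 8499 "protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 2" YES SNP_A-8542673 rs17008588 0.001565901
2 168611167 STK39 STK39 --- 27347 "serine threonine kinase 39 (STE20/SPS1 homolog, yeast)" YES SNP_A-2086192 rs16854601 0.001558882
2 105439878 NCK2 /// FHL2 FHL2 /// NCK2 --- 8440 /// 2274 NCK adaptor protein 2 /// four and a half LIM domains 2 --- SNP_A-2034891 rs41322544 0.001562043
EOF

result=$(snp_sorted "$sample")
rm -f "$sample"
printf '%s\n' "$result"
```

If the real file is tab-separated and any ID could contain spaces, adding -F'\t' to the first awk call would be the safer choice. If the last column can appear in scientific notation (for example 1e-05), GNU sort's -g option is likely a better fit than -n.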