[Solved] Sorting a column in a file based on a column in a second file

Homa · December 14, 2012, 10:21am

Hello,

I have two files, as follows:

File1:

F0100020 A G 
F0100030 A T
F0100040 A G

File2:

F0100040 A G    BTA-28763-no-rs     77.2692
F0100030 A T    BTA-29334-no-rs     11.4989
F0100020 A G    BTA-29515-no-rs     127.006

I want to reorder the lines of the second file so that they follow the order of the first column of the first file, and then write the result to a third file:

File3:

F0100020 A G    BTA-29515-no-rs     127.006
F0100030 A T    BTA-29334-no-rs     11.4989
F0100040 A G    BTA-28763-no-rs     77.2692

Thank you very much in advance!

vbe · December 14, 2012, 10:22am

Do you mind showing what you have done so far?

Homa · December 14, 2012, 10:24am

I tried this but it doesn't work:

awk -F" " -vOFS=" " 'FNR==NR{o[$1]=NR; next}{$1=o[$1]" "$1}1' first file second file |sort -k1,1 |cut -d' ' -f2 > new file

in2nix4life · December 14, 2012, 11:16am

awk 'FNR==NR{a[$1]=$0} NR>FNR && ($1 in a) {print a[$1]}' file2 file1 > file3
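
For readers following along, the one-liner above can be unpacked roughly as in the commented sketch below. It recreates the sample data from the question under the assumed names file1 and file2, assumes that every key in the first column is unique, and uses next in place of the NR>FNR test, which is a small variation rather than the exact command posted.

```shell
# Recreate the sample data from the question (file names are assumptions).
printf '%s\n' 'F0100020 A G' 'F0100030 A T' 'F0100040 A G' > file1
printf '%s\n' \
  'F0100040 A G    BTA-28763-no-rs     77.2692' \
  'F0100030 A T    BTA-29334-no-rs     11.4989' \
  'F0100020 A G    BTA-29515-no-rs     127.006' > file2

# First file on the command line (file2): FNR == NR is true only while
#   awk reads it, so each full line is stored under its first field.
# Second file (file1): its keys arrive in the wanted order, and the
#   stored file2 line is printed for every key that was seen.
awk '
    FNR == NR  { line[$1] = $0; next }
    $1 in line { print line[$1] }
' file2 file1 > file3

cat file3
```

Because the lines are looked up rather than sorted, the output order is exactly the order of file1, and keys that appear in only one of the two files are silently dropped.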

Homa · December 14, 2012, 11:28am

That works very well, thank you!

Don_Cragun · December 14, 2012, 11:47am

The code provided by in2nix4life is much better, but here is a corrected version of what Homa was trying to do:

awk -F ' ' -v OFS=' ' 'FNR == NR { o[$1] = NR;next }
{ $1 = o[$1]" "$1;print }' File1 File2 | sort -nk1,1 | cut -d' ' -f2- > File3

with the important changes being:

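Add a space between -v and OFS. (This is required by awk on OS X; other implementations of awk may work with or without this space.)
Add an n to the sort key so that the prepended line numbers are compared as numbers rather than as text. (This is only required if File1 contains more than nine lines.)
Add a - to the end of the cut field list so that cut prints everything from the second field onward instead of the second field only.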

Hopefully, seeing the changes will help Homa understand what went wrong.
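
The effect of the last two changes can be seen in isolation with the small sketch below. The file name nums and the sample line passed to cut are made up for illustration; they stand in for the line numbers that the awk step prepends.

```shell
# Twelve hypothetical line numbers, one per line.
awk 'BEGIN { for (i = 1; i <= 12; i++) print i }' > nums

# Without -n the keys are compared as text, so 10, 11 and 12 sort before 2.
sort -k1,1 nums | head -n 4

# With -n the keys are compared as numbers: 1, 2, 3, 4.
sort -nk1,1 nums | head -n 4

# -f2 keeps only the second field; -f2- keeps the second field onward.
echo '3 F0100020 A G' | cut -d' ' -f2
echo '3 F0100020 A G' | cut -d' ' -f2-
```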

Homa · December 14, 2012, 11:50am

It was very kind of you to correct my code. Thank you very much.