I have of two space separated files:
==> File1 <==
PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1
PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1
PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1
PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1
==> File 2 <==
PT|np_490856.1 171608
PT|np_740775.1 171647
PT|np_497749.1 171647
PT|np_494764.1 171646
PT|np_497284.1 171648
PT|np_496075.1 171200
PT|np_001300561.1 171202
I want to map the values for first and fourth column of File1 (begin with PT|np_xxxx) from column 2 of File2. The File2 contains the values of PT|np_xxxx.
The desired output is (mapped values of column1 from file2, column1 file1, column2 file1, column3 file1, column4 file1, mapped values of column4 from file2)
==> File 3 <==
171200 PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1 171202
171648 PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1 171647
171647 PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1 171608
171648 PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1 171646
171648 PT|np_497284.1 st|K90752.1 st|K88689.1 PT|np_497749.1 171647
sammy777888:
I have of two space separated files:
==> File1 <==
PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1
PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1
PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1
PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1
==> File 2 <==
PT|np_490856.1 171608
PT|np_740775.1 171647
PT|np_497749.1 171647
PT|np_494764.1 171646
PT|np_497284.1 171648
PT|np_496075.1 171200
PT|np_001300561.1 171202
I want to map the values for first and fourth column of File1 (begin with PT|np_xxxx) from column 2 of File2. The File2 contains the values of PT|np_xxxx.
The desired output is (mapped values of column1 from file2, column1 file1, column2 file1, column3 file1, column4 file1, mapped values of column4 from file2)
==> File 3 <==
171200 PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1 171202
171648 PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1 171647
171647 PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1 171608
171648 PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1 171646
171648 PT|np_497284.1 st|K90752.1 st|K88689.1 PT|np_497749.1 171647
Hello sammy777888,
Not sure how above highlighted line has come because I can't see any relation of line PT|np_001300561.1 171202
from Input_file2 to Input_file1. If this is a typo then following may help you in same.
awk -F'[| ]' 'FNR==NR{a[$2]=$0;b[$2]=$NF;next} ($2 in a){print b[$2] "\t" $0 "\t" b[$2];}' OFS="|" Input_file2 Input_file1
Output will be as follows.
171200 PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1 171200
171648 PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1 171648
171647 PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1 171647
171648 PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1 171648
If I missed something I apologies for it. If above is as per your expectations then well and good else please show us more clear Input_files with more clear conditions in code tags, so that we could try to help you in same. Have a great weekend ahead.
Thanks,
R. Singh
1 Like
Try:
awk 'NR==FNR{A[$1]=$2; next} {print A[$1], $0, A[$NF]}' file2 file1
Output:
171200 PT|np_496075.1 st|K92748.1 st|K89648.1 PT|np_001300561.1 171202
171648 PT|np_497284.1 st|K90752.1 st|K90279.1 PT|np_740775.1 171647
171647 PT|np_497749.1 st|K90752.1 st|K92038.1 PT|np_490856.1 171608
171648 PT|np_497284.1 st|K90752.1 st|K88095.1 PT|np_494764.1 171646
1 Like