merging

Lucky_Ali · July 30, 2010, 2:27pm

Hi all,

I have two files, and I want to merge one column of file 2 into file 1.

file 1 - not tab or space delimited

B_1       gihgjfhdj| hgfkddlldjljldjlddl
B_2       gihgjddshjgfhs| hgfkddlldjljldjlddl
B_3       gihgjfhdj| hgfkddlldjljldjlddlhgjdhdhjdhjhdjhdjhgdj

file2 - tab delimited

asdfg     1101    2579    +    492    229194403    0    -    be_10    -
D-asd     2669    4000    +    443    229194404    0    -    be_20    -
Perd    4162    5049    +    295    229194405    0    -    be_30    -

I need an output file in which the 9th tab-separated column of file 2 is appended to each line of file 1.
Output file (modified file 1):

B_1       gihgjfhdj| hgfkddlldjljldjlddl  be_10
B_2       gihgjddshjgfhs| hgfkddlldjljldjlddl be_20
B_3       gihgjfhdj| hgfkddlldjljldjlddlhgjdhdhjdhjhdjhdjhgdj be_30

Please let me know the best way to do this in Unix, with awk or sed.

LA

tukuyomi · July 30, 2010, 2:42pm

In sh...

#!/bin/sh

# Extract the 9th tab-separated column of f2, then append it to each line of f1.
cut -f9 f2 > tmpfile
paste f1 tmpfile
rm -f tmpfile

exit 0

Here f1 and f2 are your first and second files, respectively.
Waiting for an awk solution.
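
The same cut/paste idea can be sketched without a temporary file, since paste reads standard input when given "-" as a file name. The scratch directory and the "result" variable below are assumptions made for the demonstration, and mktemp -d is not strictly POSIX, although it is widely available. The sample data is copied from the question.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory (hypothetical location).
dir=$(mktemp -d)

cat > "$dir/file1" <<'EOF'
B_1       gihgjfhdj| hgfkddlldjljldjlddl
B_2       gihgjddshjgfhs| hgfkddlldjljldjlddl
B_3       gihgjfhdj| hgfkddlldjljldjlddlhgjdhdhjdhjhdjhdjhgdj
EOF

# file2 is tab-delimited, so write the tabs explicitly.
printf 'asdfg\t1101\t2579\t+\t492\t229194403\t0\t-\tbe_10\t-\n'  > "$dir/file2"
printf 'D-asd\t2669\t4000\t+\t443\t229194404\t0\t-\tbe_20\t-\n' >> "$dir/file2"
printf 'Perd\t4162\t5049\t+\t295\t229194405\t0\t-\tbe_30\t-\n'  >> "$dir/file2"

# cut picks column 9 of file2; paste joins it line by line to file1.
# The "-" operand tells paste to read the second input from the pipe.
result=$(cut -f9 "$dir/file2" | paste "$dir/file1" -)
printf '%s\n' "$result"

rm -rf "$dir"
```

Note that paste joins the two inputs with a tab, and that lines are paired purely by position, so both files need the same number of lines in the same order.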

Christoph_Spohr · July 30, 2010, 3:17pm

Hi,

or with awk:

awk 'NR==FNR{a[++i]=$9}NR!=FNR{t=$1;sub(/B_/,"",t);print $0 FS a[t]}' file2 file1

Output:

B_1       gihgjfhdj| hgfkddlldjljldjlddl be_10
B_2       gihgjddshjgfhs| hgfkddlldjljldjlddl be_20
B_3       gihgjfhdj| hgfkddlldjljldjlddlhgjdhdhjdhjhdjhdjhgdj be_30

HTH Chris

Lucky_Ali · July 30, 2010, 3:37pm

Thanks Chris,
It worked for those particular files. I guess you were targeting /B_/, but I have many files with different IDs. Is there a way to generalize the code so that it works whatever the ID is, for any data with the same structure and format?

Let me know.
LA

Christoph_Spohr · July 30, 2010, 4:34pm

Yupp,

try this:

awk '{getline s < "file2"; split(s,a,"[ \t]+");print $0 FS a[9]}' file1
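
A possible variant, offered as a sketch rather than a confirmed fix: splitting on "[ \t]+" can shift the column count if a field of file2 is empty or contains a space, so the version below splits strictly on tabs and pairs the two files by line number. It assumes file2 really is tab-delimited and that both files have the same number of lines. The scratch directory and the "col" and "result" names are made up for the demonstration.

```shell
#!/bin/sh
# Recreate the sample files from the question in a scratch directory (hypothetical location).
dir=$(mktemp -d)

cat > "$dir/file1" <<'EOF'
B_1       gihgjfhdj| hgfkddlldjljldjlddl
B_2       gihgjddshjgfhs| hgfkddlldjljldjlddl
B_3       gihgjfhdj| hgfkddlldjljldjlddlhgjdhdhjdhjhdjhdjhgdj
EOF

# file2 is tab-delimited, so write the tabs explicitly.
printf 'asdfg\t1101\t2579\t+\t492\t229194403\t0\t-\tbe_10\t-\n'  > "$dir/file2"
printf 'D-asd\t2669\t4000\t+\t443\t229194404\t0\t-\tbe_20\t-\n' >> "$dir/file2"
printf 'Perd\t4162\t5049\t+\t295\t229194405\t0\t-\tbe_30\t-\n'  >> "$dir/file2"

# First pass (file2): remember column 9 under its line number.
# Second pass (file1): print the line unchanged, then a tab, then the saved value.
result=$(awk -F'\t' '
    NR == FNR { col[FNR] = $9; next }
    { print $0 "\t" col[FNR] }
' "$dir/file2" "$dir/file1")
printf '%s\n' "$result"

rm -rf "$dir"
```

Because nothing here depends on the ID in the first column, the same command should apply to files with other ID schemes, as long as the line order of the two files corresponds.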

Lucky_Ali · July 30, 2010, 7:25pm

It didn't work

Christoph_Spohr · July 31, 2010, 2:03am

For me it does work. So what exactly doesn't work for you?

ygemici · July 31, 2010, 1:11pm

# paste file1 file2 | awk '{print $1,$2,$3,$12}' | sed 's/ /\t/'
B_1     gihgjfhdj| hgfkddlldjljldjlddl be_10
B_2     gihgjddshjgfhs| hgfkddlldjljldjlddl be_20
B_3     gihgjfhdj| hgfkddlldjljldjlddlhgjdhdhjdhjhdjhdjhgdj be_30