Joining two files by retaining the headers.

bharathbangalor · November 19, 2013, 5:55am

Hi Experts,

I want to join two files(file1 ,file2) which are having tab separated values,by sorting them on key column(ID) and want to redirect the output to other file(output file) along with the headers from both the files.

cat file1:
ID	Name	phone_no
205	mno	90808
200	xyz	32003
100	abc	55555

cat file2:

ID	city	email
205	bangalore	blr@abc.com
100	delhi	dlh@abc.com

Required output:

ID	Name	phone_no	city	email
205	mno	90808	bangalore	blr@abc.com
100	abc	55555	delhi	dlh@abc.com

Need your help experts

Akshay_Hegde · November 19, 2013, 6:26am

Try :

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}($1 in A){$0=$0 FS A[$1]}A[$1]' file2 file1
ID Name phone_no city email
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

Please use codetag not icode

Franklin52 · November 19, 2013, 6:27am

Try:

awk 'NR==FNR{a[$1]=$0; next} $1 in a {print a[$1], $2, $3}' file1 file2

bharathbangalor · November 19, 2013, 6:37am

akshay hegde:

Try :

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}($1 in A){$0=$0 FS A[$1]}A[$1]' file2 file1
ID Name phone_no city email
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

Please use codetag not icode

Thanks Akshay:)It worked perfectly:b:
One small doubt, what if file1 contains headers and file2 is without headers. In that case will this command work or will there be any other changes in the command

---------- Post updated at 06:37 AM ---------- Previous update was at 06:36 AM ----------

Thank you Frank

what if file1 contains headers and file2 is without headers. In that case will this command work or will there be any other changes in the command

Akshay_Hegde · November 19, 2013, 6:41am

Then Header will not be printed

bharathbangalor · November 19, 2013, 6:52am

Sorry for asking too many doubts.

The below code works for fine for files with 3 columns. what if it contains unknown number of columns.In that case how to print all the columns

Akshay_Hegde · November 19, 2013, 7:00am

If you just want to print file1 header you can use this

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}FNR==1{print}($1 in A){$0=$0 FS A[$1]}A[$1]' file2 file1
ID Name phone_no 
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

if you want to print your own header you can do like this

$ awk 'FNR==NR{A[$1]=$2 FS $3;next}FNR==1{print "My Headers blah blah"}($1 in A){$0=$0 FS A[$1]}A[$1]'  file2  file1
My Headers blah blah
205 mno 90808 bangalore blr@abc.com
100 abc 55555 delhi dlh@abc.com

---------- Post updated at 07:00 AM ---------- Previous update was at 06:54 AM ----------

You can select required column

$0 ---> to print entire line
$1 ---> to print 1st column
$2 ---> to print 2nd column
........................................
$n ---> to print nth column

Franklin52 · November 19, 2013, 7:07am

You can print the header in the BEGIN section and if you have unknown number of columns you can try something like this:

awk 'BEGIN{print "ID Name phone_no city email"} NR==FNR{a[$1]=$0; next} $1 in a {$1=a[$1]}1' file1 file2

bharathbangalor · November 19, 2013, 7:10am

franklin52:

You can print the header in the BEGIN section and if you have unknown number of columns you can try something like this:
awk 'BEGIN{print "ID Name phone_no city email"} NR==FNR{a[$1]=$0; next} $1 in a {$1=a[$1]}1' file1 file2

This works if we have less number of columns, But what if we have around 1000+ columns

Franklin52 · November 19, 2013, 7:23am

Use nawk or /usr/xpg4/bin/awk on Solaris.