Merge data from two files based on a specific pattern match

My input:
File_1:
2000_t
g1110.b1
abb.1
2001_t
g1111.b1
abb.2
abb.2
g1112.b1
abb.3
2002_t
.
.

File_2:
2000_t Ali england 135
abb.1 Zoe british 150
2001_t Ali england 305
g1111.b1 Lucy russia 126
abb.2 Zoe british 500
abb.2
g1112.b1 Lucy russia 180
abb.3 Zoe british 700
.
.

My desired output file:
2000_t Ali england 135
g1110.b1
abb.1 Zoe british 150
2001_t Ali england 305
g1111.b1 Lucy russia 126
abb.2 Zoe british 500
abb.2
g1112.b1 Lucy russia 180
abb.3 Zoe british 700
2002_t
.
.

My main purpose is to merge the data of file_1 and file_2. Every line of file_1 must appear in the output file. File_2 data is added to a line wherever column 1 is the same in both files. Note that some of the file_1 content might appear twice. Thanks a lot.

Can the key fields (2000_t, g1110.b1, etc.) exist in file_1 or file_2 more than once?

Erm...
It is based only on the column 1 content of file_1.
So far, each column 1 value in file_1 appears only once, with no repeats.
Do you have any idea how to solve my problem?
Thanks in advance :)

---------- Post updated at 04:54 AM ---------- Previous update was at 04:46 AM ----------

Hi jsmithstl,
Sad to say, some values might appear twice in column 1 of file_1 as well.
I want all of the file_1 content to exist in the output file. File_2 content is just combined with it where there is a match.
Sorry for any inconvenience.

Hi patrick,

I have tried your scenario. Please find the script below; it works for your scenario.

t1 - File_1
t2 - File_2

script:

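$ more t3.sh
#!/bin/bash
# Read File_1 (t1) line by line from stdin.
exec < t1
while IFS= read -r line
do
    # Collect every line of File_2 (t2) that contains $line as a whole word.
    tes=$(grep -w -- "$line" t2)
    if [ -z "$tes" ]
    then
        echo "$line"    # no match: keep the File_1 line as-is
    else
        echo "$tes"     # match: print the File_2 line(s)
    fi
done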

-------------------
Output:

$ sh t3.sh
2000_t Ali england 135
g1110.b1
abb.1 Zoe british 150
2001_t Ali england 305
g1111.b1 Lucy russia 126
abb.2 Zoe british 500
g1112.b1 Lucy russia 180
abb.3 Zoe british 700
2002_t

Check if it solves your purpose:

awk 'NR==FNR{a[$0]=$0;next}{a[$1]=$0}END{for (i in a) print a[i]}' f1 f2

Hi,

I just tried your script.
Unfortunately, it produces this problem in the output file:
2000_t Ali england 135
g1110.b1
abb.1 Zoe british 150
2001_t Ali england 305
g1111.b1 Lucy russia 126
abb.2 Zoe british 500
abb.2
abb.2 Zoe british 500
abb.2
g1112.b1 Lucy russia 180
abb.3 Zoe british 700
2002_t

abb.2 is repeated twice :(
Did I make a mistake?
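
A likely explanation for the repeated lines: `grep -w` prints every line of t2 that contains the word, and t2 holds two such lines for abb.2 (one with data, one bare). Both lines therefore come back for each of the two abb.2 entries in t1. The key is also treated as a regular expression, so the dot matches any character. The sketch below only reproduces this on a throwaway file; the file name t2_demo and the decoy line are made up for illustration.

```shell
#!/bin/sh
# Throwaway copy of the two abb.2 lines from File_2 plus a decoy line.
# The name t2_demo is hypothetical, used only for this demonstration.
printf '%s\n' 'abb.2 Zoe british 500' 'abb.2' 'abbX2 decoy' > t2_demo

# Count lines matched by the pattern as used in t3.sh (regex, whole word).
regex_hits=$(grep -cw 'abb.2' t2_demo)

# Count lines matched when the key is taken literally (-F).
literal_hits=$(grep -cFw 'abb.2' t2_demo)

echo "regex: $regex_hits, literal: $literal_hits"
rm -f t2_demo
```

The regex count includes the decoy line, while the literal count covers only the two real abb.2 lines. Either way more than one line matches, which is why the loop prints both lines on every abb.2 entry.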

---------- Post updated at 06:34 AM ---------- Previous update was at 06:31 AM ----------

Hi,

I just tried the awk code that you suggested.
It gives output like this:
2001_t Ali england 305
2002_t
g1110.b1
g1111.b1 Lucy russia 126
g1112.b1 Lucy russia 180
abb.1 Zoe british 150
abb.2
abb.3 Zoe british 700
2000_t Ali england 135

It is a bit different from my desired output.
Do you know what is causing this?
Thanks.
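
The likely cause is the END loop: `for (i in a)` walks the array in an unspecified order, so the File_1 order is lost. Array keys are also unique, so the two abb.2 lines collapse into one entry, and the bare abb.2 line read later overwrites the one with data. Below is a sketch that keeps the File_1 order by loading File_2 first and then streaming File_1. The function name merge_in_order is made up here, and keeping the first File_2 line per key is an assumption.

```shell
#!/bin/sh
# merge_in_order FILE_1 FILE_2  (hypothetical helper name)
# Prints FILE_1 in its original order; a line whose first column also
# appears in FILE_2 is replaced by the FIRST FILE_2 line with that key.
merge_in_order() {
    awk '
        NR == FNR {                           # first operand: FILE_2
            if (!($1 in row)) row[$1] = $0    # keep first line per key
            next
        }
        ($1 in row) { print row[$1]; next }   # key known: print FILE_2 line
        { print }                             # key unknown: keep FILE_1 line
    ' "$2" "$1"
}
```

It would be called as `merge_in_order f1 f2`. Note that a repeated key such as abb.2 gets the same File_2 line each time, so the second abb.2 line would still differ from the desired output.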

---------- Post updated at 06:47 AM ---------- Previous update was at 06:34 AM ----------

Hi uthay85,
Your script works perfectly if each column 1 value in file_1 appears only once.
Do you have any idea what to do if some of the column 1 values in file_1 appear twice?
Thanks.

Does anybody have an idea how to achieve this goal?
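
One possible approach for the repeated keys, as a sketch only: treat the File_2 lines for each key as a queue, and let the n-th occurrence of a key in File_1 take the n-th File_2 line with that key. This assumes that is the intended rule; the sample shows only one repeated key, so it is a guess. The function name merge_by_occurrence is made up.

```shell
#!/bin/sh
# merge_by_occurrence FILE_1 FILE_2  (hypothetical helper name)
# Assumption: the n-th occurrence of a key in FILE_1 pairs with the
# n-th FILE_2 line that has the same first column.
merge_by_occurrence() {
    awk '
        NR == FNR {                       # first operand: FILE_2
            row[$1, ++avail[$1]] = $0     # store the n-th line for this key
            next
        }
        {                                 # second operand: FILE_1
            n = ++used[$1]                # occurrence number within FILE_1
            if (($1, n) in row) print row[$1, n]
            else print                    # nothing left for this key
        }
    ' "$2" "$1"
}
```

With the files named as in the earlier script, `merge_by_occurrence t1 t2 > merged.txt` should keep every File_1 line in order, give the first abb.2 its data and leave the second abb.2 bare.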