Bashing of 2 files

znesotomayor · March 1, 2016, 2:33am

Hi All,

I am seeking your assistance on how to compare 2 files in bash and then print when a condition is met.

Ex.
file1.txt :

Field1   Field2     Field 3        Field 4
usa      <blank>  <blank>     INDIA

file2.txt :

Field1     Field2     Field 3        Field 4
canada    jap        INDIA         utah

Condition
If Field 4 of file1.txt equals Field 3 of file2.txt, the values of Field 1 (canada) and Field 2 (jap) of file2.txt should be put into Field 2 and Field 3 of file1.txt.

Expected Output in file1.txt :

Field1   Field2     Field 3        Field 4
usa      canada    jap             INDIA

I used the code below:

awk 'NR==FNR{a[$1];next}$1 in a{print $1, $2}' file1.txt file2.txt

It prints the common string, but I don't know how to put the output into file1.txt.

Thanks,

RavinderSingh13 · March 1, 2016, 3:12am

Hello znesotomayor,

I am not sure whether <blank> <blank> is a literal string in the input file or just stands for spaces. If <blank> <blank> is a literal string in file1.txt, then the following may help you.

awk 'FNR==NR && FNR>1{A[$4]=$2 OFS $3;next} ($3 in A){print $1 OFS $2 OFS A[$3]}' file1.txt file2.txt

Output will be as follows.

canada jap <blank> <blank>

Thanks,
R. Singh

znesotomayor · March 1, 2016, 3:18am

Hi Sir RavinderSingh13,

<blank> means an empty string in Field 2 and Field 3 of file1.txt.

The expected output is:

cat file1.txt
Field1   Field2     Field 3        Field 4
usa      canada    jap             INDIA

Thanks

RavinderSingh13 · March 1, 2016, 3:38am

Hello znesotomayor,

Could you please try the following and let me know if it helps?

awk 'NR==1{print}FNR==NR && FNR>1{A[$3]=$1 OFS $2;next} ($2 in A){print $1 OFS A[$2] OFS $2}' OFS="\t" file2.txt file1.txt

Output will be as follows.

Field1     Field2     Field 3        Field 4
usa     canada  jap     INDIA

This assumes the input file file1.txt looks as follows.

cat file1.txt
Field1   Field2     Field 3        Field 4
usa               INDIA

Thanks,
R. Singh

Don_Cragun · March 1, 2016, 4:12am

Given that fields are separated by an arbitrary number of <space> characters and fields are not aligned, how do you determine that fields 2 and 3 in file1.txt are blank rather than fields 3 and 4 or 2 and 4?

znesotomayor · March 1, 2016, 5:18am

Hi Sir Don,

What if the fields are empty strings, or the file is tab delimited? Hmm. Is it possible to put Field 1 and Field 2 of file2.txt into fields 6 and 7 instead? All I want is to put the data from Field 1 and Field 2 of file2.txt into file1.txt if Field 4 of file1.txt = Field 3 of file2.txt.

Please advise,

Thanks,

---------- Post updated at 05:44 PM ---------- Previous update was at 05:33 PM ----------

Hi Sir RavinderSingh13,

Your output is the same as file2.txt:

canada    jap        INDIA         utah

Thanks,

---------- Post updated at 06:18 PM ---------- Previous update was at 05:44 PM ----------

Any suggestions, please? Thank you for the help.

Don_Cragun · March 2, 2016, 1:29am

From your first post in this thread:

To meet your condition, we first have to be able to identify what is in Field 4 of file1.txt . And, when your field separator is a variable number of spaces, there is no way to determine whether the 2nd line in your sample file1.txt contains fields 1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, or 3 and 4. With its default field separator, the awk utility will assume that that line contains fields 1 and 2 and that fields 3 and 4 are empty strings.
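The default splitting behaviour described above can be checked with a short sketch. The sample line is copied from the file1.txt shown earlier in the thread; the variable names are only for illustration.

```shell
# A data line whose two middle fields are "empty" (only spaces remain).
line='usa               INDIA'

# With the default field separator, awk collapses each run of blanks
# into a single separator, so the empty fields vanish.
nf=$(printf '%s\n' "$line" | awk '{ print NF }')
second=$(printf '%s\n' "$line" | awk '{ print $2 }')
fourth=$(printf '%s\n' "$line" | awk '{ print "[" $4 "]" }')

echo "NF=$nf field2=$second field4=$fourth"
# prints: NF=2 field2=INDIA field4=[]
```

So INDIA lands in field 2, and a test on $4 compares against an empty string.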

If your fields each contained a fixed number of characters, we could identify fields by character counts; but the data in your sample files does not line up with the headers, so we can't do that.
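As a rough sketch of what identifying fields by character counts could look like, assume a hypothetical layout in which each of the four fields is padded to exactly 10 columns. That width is an assumption for illustration only; the sample files in this thread do not follow it.

```shell
# Build one record in the assumed layout: 4 fields, 10 columns each,
# with fields 2 and 3 empty (all padding).
line=$(printf '%-10s%-10s%-10s%-10s' usa '' '' INDIA)

# Field 4 occupies columns 31-40; cut it out and strip the padding.
f4=$(printf '%s\n' "$line" | awk '{
    f = substr($0, 31, 10)
    sub(/ +$/, "", f)
    print f
}')

# Field 2 occupies columns 11-20; it is empty once the padding is removed.
f2=$(printf '%s\n' "$line" | awk '{
    f = substr($0, 11, 10)
    sub(/ +$/, "", f)
    print "[" f "]"
}')

echo "field4=$f4 field2=$f2"
# prints: field4=INDIA field2=[]
```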

If your fields used a single <tab> character as a field separator instead of a seemingly random number of <space> characters, we could identify field boundaries easily; but the data in your sample files does not use <tab> characters; it uses a seemingly random number of <space> characters.
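For illustration only, here is a minimal sketch of how the original request might be handled if both files did use a single <tab> between fields. The tab-separated sample data, the header names without embedded spaces, the scratch directory, and the temporary file name file1.tmp are all assumptions, not the poster's real input.

```shell
# Work in a scratch directory so no real file is overwritten.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical tab-separated versions of the sample files.
printf 'Field1\tField2\tField3\tField4\n' >  file1.txt
printf 'usa\t\t\tINDIA\n'                 >> file1.txt
printf 'Field1\tField2\tField3\tField4\n' >  file2.txt
printf 'canada\tjap\tINDIA\tutah\n'       >> file2.txt

# Pass 1 (file2.txt): remember fields 1 and 2, keyed by field 3.
# Pass 2 (file1.txt): when field 4 matches a key, fill fields 2 and 3.
# awk cannot edit in place, so write to a temporary file and rename it.
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR && FNR > 1 { f1[$3] = $1; f2[$3] = $2 }
     NR == FNR { next }
     FNR > 1 && ($4 in f1) { $2 = f1[$4]; $3 = f2[$4] }
     { print }' file2.txt file1.txt > file1.tmp &&
mv file1.tmp file1.txt

result=$(cat file1.txt)
printf '%s\n' "$result"
```

With an explicit tab separator, empty fields keep their positions, so $4 really is the fourth column of file1.txt.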

I repeat: With no way to determine field boundaries in your input data when some fields are empty, there is no way to determine what is in field 4. So if you want to make decisions based on what is in field 4, you are out of luck.

Come up with an unambiguous input file format (and make sure your input files adhere to that format), tell us what that unambiguous input file format is, and show us sample input files in that format and the output you are trying to produce in that format and we may be able to help you come up with a reliable solution to your problem. If you can't do that, there isn't much we can do to help.

znesotomayor · March 2, 2016, 3:38am

Thanks, and noted, Sir Don. I will start a new thread with a new input file.