Comparing two files

I have two files, as usual.

file1.txt
$1 $2 $3 $4 $5 $6 $7 $8

1234567|iufgt|iuoy|iout|white |black |red        |90879
1234567|iufgt|iuoy|iout|green |pink  |blue       |90879
1234567|iufgt|iuoy|iout|orange|purple|magenta|90879
1234567|iufgt|iuoy|iout|yellow |violet|grey      |90879

Here we have to consider $5, $6 and $7 for our search.

file2.txt

$1 $2 $3 $4 $5 $6

1234567|grey|iuoy|iout|iufgt|90879
1239877|magenta|iuoy|iout|iufgt|90879
1733267|blue|iuoy|iout|iufgt|90879
1232677|red|iuoy|iout|iufgt|90879
1239567|white|iuoy|iout|iufgt|90879
1238727|green|iuoy|iout|iufgt|90879
1237247|orange|iuoy|iout|iufgt|90879
1236397|yellow|iuoy|iout|iufgt|90879
1232947|pink|iuoy|iout|iufgt|90879
1230247|black|iuoy|iout|iufgt|90879
1234037|violet|iuoy|iout|iufgt|90879
1238037|purple|iuoy|iout|iufgt|90879
1237897||iuoy|iout|iufgt|90879
1238797||iuoy|iout|iufgt|90879
1239997||iuoy|iout|iufgt|90879

Here we should take only $2 for the comparison. As you can see, most of the records have a value in the $2 field and some do not.

Question:

I want to take fields $5, $6 and $7 from file 1 and compare them with field $2 from file 2. The result should be like this:

if $5 (file1) =$2 (file2) then replace $5 (file1) with $1 of (file2)
if $6 (file1) =$2 (file2) then replace $6 (file1) with $1 of (file2)
if $7 (file1) =$2 (file2) then replace $7 (file1) with $1 of (file2)

the final output will look like this

Actual file1.txt (before running the code)

$1 $2 $3 $4 $5 $6 $7 $8

1234567|iufgt|iuoy|iout|white |black |red    |90879
1234567|iufgt|iuoy|iout|green |pink  |blue   |90879
1234567|iufgt|iuoy|iout|orange|purple|magenta|90879
1234567|iufgt|iuoy|iout|yellow|violet|grey   |90879

file1.txt after applying the conditions above

$1 $2 $3 $4 $5 $6 $7 $8

1234567|iufgt|iuoy|iout|1239567 |1230247 |1232677   |90879
1234567|iufgt|iuoy|iout|1238727 |1232947 |1733267   |90879
1234567|iufgt|iuoy|iout|1237247 |1238037 |1239877   |90879
1234567|iufgt|iuoy|iout|1236397 |1234037 |1234567   |90879

So fields $5, $6 and $7 should be replaced with the matching value of $1 from file2.

Please advise how this can be done. If you can do it with join, awk or nawk, it would be really helpful.

awk -F\| 'NR == FNR {
  map[$2] = $1 
  next
  }
{
  for (i = 4; ++i <=7;) {
    sub(/ *$/, x, $i)
    $i in map && $i = map[$i]
    }
  }
  42
  ' OFS=\| file2.txt file1.txt

Thanks for the reply. Can you please explain your code...

awk -F\| 'NR == FNR {                  # while reading the first non-empty input file
  map[$2] = $1                         # build an associative array named map indexed by $2, $1 as values 
  next                                 # skip the rest of the program 
  }
{
  for (i = 4; ++i <=7;) {              # while reading the rest of the input, for each field from 5 to 7 
    sub(/ *$/, x, $i)                  # trim any trailing spaces
    $i in map && $i = map[$i]          # if the column value is an index in the map array, set its value to map[$i]
    }
  }
  42                                   # print the (modified) records
  ' OFS=\| file2.txt file1.txt
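
For anyone who finds the compact form hard to follow, here is a more spelled-out sketch of the same lookup idea. It is not the code posted above, and it makes two assumptions of its own: rows of file2.txt with an empty $2 are skipped, and fields that do not match are left exactly as they were, padding included. The temporary directory and the cut-down sample data are only there to make the example runnable on its own.

```shell
# Work in a scratch directory (assumes mktemp -d is available).
tmp=$(mktemp -d)

# Cut-down copy of file2.txt: $1 is the replacement value, $2 is the lookup key.
cat > "$tmp/file2.txt" <<'EOF'
1239567|white|iuoy|iout|iufgt|90879
1230247|black|iuoy|iout|iufgt|90879
1232677|red|iuoy|iout|iufgt|90879
1237897||iuoy|iout|iufgt|90879
EOF

# Cut-down copy of file1.txt; green and pink have no entry in the lookup file.
cat > "$tmp/file1.txt" <<'EOF'
1234567|iufgt|iuoy|iout|white |black |red        |90879
1234567|iufgt|iuoy|iout|green |pink  |red|90879
EOF

result=$(awk -F'|' -v OFS='|' '
  NR == FNR {                      # first file on the command line: file2.txt
    if ($2 != "") map[$2] = $1     # remember $1 under the key $2, skip empty keys
    next                           # go straight to the next record
  }
  {
    for (i = 5; i <= 7; i++) {     # fields 5 to 7 of file1.txt
      key = $i
      sub(/ +$/, "", key)          # trim trailing spaces on a copy of the field
      if (key in map) $i = map[key]
    }
    print                          # explicit print instead of a constant pattern
  }
' "$tmp/file2.txt" "$tmp/file1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data the first record comes out as 1234567|iufgt|iuoy|iout|1239567|1230247|1232677|90879, while green and pink in the second record stay untouched because they are not in the lookup file.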

Hi radoulov, 42 - is this a keyword in awk?

regards,
Ahamed

No.

Each awk statement consists of a pattern with an associated action;
either the pattern or the action can be omitted, but not both.

pattern { action }

In this case - 42 - we have a pattern consisting of a single expression and no action,
so the default action (print the current record) is used.
The pattern matches when its value is nonzero (if a number) or non-null (if a string).
Basically, it could be any number other than 0 or any string other than "".
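
A quick way to convince yourself is a small experiment like the one below. It is only a sketch; the variable names and the two-line input are made up for the illustration.

```shell
# A nonzero number or a non-empty string used as a pattern selects every record,
# and the missing action defaults to printing it.
out_num=$(printf 'a\nb\n' | awk '42')
out_str=$(printf 'a\nb\n' | awk '"x"')

# Zero or the empty string never matches, so nothing is printed.
out_zero=$(printf 'a\nb\n' | awk '0')
out_empty=$(printf 'a\nb\n' | awk '""')

# The explicit form: no pattern, just an action.
out_explicit=$(printf 'a\nb\n' | awk '{ print }')

printf 'with 42:\n%s\nwith 0:\n%s\n' "$out_num" "$out_zero"
```

The first, second and last commands all print both input lines; the ones with 0 and "" print nothing.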

As far as the choice of 42 is concerned, see this :).

P.S. By the way, Aharon Robbins, one of the GNU awk authors and current maintainer, defines
this as absolutely miserable programming practice.
