Exact Search and Replace using AWK

Hello.
I have written the following script to search and replace from one file into another.

#awk script to search and replace from file a in file b
NR == FNR { A[$1]=$2; next }
  { for( a in A ) sub(a, A[a])}1 file2 file1

While the script works reasonably well, I want:
a. The word in File 2 to be maintained, with its replacement provided alongside it.
b. If there is a residue, i.e. a word that is not found in the master (SnR) file, it should be listed in a file called residue.
How can I do this? I am new to awk and have got this far, but tips on handling the issue would help.
Sample files:
File1 (SnR file)

john=mary
pup=dog
cat=kitten

File2 (File for replacement)

john
pup
cat
jumbo

OUTPUT

=mary
=dog
=kitten

Expected output:

john=mary
pup=dog
cat=kitten

The residue "jumbo" could either be flagged as residue with a convenient marker say ! or stored in a separate file.
Many thanks in advance.

This should do it (with a ! marker for residue):

awk 'NR==FNR{A[$1]=$2;next}{if($0 in A) print $0"="A[$0]; else print "!"$0}' FS="=" SnR replacement

Or this, to put the residue in a file called "residue":

awk 'NR==FNR{A[$1]=$2;next}{if($0 in A) print $0"="A[$0]; else print $0 >> "residue" }' FS="=" SnR replacement
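For anyone who wants to try this end to end, here is a small self-contained run of the marker variant against the sample data from the question. The scratch directory and the shell variable `marked` are my own additions for illustration; the file names SnR and replacement are the ones used in the commands above. Treat it as a sketch for a POSIX shell with awk on the PATH, not as the only way to write it.

```shell
# Work in a scratch directory (an assumption, not part of the thread).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from the question.
printf 'john=mary\npup=dog\ncat=kitten\n' > SnR
printf 'john\npup\ncat\njumbo\n' > replacement

# First file (NR == FNR): build the lookup table A[key] = value.
# Second file: print key=value on a hit, !key on a miss.
marked=$(awk -F= '
    NR == FNR { A[$1] = $2; next }
    ($0 in A) { print $0 "=" A[$0]; next }
              { print "!" $0 }
' SnR replacement)

printf '%s\n' "$marked"
```

With the sample files this prints the three mappings followed by !jumbo.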

Dear Chubler XL,
The solution works, but only partially.
While the search and replace works wonderfully, the residue handling does not.
I am posting below the output of both solutions:
SOLUTION 1 (WITH ! MARKER FOR RESIDUE)

!john=mary
john=mary
!pup=dog
pup=dog
!cat=kitten
cat=kitten
!
!

Obviously the residue and the positive mappings are jumbled up.

The second solution you proposed (with residue in a separate file) gives the correct mapping:

john=mary
pup=dog
cat=kitten

But the residual file is a problem:

john=mary
pup=dog
cat=kitten

john=mary
pup=dog
cat=kitten

Many thanks for your help.

I can't see how this could fail; it is working fine for me here.

Are you sure your version of file2 (replacement) doesn't look like this:

john=mary
john
pup=dog
pup
cat=kitten
cat
 
 

Also note: the residue file is appended to on every run, so if you don't want to keep old (previous) residue you should remove the file first.
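As an alternative to deleting the file by hand, awk's own `>` redirection can be used instead of `>>`: inside an awk program, `>` truncates the file the first time it is opened in a run and then keeps writing to it for the rest of that run. The sketch below runs the command twice to mimic repeated tests. The scratch directory, the `mapped` output file and the shell variables are assumptions added for illustration.

```shell
# Work in a scratch directory (an assumption, not part of the thread).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'john=mary\npup=dog\ncat=kitten\n' > SnR
printf 'john\npup\ncat\njumbo\n' > replacement

# Run the same command twice to mimic repeated test runs.
for run in 1 2; do
    awk -F= '
        NR == FNR { A[$1] = $2; next }
        ($0 in A) { print $0 "=" A[$0]; next }
                  { print $0 > "residue" }   # ">" truncates once per run
    ' SnR replacement > mapped
done

residue=$(cat residue)
mapped=$(cat mapped)
printf 'residue:\n%s\nmapped:\n%s\n' "$residue" "$mapped"
```

One caveat: awk only opens the file when the first miss is printed, so a run with no misses at all leaves an old residue file untouched. Removing the file beforehand is still the safest option.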

Sorry for the delay in answering, but we had a broadband outage. I had no such luck. I copied the two files and renamed them master and slave and I still continue to get the same outputs which I posted.
All I should have got was 3 replacements with "jumbo" as the residue.
I am really perplexed. Is it once more because I work in DOS (blame it on DOS)?
Many thanks all the same for all your help.

Perhaps it's the FS="="; try this:

awk -F= 'NR==FNR{A[$1]=$2;next}{if($0 in A) print $0"="A[$0]; else print "!"$0}' SnR replacement
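For comparison, on a POSIX-style awk the `-F=` option and a `FS="="` assignment placed before the file names should both take effect before the first file is read, so the two forms are expected to give the same result. The check below is a sketch under that assumption; the scratch directory and the shell variables `prog`, `with_option` and `with_operand` are names I made up for the test. It says nothing about how a particular Windows port behaves.

```shell
# Work in a scratch directory (an assumption, not part of the thread).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'john=mary\npup=dog\ncat=kitten\n' > SnR
printf 'john\npup\ncat\njumbo\n' > replacement

# The same program text is used for both invocations.
prog='NR == FNR { A[$1] = $2; next }
{ if ($0 in A) print $0 "=" A[$0]; else print "!" $0 }'

# Separator given as an option.
with_option=$(awk -F= "$prog" SnR replacement)

# Separator given as an assignment operand before the first file.
with_operand=$(awk "$prog" FS="=" SnR replacement)

if [ "$with_option" = "$with_operand" ]; then
    echo "both forms agree"
fi
printf '%s\n' "$with_option"
```

If the two outputs differ on a given system, that would point at the awk build or at the shell's quoting rather than at the program itself.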

Dear Chubler_XL,
I tried the syntax you gave and ran it with all the versions of awk I have, but they all gave the same result.
In despair, I even tried reversing the order; here is what I get:

C:\Users\XP-HOME\Desktop>awk -f snr.gk master slave
!john
john
!pup
pup
!cat
cat
!jumbo
jumbo

C:\Users\XP-HOME\Desktop>awk -f snr.gk slave master
!john=mary
john=mary
!pup=dog
pup=dog
!cat=kitten
cat=kitten

The test files are the same as given in my request:
MASTER:
john=mary
pup=dog
cat=kitten
SLAVE:
john
pup
cat
jumbo
As a last resort, while writing this, I even sorted both master and slave; the output is consistently wrong.
I am perplexed, since the syntax is absolutely correct:
Open the two files.
If match found spew it out.
What does not match flag as residue and show on screen or alternatively (as was in the earlier script) store in file.
One of the mysteries of Awk which remains insoluble.
Many thanks once again for all your kind help.

Looks like you're getting both the true and false conditions; perhaps put { } around the conditions:

awk -F= 'NR==FNR{A[$1]=$2;next}{if($0 in A) { print $0"="A[$0] } else { print "!"$0}}' SnR replacement
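Since the program is being run from a file with `awk -f`, one unconfirmed guess is worth checking: if the one-liner was pasted into the file together with its command-line tail (`FS="=" SnR replacement`), awk parses that tail as an extra rule, a pattern with no action, which prints every record it reaches. It also means the separator is not yet "=" while the first file is read, so no key ever matches. The sketch below reproduces the doubled output under that assumption and shows a program file that sets the separator in a BEGIN block instead. The file names pasted.awk and fixed.awk are hypothetical, and the real contents of snr.gk are not shown in the thread.

```shell
# Work in a scratch directory (an assumption, not part of the thread).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'john=mary\npup=dog\ncat=kitten\n' > master
printf 'john\npup\ncat\njumbo\n' > slave

# Hypothetical program file: the one-liner pasted with its command-line tail.
# The tail becomes a third rule whose value is the non-empty string "=",
# so it is true for every record and triggers the default print.
cat > pasted.awk <<'EOF'
NR==FNR{A[$1]=$2;next}{if($0 in A) print $0"="A[$0]; else print "!"$0} FS="=" SnR replacement
EOF

# Program file holding only the program; the separator is set in BEGIN,
# so neither -F nor FS= is needed on the command line.
cat > fixed.awk <<'EOF'
BEGIN { FS = "=" }
NR == FNR { A[$1] = $2; next }
{ if ($0 in A) print $0 "=" A[$0]; else print "!" $0 }
EOF

doubled=$(awk -f pasted.awk master slave)
clean=$(awk -f fixed.awk master slave)

printf 'pasted.awk:\n%s\n\nfixed.awk:\n%s\n' "$doubled" "$clean"
```

Under this assumption pasted.awk prints each slave line twice, once with ! and once plain, while fixed.awk prints the three mappings and !jumbo. Keeping the separator inside the program file also sidesteps command-line quoting, which may matter under a DOS prompt.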

Hello,
No luck, as you can see below. I tried both:

C:\Users\XP-HOME\Desktop>awk -f snr.gk master slave
!cat
cat
!john
john
!jumbo
jumbo
!pup
pup

C:\Users\XP-HOME\Desktop>awk -f snr.gk master slave
!cat
cat
!john
john
!jumbo
jumbo
!pup
pup
I suppose it is because of awk limitations under Windows. Incidentally, different awk versions for Windows give different results. I will try to download the latest awk and get back to you.

Many thanks once more for your interest.