Problem updating a file with new entries

pitccorp01 · October 26, 2009, 12:37am

Hi,

I have a file, FileA, that consists of two fields. Field 1 has an old object name, and field 2 has a new object name. I would like to search and replace in a master file, FileB, substituting the old object name (field 1 in FileA) with the new object name (field 2 in FileA).

FileA:

old_name1 new_name1
old_name2 new_name2
old_name3 new_name3

FileB:

Employee name: old_name1
Employee name: old_name2
Employee name: old_name3

#! /bin/sh -vx
for i in FileA
do
OLD=`echo $i | awk '{print $1}'`
NEW=`echo $i | awk '{print $2}'`

sed "s/$OLD/$NEW/g" FileB >> FileB.new
done

The above code does not evaluate the $NEW variable. Can someone please assist?

Thanks,
Anthony

sank · October 26, 2009, 1:11am

Looks like there is a problem in the for loop; why don't you change the for loop to the following:

while read i
do
 old=`echo $i | awk '{print $1}'`
 new=`echo $i | awk '{print $2}'`
 # run the sed substitution here, using $old and $new
done < FileA

pitccorp01 · October 26, 2009, 1:33am

Thanks for your assistance. After making the changes that you suggested, I am getting the correct substitution.

Best regards,
Anthony

Scrutinizer · October 26, 2009, 2:34am

Hi, instead of:

for i in fileA; do

for this to work you would need:

for i in $(cat fileA); do

or better, use a while loop and then you do not need the awks:

while read old new; do
  sed -n "s/$old$/$new/p" FileB
done < FileA > FileB.new
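
To make the difference between the three loop forms concrete, here is a minimal sketch. The scratch directory and the two-line sample FileA are assumptions made up for the demonstration. Note that `$(cat FileA)` is split on whitespace, so it hands the loop one word at a time rather than one line, which is another reason to prefer `while read`.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data, for illustration only.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'old_name1 new_name1' 'old_name2 new_name2' > FileA

# 1. The loop variable receives the literal word "FileA", not its contents.
for i in FileA; do
  literal=$i
done

# 2. $(cat FileA) is split on whitespace, so each WORD is one iteration.
words=0
for i in $(cat FileA); do
  words=$((words + 1))
done

# 3. read splits each LINE into the named variables.
pairs=0
while read -r old new; do
  pairs=$((pairs + 1))
  last="$old -> $new"
done < FileA

echo "literal=$literal words=$words pairs=$pairs last=$last"
```

With the first form, `echo $i | awk '{print $2}'` sees only the single word FileA, so the second field is empty. That would explain why $NEW appeared not to be evaluated in the original script.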

If your files are not too massive you could use a faster solution:

awk 'NR==FNR {A[$1]=$2;next} {$3=A[$3]} 1' FileA FileB > FileB.new

or use ksh which also has associative arrays:

#!/bin/ksh
typeset -A newname;
while read old new; do
 newname[$old]=$new
done < FileA

while read field1a field1b empl;do
  echo "$field1a $field1b ${newname[$empl]}"
done < FileB > FileB.new

There could be a problem with this setup if any of those employees have spaces in their names. In that case I think you may need to use a different field separator in the files instead of a space, and put double quotes around the variables.

pitccorp01 · October 26, 2009, 3:15am

Thanks again for the input. How can I use the vi command inside a shell script (/bin/sh) to make the substitutions and save the updates to a new file? I am having problems with the format when using sed.

Thanks,
Anthony

Scrutinizer · October 26, 2009, 5:44am

Did you try the sed I suggested?

sed -n "s/$old$/$new/p"

Otherwise you'll end up with a very long file.

---------- Post updated 10-26-09 at 01:44 AM ---------- Previous update was 10-25-09 at 11:29 PM ----------

Oops,
These solutions would only work if every name in FileB needs to get changed. If some of the names have to remain unchanged then we need a different solution.
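
A small sketch of what goes wrong, assuming a made-up FileB in which one name (keep_me, hypothetical) has no entry in FileA. The first awk one-liner blanks that name, and the `sed -n ... p` loop drops the line altogether.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data, for illustration only.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'old_name1 new_name1' > FileA
printf '%s\n' 'Employee name: old_name1' 'Employee name: keep_me' > FileB

# Unconditional assignment: a name missing from FileA maps to "".
blanked=$(awk 'NR==FNR {A[$1]=$2; next} {$3=A[$3]} 1' FileA FileB)

# sed -n with the p flag prints only lines where a substitution happened.
dropped=$(while read -r old new; do
  sed -n "s/$old\$/$new/p" FileB
done < FileA)

printf '%s\n---\n%s\n' "$blanked" "$dropped"
```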

pitccorp01 · October 26, 2009, 2:34pm

Actually, I need to keep the contents of the original file intact, changing only the names that the sed substitutions match.

Thanks,
Anthony

Scrutinizer · October 26, 2009, 5:38pm

OK in that case, if your sed has the -i option you could do something like:

#!/bin/sh
cp -p FileB FileB.new
while read old new; do
  sed -i "s/$old$/$new/" FileB.new
done < FileA

This is a tad expensive, since it rewrites FileB.new once for every line of FileA, but I think less expensive than two read loops...

With awk it is more efficient and faster too.

awk 'NR==FNR {A[$1]=$2;next} A[$3]{$3=A[$3]} 1' FileA FileB > FileB.new

And alternatively with ksh it is efficient and fast too:

#!/bin/ksh
typeset -A newname;
while read old new; do
  newname[$old]=$new
done < FileA

while read field1a field1b empl; do
  if [[ -n ${newname[$empl]} ]]; then
    empl=${newname[$empl]}
  fi
  echo "$field1a $field1b $empl"
done < FileB > FileB.new
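
For anyone unsure how the awk one-liner in this post works, it can be unpacked roughly as follows. This is a sketch rather than the exact command above: it uses the `in` operator for the membership test (a substitution of mine, which avoids creating empty array entries), and the sample files are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data, for illustration only.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'old_name1 new_name1' 'old_name3 new_name3' > FileA
printf '%s\n' 'Employee name: old_name1' \
              'Employee name: old_name2' \
              'Employee name: old_name3' > FileB

awk '
  # First file (FileA): NR equals FNR only while the first file is read.
  NR == FNR { map[$1] = $2; next }
  # Second file (FileB): swap field 3 only when it is a known old name.
  # "in" tests membership without creating an empty array entry.
  $3 in map { $3 = map[$3] }
  # Print every line, changed or not.
  { print }
' FileA FileB > FileB.new

cat FileB.new
```

Here old_name2 has no entry in FileA, so its line passes through unchanged. One caveat: the NR == FNR test assumes FileA is not empty.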

pitccorp01 · October 27, 2009, 1:26am

I like the awk approach. How can I use the awk statement for FileA that is 50 lines long?

Scrutinizer · October 27, 2009, 2:32am

That is a tiny file. You can use any method you like. Just try it out.

pitccorp01 · October 30, 2009, 4:13pm

Ok. I used the awk statement (nice!), but it only changes the first occurrence of the old name to the new name. Any suggestions?

Thanks,
Anthony

Scrutinizer · October 30, 2009, 11:00pm

Hi pitccorp01, what is the format of fileB then? I thought it was:

Employee name: old_name1
Employee name: old_name2
Employee name: old_name3

In which case you only need one replacement since there is only one occurrence per line.

pitccorp01 · October 30, 2009, 11:26pm

My apologies! I gave you the condensed version of FileB. Here is the format of FileB in its entirety:

/------------------Employee name 1 ---------------------/
Employee name1:
Address:
City State:
Telephone Number:
Cellular Number:
Department:
Manager:

/------------------Employee name 2---------------------/
Employee name2:
Address:
City State:
Telephone Number:
Cellular Number:
Department:
Manager:

The above is repeated for each subsequent employee.

FileA contains which employees will be substituted for new employees.

Thanks,
Anthony

Scrutinizer · October 30, 2009, 11:55pm

But that is still only one field per line. I don't understand why that would not work or what would not work. Can you show me an anonymized sample of what it does to your input? Can you give me an anonymized sample of FileA as well?

pitccorp01 · October 31, 2009, 9:00am

The awk statement only updates the first line. For example:

/------------------John Doe (new name)---------------------/
Employee name1: Tom Smith (old name)
Address:
City State:
Telephone Number:
Cellular Number:
Department:
Manager:

Scrutinizer · October 31, 2009, 10:00am

OK, so both of your input files are different than originally specified, and the names contain spaces. I wonder what your FileA looks like, since you can't use a space as a separator because the number of fields will vary (e.g. my real name is S. C. Rut in Izer, which is a very common name where I live). Instead you could use a colon or some other field separator, like so:

old_name1:new_name1
old_name3:new_name3
Tom Smith:John Doe

You can then use the separator in the script to read the old and the new name. The following should work:

cp -p FileB FileB.new
while IFS=: read -r old new; do
  sed -i "s/$old/$new/" FileB.new
done < FileA
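
One thing to watch with the sed loop above: sed treats the old name as a regular expression, and without the g flag it replaces only the first match on each line. As a possible alternative, here is a sketch in awk that replaces every literal occurrence in a single pass. It assumes the colon-separated FileA shown above. The array names and sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data, for illustration only.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Tom Smith:John Doe' > FileA
printf '%s\n' '/--- Tom Smith ---/' \
              'Employee name1: Tom Smith' \
              'Address:' > FileB

awk -F: '
  # FileA: remember each old/new pair, skipping lines with an empty old name.
  NR == FNR { if ($1 != "") { old[++n] = $1; repl[n] = $2 }; next }
  # FileB: work on the whole line ($0), so its own colons do not matter.
  {
    line = $0
    for (i = 1; i <= n; i++) {
      out = ""
      # index() finds a literal substring, so no regex escaping is needed.
      while ((p = index(line, old[i])) > 0) {
        out = out substr(line, 1, p - 1) repl[i]
        line = substr(line, p + length(old[i]))
      }
      line = out line
    }
    print line
  }
' FileA FileB > FileB.new

cat FileB.new
```

This matches substrings, so an old name that is part of a longer name would be replaced too. Pairs are also applied in FileA order, so a new name that equals a later old name would be replaced again.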

edidataguy · October 31, 2009, 9:48pm

Is this homework?
A good programmer would not take this approach.
Usually you would key the records on an employee ID.
This will be a disaster if more than one employee has the same name.

Master file1:

Jack  Jim
Tom   Tim
Red   Rose
Steve Stan
Jack  John

Data file2:

Jack
Tom
Jack
Red
Steve
Jack

All the Jacks will be replaced with John (being the last entry in file1).
Jim will be nowhere in the picture.
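
The effect is easy to reproduce with the array-based awk approach from earlier in the thread, adapted here to the single-field file2. This is a sketch using the sample data above. The second awk command is one possible pre-flight check, not something proposed in the thread.

```shell
#!/bin/sh
# Hypothetical scratch directory; the data is the sample from the post above.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' 'Jack Jim' 'Tom Tim' 'Red Rose' 'Steve Stan' 'Jack John' > file1
printf '%s\n' Jack Tom Jack Red Steve Jack > file2

# Array approach: the later "Jack John" overwrites the earlier "Jack Jim".
result=$(awk 'NR==FNR {A[$1]=$2; next} $1 in A {$1=A[$1]} 1' file1 file2)

# Pre-flight check: report each old name on its second appearance in file1.
dups=$(awk 'seen[$1]++ == 1 {print $1}' file1)

printf '%s\n' "$result"
echo "duplicate old names: $dups"
```

Running a check like this before the substitution at least flags the ambiguity, though a unique employee ID remains the more robust fix.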