Map new file to old file

Hello,

Some columns of a file have been renamed between the old and the new format.
To run some of my scripts, I need to rename and rearrange these columns from the new format back to the old one.

oldfile has the following column headers (the data is not important):

colA	colB	colC	colD	colE
10	11	12	13	15
14	14	14	14	16

newfile looks like the following

col1 	col2 	col3	col4	col5	col6
1	2	3	4	9	45
5	6	7	8	9	46

There is a names table that tells us which new name maps to which old name:

names

col5	colA
col4	colC
col2 	colD
col1	colB

I need to make newfile look like oldfile by rearranging and renaming the columns. Extra columns in newfile should be appended to the end, and extra columns in oldfile should be ignored. The expected output:

colA	colB	colC	colD	col3	col6
9	1	4	2	3	45
9	5	8	6	7	46

Here is my attempt; I'm struggling to rearrange the columns. Please assist.

awk '  FILENAME=="names" { n[$1]=$2;next }
       FILENAME=="oldfile"  && NR==1 { for (i=1;i<=NF;i++) o[i]=$i ; next }
       FILENAME=="newfile" && NR==1 {  s=x; for (i=1;i<=NF;i++) { 
       						for ($i in n) {
       							$i=n[$i] }
       						s=s FS n[$i]
       						{
       				       print s
       				     }' names oldfile newfile
       				     

Quite a complex problem. Here is my solution:

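awk '
FNR==1 { file++ }                      # count input files as each one starts
file==1 { new[$1]=$2; next }           # names: new name -> old name
file==2 {                              # oldfile: only the header order matters
  if(FNR==1) for(i=1;i<=NF;i++) old[i]=$i
  next
}
file==3&&FNR==1 {                      # newfile header: work out the column order
    for(i=1;i<=NF;i++)
        if($i in new) {
            check=h
            for(j=1;j in old;j++) if(old[j]==new[$i]) { pos[j]=i; h++}
            if(check==h) {
                print "Error: cannot update field \"" $i "\": field \"" new[$i] "\" not found in oldfile" > "/dev/stderr"
                exit 1
            }
        }
    for(i=1;i<=NF;i++)
        if($i in new) $i=new[$i]       # rename mapped columns
        else pos[++h]=i                # unmapped columns go to the end
}
{                                      # every newfile line, header included
   out=$(pos[1])
   for(i=2;i<=NF;i++) out= out FS $(pos[i])
   print out
}' names oldfile newfile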

Edit: Added the check=h / if(check==h) test. This is just for safety: if you list a field in "names" and it cannot be found in "oldfile", we report a problem and abort. Without this check you get a corrupt output file with no error reported.
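
For anyone who wants to try this on the sample data, here is a minimal sketch of a variant, not the script above. It assumes whitespace-separated input with no spaces inside names or values, and it writes tab-separated output. The function name remap, the array names, and the scratch directory are made up for illustration. It walks the oldfile header in order, so an unmapped oldfile column in the middle of the header should not leave a gap, and it keeps the same safety check for a "names" entry whose old name is missing from oldfile.

```shell
#!/bin/sh
# Recreate the sample inputs from this thread in a scratch directory.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
printf 'col5\tcolA\ncol4\tcolC\ncol2\tcolD\ncol1\tcolB\n' > "$dir/names"
printf 'colA\tcolB\tcolC\tcolD\tcolE\n10\t11\t12\t13\t15\n14\t14\t14\t14\t16\n' > "$dir/oldfile"
printf 'col1\tcol2\tcol3\tcol4\tcol5\tcol6\n1\t2\t3\t4\t9\t45\n5\t6\t7\t8\t9\t46\n' > "$dir/newfile"

# remap NAMES OLDFILE NEWFILE  (hypothetical helper, not part of any tool)
remap() {
  awk -v OFS='\t' '
    FNR == 1 { file++ }
    # names: first field is the new name, second field is the old name
    file == 1 { new2old[$1] = $2; old2new[$2] = $1; next }
    # oldfile: remember the header order only
    file == 2 { if (FNR == 1) { nold = NF; for (i = 1; i <= NF; i++) old[i] = $i }; next }
    # newfile header: decide the output order once
    FNR == 1 {
      for (i = 1; i <= NF; i++) col[$i] = i
      # mapped columns first, in oldfile order
      for (j = 1; j <= nold; j++)
        if ((old[j] in old2new) && (old2new[old[j]] in col)) {
          i = col[old2new[old[j]]]
          pos[++n] = i; name[n] = old[j]; used[i] = 1
        }
      # then the remaining newfile columns, in their original order
      for (i = 1; i <= NF; i++) {
        if (used[i]) continue
        if ($i in new2old) {
          print "Error: field \"" new2old[$i] "\" not found in oldfile" > "/dev/stderr"
          exit 1
        }
        pos[++n] = i; name[n] = $i
      }
      for (k = 1; k <= n; k++) printf "%s%s", name[k], (k < n ? OFS : ORS)
      next
    }
    # newfile data lines: print the fields in the chosen order
    { for (k = 1; k <= n; k++) printf "%s%s", $(pos[k]), (k < n ? OFS : ORS) }
  ' "$1" "$2" "$3"
}

remap "$dir/names" "$dir/oldfile" "$dir/newfile"
```

With the sample files this should print the header colA colB colC colD col3 col6 followed by the two reordered data rows, separated by tabs. If your real data contains spaces inside fields, you would need to set the input separator explicitly (for example -F'\t') and make sure "names" has no stray spaces.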
