Awk/sed to replace variable in file

Hi All

I have one file with multiple lines. Each line contains static text plus some variables enclosed in <<filename>>, for example:

123, <<file1.txt>> this is my name, I stay at <<city.txt>> Thanks for visiting
348384y,  this is my name <<fileabc.txt>>, I stay at near the mall of <<cityxyz.txt>> welcome

Each variable names a file whose contents look like this:

cat file1.txt
scott
thomas
cat city.txt
CHICAGO 
LA
TAT

I am looking for code that prints each line of the main file with its variables replaced by the values from their files, as a Cartesian product, like this:

123, scott this is my name, I stay at CHICAGO Thanks for visiting
123, scott this is my name, I stay at LA Thanks for visiting
123, scott this is my name, I stay at TAT Thanks for visiting
123, thomas this is my name, I stay at CHICAGO Thanks for visiting
123, thomas this is my name, I stay at LA Thanks for visiting
123, thomas this is my name, I stay at TAT Thanks for visiting
The second line should be expanded the same way, using the values from the files named by its variables.

Is there a quicker or simpler way to do this in awk, sed, perl, or shell scripting?

Thanks
rel

Any attempts / ideas / thoughts from your side?

Hi Rudi

My idea is to do it with a shell script: read the file line by line, find the first variable in the line, look up the file it names, and write one copy of the full line per record of that file, with the variable replaced by the record's value, to a temporary file. Then read the temporary file in a loop, replace the second variable in each line the same way, and write the final output file. That is possible, but it gets lengthy and difficult when there are more than two such variables, which is why I am looking for a more direct and simple way, if one exists.
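
The multi-pass idea described above could be sketched roughly as follows. This is not the poster's script (which was never shown in the thread); the function name `expand_passes` and the temporary-file handling are assumptions made for illustration. Each pass replaces the first remaining <<file>> variable on every line, and the passes repeat until no variable is left, so the number of variables per line does not need to be known in advance.

```shell
# Hypothetical helper: expand_passes MAINFILE
# Assumes each <<name>> refers to a readable file in the current directory
# and that the values themselves contain no <<...>> text.
expand_passes() {
  cur=$(mktemp) && nxt=$(mktemp) || return 1
  cp "$1" "$cur"
  # Keep making passes while any variable is left in the working file.
  while grep -q '<<[^>]*>>' "$cur"; do
    : > "$nxt"
    while IFS= read -r line; do
      case $line in
        *'<<'*'>>'*)
          pre=${line%%'<<'*}      # text before the first variable
          rest=${line#*'<<'}      # everything after the first "<<"
          fn=${rest%%'>>'*}       # file name inside the variable
          post=${rest#*'>>'}      # text after the first variable
          # One output line per record of the value file.
          while IFS= read -r val; do
            printf '%s%s%s\n' "$pre" "$val" "$post"
          done < "$fn" >> "$nxt"
          ;;
        *) printf '%s\n' "$line" >> "$nxt" ;;
      esac
    done < "$cur"
    mv "$nxt" "$cur"
  done
  cat "$cur"
  rm -f "$cur" "$nxt"
}
```

Called as `expand_passes Input_file`, it prints the expanded lines to standard output. A line whose value file cannot be read is dropped in this sketch, which may or may not be the behaviour you want.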

Hello reldb,

Could you please try the following?

awk '
FILENAME=="file1.txt"{
  name[FNR]=$0
  next
}
FILENAME=="city.txt"{
  city[++count]=$0
  next
}
{
  val=$0
  for(i=1;i<=count;i++){
     sub(/<<file1.txt>>/,name[FNR])
     sub(/<<city.txt>>/,city[i])
     print
     $0=val
  }
}' file1.txt   city.txt   Input_file 

Output will be as follows.

123, scott this is my name, I stay at CHICAGO  Thanks for visiting
123, scott this is my name, I stay at LA Thanks for visiting
123, scott this is my name, I stay at TAT Thanks for visiting
348384y,  this is my name thomas, I stay at near the mall of CHICAGO  welcome
348384y,  this is my name thomas, I stay at near the mall of LA welcome
348384y,  this is my name thomas, I stay at near the mall of TAT welcome

Thanks,
R. Singh

Try also

awk '
BEGIN   {PAT = "<<[^>]*>>"
        }

        {match ($0, PAT)
         NXT = RSTART + RLENGTH
         while (0 < getline T1[++CNT1] < (substr ($0, RSTART+2, RLENGTH-4)) );
         match (substr ($0, NXT), PAT)
         while (0 < getline T2[++CNT2] < (substr ($0, NXT+RSTART+1, RLENGTH-4)) ) ;
         for (i=1; i<CNT1; i++)
          for (j=1; j<CNT2; j++)        {TMP = $0
                                         sub (PAT, T1[i], TMP)
                                         sub (PAT, T2[j], TMP)
                                         print TMP
                                        }
         CNT1 = CNT2 = 0
        }
' file
123, scott this is my name, I stay at CHICAGO  Thanks for visiting
123, scott this is my name, I stay at LA Thanks for visiting
123, scott this is my name, I stay at TAT Thanks for visiting
123, thomas this is my name, I stay at CHICAGO  Thanks for visiting
123, thomas this is my name, I stay at LA Thanks for visiting
123, thomas this is my name, I stay at TAT Thanks for visiting
348384y,  this is my name scott, I stay at near the mall of CHICAGO  welcome
348384y,  this is my name scott, I stay at near the mall of LA welcome
348384y,  this is my name scott, I stay at near the mall of TAT welcome
348384y,  this is my name thomas, I stay at near the mall of CHICAGO  welcome
348384y,  this is my name thomas, I stay at near the mall of LA welcome
348384y,  this is my name thomas, I stay at near the mall of TAT welcome
 

Be aware that there's no error handling yet, e.g. if one of the filenames in a line is not found, or there are not exactly two patterns in a line.

awk -F'<<|>>' '{while((getline d<$2) > 0){while((getline b<$4) > 0) print $1 d $3 b $4; close($4)}}'

Hi Rudi
Thanks for sharing the detail
I tried your solution, and it works for lines that contain exactly two variables.
Some lines in the main file contain no variable, or only one variable instead of two, and for those it gives an error.
Any suggestions for handling that?

Thanks a lot
Rel

A little clumsy, as you have to take care of every single special case; it may benefit from some polishing:

awk '
BEGIN   {PAT = "<<[^>]*>>"
        }

        {match ($0, PAT)
         NXT = RSTART + RLENGTH
         FN = substr ($0, RSTART+2, RLENGTH-4)
         while (0 < getline T1[++CNT1] < FN );
         close (FN)
         match (substr ($0, NXT), PAT)
         FN = substr ($0, NXT+RSTART+1, RLENGTH-4)
         while (0 < getline T2[++CNT2] < FN );
         close (FN)

         if (CNT1==1)   print                           # no variable found (or its file is unreadable)
         else           for (i=1; i<CNT1; i++)
                          if (CNT2>1) for (j=1; j<CNT2; j++)    {TMP = $0
                                                                 sub (PAT, T1[i], TMP)
                                                                 sub (PAT, T2[j], TMP)
                                                                 print TMP
                                                                }
                          else                                  {TMP = $0               # only one variable in this line
                                                                 sub (PAT, T1[i], TMP)
                                                                 print TMP
                                                                }
         CNT1 = CNT2 = 0
        }
' file
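
The special cases above (no variable, one variable, two variables) could also be folded into a single recursive awk function, which then copes with any number of variables per line. The following is only a sketch under assumptions: the wrapper name `expand_vars` and the choice to keep a placeholder unchanged (with a warning on stderr) when its file is unreadable or empty are not from the thread.

```shell
# Hypothetical helper: expand_vars MAINFILE
# Assumes each <<name>> refers to a file readable from the current directory.
expand_vars() {
  awk '
  # done: text already expanded; rest: text still to be scanned.
  function expand(done, rest,    start, len, fn, rec, n, vals, k) {
    if (!match(rest, /<<[^>]*>>/)) { print done rest; return }
    start = RSTART; len = RLENGTH          # save before recursing
    fn = substr(rest, start + 2, len - 4)  # name between << and >>
    n = 0
    while ((getline rec < fn) > 0) vals[++n] = rec
    close(fn)                              # so the file can be re-read later
    if (n == 0) {
      # Unreadable or empty file: warn and keep the placeholder text.
      printf("no values for <<%s>>\n", fn) > "/dev/stderr"
      expand(done substr(rest, 1, start + len - 1), substr(rest, start + len))
      return
    }
    # One recursive call per value; the remainder is expanded in turn.
    for (k = 1; k <= n; k++)
      expand(done substr(rest, 1, start - 1) vals[k], substr(rest, start + len))
  }
  { expand("", $0) }
  ' "$1"
}
```

Because already-substituted text is carried in `done` and never rescanned, a value that happens to contain `<<...>>` is printed literally rather than expanded again.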

@nezabudka: nice, witty solution! But - shouldn't it be print $1 d $3 b $5 in lieu of what is in your post?
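
For reference, here is a sketch of that one-liner with the `$5` correction applied and wrapped in a shell function. The wrapper name `expand_two` and the extra `close($2)` are assumptions added here, not part of the original post. It still expects exactly two variables on every line, naming two different files.

```shell
# Hypothetical helper: expand_two MAINFILE
# With -F"<<|>>" the fields are: $1 text, $2 first file, $3 text,
# $4 second file, $5 trailing text.
expand_two() {
  awk -F'<<|>>' '{
    while ((getline d < $2) > 0) {
      while ((getline b < $4) > 0) print $1 d $3 b $5
      close($4)        # rewind the inner file for the next outer value
    }
    close($2)          # let later lines re-read the same outer file
  }' "$1"
}
```

Usage would be `expand_two Input_file`. Lines with fewer than two variables produce no output in this form.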


Thanks All
I was able to write my code as a shell script with a loop, and it works as expected.
I will share that code.