I have a file with multiple lines. Each line contains static text plus some variables enclosed as <<filename>>, for example:
123, <<file1.txt>> this is my name, I stay at <<city.txt>> Thanks for visiting
348384y, this is my name <<fileabc.txt>>, I stay at near the mall of <<cityxyz.txt>> welcome
Each variable names a file whose contents look like this:
cat file1.txt
scott
thomas
cat city.txt
CHICAGO
LA
TAT
I am looking for code that prints each line of the file with every variable replaced by the values from its file, as a Cartesian product, like this:
123, scott this is my name, I stay at CHICAGO Thanks for visiting
123, scott this is my name, I stay at LA Thanks for visiting
123, scott this is my name, I stay at TAT Thanks for visiting
123, thomas this is my name, I stay at CHICAGO Thanks for visiting
123, thomas this is my name, I stay at LA Thanks for visiting
123, thomas this is my name, I stay at TAT Thanks for visiting
The second line should be expanded the same way, with the values taken from the file named by each variable.
Is there a quicker/simpler way to do this in awk/sed/perl or shell scripting?
I have an idea for doing it in shell: read the file line by line, find the first variable in the line, look up the file it names, and write one copy of the line per record in that file, with the variable replaced, to a new temp file. Then read that temp file in a loop, replace the second variable in each line, and create the final output file. That is possible, but it would be lengthy and difficult if there are more than 2 such variables. That's why I am looking for a straightforward and simple way, if possible.
123, scott this is my name, I stay at CHICAGO Thanks for visiting
123, scott this is my name, I stay at LA Thanks for visiting
123, scott this is my name, I stay at TAT Thanks for visiting
348384y, this is my name thomas, I stay at near the mall of CHICAGO welcome
348384y, this is my name thomas, I stay at near the mall of LA welcome
348384y, this is my name thomas, I stay at near the mall of TAT welcome
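awk '
BEGIN {PAT = "<<[^>]*>>"                       # a variable looks like <<filename>>
}
{match ($0, PAT)                               # first variable in the line
 NXT = RSTART + RLENGTH                        # position just past it
 F1 = substr ($0, RSTART+2, RLENGTH-4)         # its file name, without << and >>
 while (0 < (getline T1[++CNT1] < F1)) ;       # load the values; CNT1 ends one too high
 close (F1)                                    # so the file can be reread for a later line
 match (substr ($0, NXT), PAT)                 # second variable, RSTART relative to NXT
 F2 = substr ($0, NXT+RSTART+1, RLENGTH-4)
 while (0 < (getline T2[++CNT2] < F2)) ;
 close (F2)
 for (i=1; i<CNT1; i++)                        # hence "<" rather than "<="
   for (j=1; j<CNT2; j++) {TMP = $0
                           sub (PAT, T1[i], TMP)
                           sub (PAT, T2[j], TMP)
                           print TMP
                          }
 CNT1 = CNT2 = 0
}
' file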
123, scott this is my name, I stay at CHICAGO Thanks for visiting
123, scott this is my name, I stay at LA Thanks for visiting
123, scott this is my name, I stay at TAT Thanks for visiting
123, thomas this is my name, I stay at CHICAGO Thanks for visiting
123, thomas this is my name, I stay at LA Thanks for visiting
123, thomas this is my name, I stay at TAT Thanks for visiting
348384y, this is my name scott, I stay at near the mall of CHICAGO welcome
348384y, this is my name scott, I stay at near the mall of LA welcome
348384y, this is my name scott, I stay at near the mall of TAT welcome
348384y, this is my name thomas, I stay at near the mall of CHICAGO welcome
348384y, this is my name thomas, I stay at near the mall of LA welcome
348384y, this is my name thomas, I stay at near the mall of TAT welcome
Be aware that there's no error handling yet, e.g. if one of the filenames in a line is not found, or there are not exactly two patterns in a line.
Hi Rudi
Thanks for sharing the details.
I tried your solution, and it worked for lines where a variable occurs exactly 2 times.
There are lines in the main file where no variable exists, or only one variable exists instead of two, and then it gives an error.
Any suggestions for handling that?
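One possible way to cope with zero, one, or more than two variables per line is to expand them recursively instead of hard-coding two of them. The following is only a sketch under some assumptions, not a tested drop-in replacement: the function names expand_templates and expand are made up for illustration, a variable is taken to be any non-empty <<...>> without angle brackets inside, and a variable whose file is missing or empty is simply left in the output as it is.

```shell
# Hypothetical helper: expand every <<filename>> in the given file(s)
# into the Cartesian product of the lines of the named files.
expand_templates() {
    awk '
    # done = part of the line that is already expanded
    # rest = part of the line that is still to be scanned
    function expand(done, rest,    tok, name, tail, n, i, vals, val) {
        if (!match(rest, /<<[^<>]+>>/)) {      # no variable left: emit the line
            print done rest
            return
        }
        tok  = substr(rest, RSTART, RLENGTH)   # the whole <<filename>> token
        name = substr(tok, 3, RLENGTH - 4)     # the file name inside it
        done = done substr(rest, 1, RSTART - 1)
        tail = substr(rest, RSTART + RLENGTH)
        n = 0
        while ((getline val < name) > 0)       # read all values of this variable
            vals[++n] = val
        close(name)                            # allow the file to be read again later
        if (n == 0) {                          # missing or empty file: keep the token
            expand(done tok, tail)
            return
        }
        for (i = 1; i <= n; i++)               # one output branch per value
            expand(done vals[i], tail)
    }
    { expand("", $0) }
    ' "$@"
}

# Usage (assuming the main file is called "file"):
#   expand_templates file
```

Lines without any variable are printed unchanged, and lines with a single variable produce one output line per value. The values are joined by plain concatenation rather than sub(), so a value containing & is not treated specially, and a value that itself contains <<...>> is not expanded again.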