Hi everyone,
I've been working on the code below all day. Maybe an awk expert could help me fix the for loop I came up with;
I think I'm very close to the correct output.
file1 is:
<boxes content="Grapes and Apples">
<box No.="Box MT. 53">
<quantity f="4">Grapes</quantity>
<quantity f="8">Apples</quantity>
</box>
<box No.="Box MJ 62">
<quantity f="7">Grapes</quantity>
<quantity f="12">Apples</quantity>
</box>
</boxes>
file2 is:
<some text...>
<some text...>
<f><v>Begin</v></f>
<f><v>Prod No</v></f>
<f><v>Serial</v></f>
<f><v>Grapes and Apples</v></f>
<f><v>Begin 1</v></f>
<f><v>Box MT. 53</v></f>
<f><v>XMT. 5563</v></f>
<f><v>Begin 2</v></f>
<f><v>Box MJ 62</v></f>
<f><v>JJKD. 772</v></f>
<f><v>Apples</v></f>
<f><v>Grapes</v></f>
</abc>
My code so far is:
#Arr1: array storing the info for the first block. Don't pay attention to this array.
#Arr3: array for the first line of blocks 2 and 3 (stores the unique fruit names from file1,
#Apples and Grapes). Apples and Grapes appear in alphabetical order in file2.
#Arr5: array for the values of each block, taken from the f attributes in file1.
awk 'BEGIN{ B = 66 }
FNR==NR{
    if ($0 ~ "box No.=")
        {Arr1[FNR]=gensub(/^[^"]+"|".+$/,"","g");asorti(Arr1,Arr2)}
    else if ( $0 ~ "quantity f=" )
        {Arr3[gensub(/.+">|<.+$/,"","g")];asorti(Arr3,Arr4)
        Arr5[FNR]=gensub(/^[^"]+"|".+$/,"","g");asorti(Arr5,Arr6)
        }
    next
}
{
    ############### for loop to generate blocks #####################
    for ( j=2;j<=length(Arr3)+1;j++ ) { # Loop to generate blocks 2 and 3, which is why j starts at 2.
        if($0 ~ ">"Arr4[j-1]"<") {
            {printf("<begin \"%d\" >\n\t<b ln=\"A%d\" t=\"s\"><v>%d</v></b>\n", j,j,FNR);} # Print the first line of each block
            for ( k=(j-1);k<=(j-2)+length(Arr5);k=k+length(Arr1) ) { # Loop to print the rest of the values related to each fruit
                if ( k < length(Arr5)/length(Arr1) ) {
                    printf("\t<b ln=\"%c%d\"><v>%d</v></b>\n", B, j, Arr6[k]); # Print the value
                    B++
                }
                else {
                    printf("\t<b ln=\"%c%d\"><v>%d</v></b>\n</begin>", B, j, Arr6[k]); # Print the last line of each block
                    B=66 # Reset B to 66, the decimal ASCII code of the letter B.
                }
            }
        }
    }
}' file1 file2
The for loop is meant to generate blocks 2, 3, ..., N of the output (only blocks 2 and 3 in this sample). Each of these blocks holds one
unique fruit from file1 together with its values. Block 2 is for Apples and contains its values from file1 (8 and 12); block 3 is for
Grapes and contains its values from file1 (4 and 7).
- Apples comes before Grapes alphabetically, so block 2 is for Apples and block 3 is for Grapes.
- Within each fruit block, the values must appear in the same order as in file1, e.g. 8 and then 12 for Apples, not 12 and then 8.
I'm getting this output:
<begin "2" >
<b ln="A2" t="s"><v>13</v></b>
<b ln="B2"><v>3</v></b>
<b ln="C2"><v>7</v></b>
</begin><begin "3" >
<b ln="A3" t="s"><v>14</v></b>
<b ln="B3"><v>4</v></b>
</begin> <b ln="B3"><v>8</v></b>
</begin>
and the correct output should be:
<begin ln="2" >
<b ln="A2" t="s"><v>13</v></b>
<b ln="B2"><v>8</v></b>
<b ln="C2"><v>12</v></b>
</begin>
<begin ln="3" >
<b ln="A3" t="s"><v>14</v></b>
<b ln="B3"><v>4</v></b>
<b ln="C3"><v>7</v></b>
</begin>
The first line of each block holds the line number at which the fruit appears in file2, e.g. Apples appears on line 13 of file2 and Grapes on line 14.
Could someone fix my for loop? I'm stuck on printing the values of each fruit block in the correct order.
PS: I have another for loop (not shown) that generates the first block, so it would be great if the solution could be combined with that first loop.
Many thanks in advance.