Script to change the file content using some conditions

Hello,

I would like to change the content of a file that has a few blocks, each starting with a line that begins with 10 and ending with a ";" line.

File1.txt

10 bonuses D
   20 MATCHED
   30 UPD COL
   40 (SOL=30) 
   20 NOT MATCHED
   30 INS COL
   40 (SOL=30)
;


10 bonuses D
   20 MATCHED
   30 UPD COL
   40 (SOL=20) 
   20 NOT MATCHED
   30 INS COL
   40 (SOL=50)
;

I've tried using awk to match the blocks but was not able to get the output below. Each block has some lines starting with 40 that need to be merged into the preceding line starting with 20, joined with AND, and then the 40 line needs to be deleted.

output.txt

10 bonuses D
   20 MATCHED AND (SOL=30)
   30 UPD COL
   20 NOT MATCHED AND (SOL=30)
   30 INS COL
;


10 bonuses D
   20 MATCHED AND (SOL=20)
   30 UPD COL
   20 NOT MATCHED AND (SOL=50)
   30 INS COL
;

I'm still trying to get this working. Please share any suggestions or approaches for reaching the expected output. Thanks!

Are these sections really all formatted and separated like this? Or is it actually a lot messier?

The structure of the file is the same throughout, and the sections are separated by blank lines. Yes, the real file is much longer than the example here, but the sections are separated the same way.

Works for the data you showed:

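# POSIX leaves a multi-character RS unspecified; gawk and mawk treat it as a regex.
BEGIN {
        RS=ORS="\n\n"   # records are the blank-line-separated sections
        FS=OFS="\n"     # fields are the lines of a section
}

{
        A=B=0           # field numbers of the pending "20" and "40" lines

        for(N=1; N<=NF; N++) {
                if(!A && $N ~ /^[ \t]*20[ \t]/) A=N
                if(!B && $N ~ /^[ \t]*40[ \t]/) B=N

                if(A && B)
                {
                        # Strip the leading "40" and append the rest to the "20" line
                        sub(/^[ \t]*40[ \t]+/, "", $B);
                        $A = $A " AND " $B

                        # Delete the "40" line by shifting the later lines up
                        for(M=B; M<NF; M++) $M=$(M+1)
                        NF--;

                        A=B=0
                        N--     # re-check the line that moved into this slot
                }
        }
} 1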

$ awk -f sect.awk data

10 bonuses D
   20 MATCHED AND (SOL=30)
   30 UPD COL
   20 NOT MATCHED AND (SOL=30)
   30 INS COL
;


10 bonuses D
   20 MATCHED AND (SOL=20)
   30 UPD COL
   20 NOT MATCHED AND (SOL=50)
   30 INS COL
;

$

Another approach:

awk '{A[$1]=$0} $1==40{print A[20] " AND " $2 RS A[30]} $1!~/^[234]0$/' file

---
or

awk '{i=$1; A[i]=$0} i==40{$1=x; print A[20] " AND" $0 RS A[30]} i!~/^[234]0$/' file

---
sed approach:

sed '/^ *20 /{N; N; s/\(.*\)\(\n.*\)\n *40\(.*\)/\1 AND\3\2/;}' file

Whether this works will depend on how much the sample differs from the real-life data...
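
If the real data has more (or fewer) lines between a 20 line and its 40 line, the fixed offsets used above will not line up. Here is a minimal sketch of a variant that holds back the lines after each 20 line until the 40 line shows up. It assumes each 40 line belongs to the closest preceding 20 line in the same block; merge_blocks is just a name made up for this example, not something from the thread.

```shell
# merge_blocks [FILE...]: hypothetical helper, reads stdin when no file is given.
merge_blocks() {
    awk '
        # A "20" line starts a new held group (emit any unfinished one first).
        $1 == 20 { flush(); head = $0; n = 0; held = 1; next }

        # A "40" line completes the group: append its condition to the "20" line.
        held && $1 == 40 {
            cond = $0
            sub(/^[ \t]*40[ \t]+/, "", cond)   # drop the leading "40"
            sub(/[ \t]+$/, "", cond)           # drop trailing blanks
            print head " AND " cond
            for (i = 1; i <= n; i++) print buf[i]
            held = 0; n = 0
            next
        }

        # Block boundary reached without a "40" line: emit the group unchanged.
        held && ($1 == ";" || $1 == 10) { flush() }

        # Lines between "20" and "40" are buffered; everything else passes through.
        held { buf[++n] = $0; next }
        { print }
        END { flush() }

        function flush(   i) {
            if (!held) return
            print head
            for (i = 1; i <= n; i++) print buf[i]
            held = 0; n = 0
        }
    ' "$@"
}
```

Called as merge_blocks File1.txt. A 20 line that never gets a 40 line before the closing ";" is printed as it was.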


How about

tac file | awk '$1 == 40 {TMP = $2} $1 == 20 {print $0, "AND",  TMP} $1 !~ /^[24]0$/' | tac
10 bonuses D
   20 MATCHED AND (SOL=30)
   30 UPD COL
   20 NOT MATCHED AND (SOL=30)
   30 INS COL
;


10 bonuses D
   20 MATCHED AND (SOL=20)
   30 UPD COL
   20 NOT MATCHED AND (SOL=50)
   30 INS COL
;
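
Since several of the suggestions in this thread differ only in whitespace details (the sed version, for example, keeps a trailing blank from the 40 line), it may help to compare each candidate against the expected result automatically. A small sketch, assuming the expected output was saved as output.txt as in the first post; same_output and the EXPECTED variable are names made up for this example.

```shell
# same_output CMD [ARGS...]: hypothetical helper.
# Runs CMD and compares its output with the expected file, treating runs of
# blanks as equal and ignoring trailing blanks (diff -b).
# The expected file defaults to output.txt; set EXPECTED to override it.
same_output() {
    "$@" | diff -b - "${EXPECTED:-output.txt}"
}
```

For example, same_output awk -f sect.awk File1.txt && echo same prints "same" only when diff finds no differences; otherwise diff lists the lines that differ.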