sed a multiple line pattern that has newlines followed by text and r

Here is the text that I was to run sed on. In this text I want to insert a semi colon ';' before 'select a13.STORE_TYPE STORE_TYPE,' and after 'from ZZMR00 pa11'

Input text:

insert into ZZMQ01
select  pa11.STATE_NBR  STATE_NBR,
        pa11.STORE_TYPE  STORE_TYPE,
        pa11.WEEK_ID  WEEK_ID
from    ZZMR00  pa11



select  a13.STORE_TYPE  STORE_TYPE,
        max(a16.TYPE_DESC)  TYPE_DESC,
        a14.WEEK_ID  WEEK_ID,
        max(a17.WEEK_DESC)  WEEK_DESC,
        sum(a11.CLE_SLS_QTY)  WJXBFS1
from    STORE_DIVISION  a11
        join    LOOKUP_STORE    a12
          on    (a11.STORE_NBR = a12.STORE_NBR)
        join    RELATE_STORE_TYPE       a13
          on    (a11.STORE_NBR = a13.STORE_NBR)
        join    LOOKUP_DAY      a14
          on    (a11.CUR_TRN_DT = a14.CUR_TRN_DT)
        join    ZZMQ01  pa15
          on    (a12.STATE_NBR = pa15.STATE_NBR and
        a13.STORE_TYPE = pa15.STORE_TYPE and
        a14.WEEK_ID = pa15.WEEK_ID)
        join    LOOKUP_TYPE     a16
          on    (a13.STORE_TYPE = a16.STORE_TYPE)
        join    LOOKUP_WEEK     a17
          on    (a14.WEEK_ID = a17.WEEK_ID)
where   (a13.STORE_TYPE in (1, 2)
 and a14.MONTH_ID = 199411)
group by        a13.STORE_TYPE,
        a14.WEEK_ID

I want the sed to be generic like insert a semicolon if a 'select' is preceded by newlines.

The output will be

insert into ZZMQ01
select  pa11.STATE_NBR  STATE_NBR,
        pa11.STORE_TYPE  STORE_TYPE,
        pa11.WEEK_ID  WEEK_ID
from    ZZMR00  pa11

;  # Notice the semi colon that I want to insert here. This doesn't exist in the input text.

select  a13.STORE_TYPE  STORE_TYPE,
        max(a16.TYPE_DESC)  TYPE_DESC,
        a14.WEEK_ID  WEEK_ID,
        max(a17.WEEK_DESC)  WEEK_DESC,
        sum(a11.CLE_SLS_QTY)  WJXBFS1
from    STORE_DIVISION  a11
        join    LOOKUP_STORE    a12
          on    (a11.STORE_NBR = a12.STORE_NBR)
        join    RELATE_STORE_TYPE       a13
          on    (a11.STORE_NBR = a13.STORE_NBR)
        join    LOOKUP_DAY      a14
          on    (a11.CUR_TRN_DT = a14.CUR_TRN_DT)
        join    ZZMQ01  pa15
          on    (a12.STATE_NBR = pa15.STATE_NBR and
        a13.STORE_TYPE = pa15.STORE_TYPE and
        a14.WEEK_ID = pa15.WEEK_ID)
        join    LOOKUP_TYPE     a16
          on    (a13.STORE_TYPE = a16.STORE_TYPE)
        join    LOOKUP_WEEK     a17
          on    (a14.WEEK_ID = a17.WEEK_ID)
where   (a13.STORE_TYPE in (1, 2)
 and a14.MONTH_ID = 199411)
group by        a13.STORE_TYPE,
        a14.WEEK_ID

Insert a line with a semicolon before each select statement

sed '/^select/i\
;' input

@ MadeInGermany: That wouldn't work to satisfaction:

insert into ZZMQ01
;
select  pa11.STATE_NBR  STATE_NBR,

Try

awk '
/select/ &&
 NL && !SC      {print ";"}
                {NL=(NF==0)
                }
! NL            {SC=gsub (/;[   ]*$/, "&")
                }
1
' file
1 Like

Another option:

awk '$1=="select"{print ";"}1' RS= ORS='\n\n'  file

@ Scrutinizer: that would insert a semicolon even if the former statement were terminated with one.

Hi RudiC,

I responded to this specification by the OP

So yes, but I interpreted it to be desired behavior..

You are absolutely right. I was reading too much into that spec. So, there may be two consecutive semicolons - no idea what the SQL interpreter will do with it. My SQL days are long gone...

You are right too, it may be better to cater for such a case even if it is not part of the specification.

It could be mitigated in a way similar to what is built into your suggestion:

awk '$1=="select"{if(p)print ";"} {p=$NF!~/;$/}1' RS= ORS='\n\n' file

Thank you so much Rudic and Scrutinizer. Both your solutions worked for me. What Rudic said is right if the previous line terminated with 1 then it would replace that with semi colon. I hope that I don't have such a condition in the rest of my files where I want to do this search and replace.
My goal is that I want to separate each SQL statement in a file by a semi colon. In my SQL I don't have appropriate semicolons separation for all the SQL statements. My end goal is to run this SQL file via Teradata Bteq utility successfully.

This sed script prints the semicolon before the select if the preceding line is empty (there are two consecutive newlines)

sed '
/^select/{
x
/^$/i\
;
x
}
h
' file