Add semicolon to the end of each line when there is a dash on the next line

cokedude · July 1, 2024, 3:52pm

I have a lot of lines like this. I would like to add a semicolon to the end of each line when the next line starts with a dash.

select * from a WHERE b = 123AND c IN (7, 8)
--------------------------------------------------------------------------------
select * from dba_objects
where object_type = 'VIEW'
order by created desc, last_ddl_time desc
--------------------------------------------------------------------------------

Like this.

select * from a WHERE b = 123 AND c IN (7, 8);
--------------------------------------------------------------------------------
select * from dba_objects
where object_type = 'VIEW'
order by created desc, last_ddl_time desc;
--------------------------------------------------------------------------------

vgersh99 · July 1, 2024, 4:27pm

awk '/^--/{prev = prev ";"} {print prev} {prev = $0} END {print}' myInput.txt

select * from a WHERE b = 123AND c IN (7, 8);
--------------------------------------------------------------------------------
select * from dba_objects
where object_type = 'VIEW'
order by created desc, last_ddl_time desc;
--------------------------------------------------------------------------------

cokedude · July 1, 2024, 6:55pm

That worked perfectly :). Thank you.

Is awk less picky about hyphens than sed? While trying to find a solution I found this. I figured it would work after I made the adjustments for the hyphen.

First I tried this.

sed 'N;s/\n - /;&/;P;D' file
sed 'N;s/\n -- /;&/;P;D' file

Then remembered I needed a backslash.

sed 'N;s/\n \-- /;&/;P;D' file
sed 'N;s/\n \- /;&/;P;D' file

Then I tried hex, thinking I was doing something wrong with the backslashes.

sed 'N;s/\n \x2D /;&/;P;D' file

Then remembered I had extra spaces.

sed 'N;s/\n\x2D/;&/;P;D' file

Then I tried again without hex. This finally worked. Is there a reason why it didn't work with hex?

sed 'N;s/\n\-/;&/;P;D' file

MadeInGermany · July 1, 2024, 7:00pm

IMHO $0 and NF may be undefined in the END block according to POSIX. A safe replacement for END {print} is
END {print prev}
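
For reference, a minimal sketch of the awk one-liner with that END change folded in. The function name add_semicolons and the NR > 1 guard are assumptions added for illustration (the guard only avoids printing the still-empty prev on the first line); they are not part of either poster's command.

```shell
# add_semicolons is a hypothetical wrapper name used only for this sketch.
add_semicolons() {
  awk '
    # From the second line on: if the current line starts with "--",
    # append ";" to the buffered previous line, then print that line.
    NR > 1 {
      if (/^--/) prev = prev ";"
      print prev
    }
    # Buffer the current line for the next cycle.
    { prev = $0 }
    # Print the buffered last line without relying on $0 inside END.
    END { if (NR > 0) print prev }
  ' "$@"
}

# Sample run on made-up input:
printf 'select 1\n--\nselect 2\nfrom t\n--\n' | add_semicolons
```

It can also be called with a file name, for example add_semicolons myInput.txt.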

A sed version that adds the semicolon only if there isn't one yet:

sed '$!N; /[^;]\n *---/s/\n/;&/; P; D' myInput.txt

And (seeing the previous post) allows leading space.

A hyphen is not special in RE. Do not escape it.

Hex escapes such as \x2D are not part of POSIX sed. GNU sed accepts them as an extension, but other sed implementations may not.

cokedude · July 1, 2024, 7:31pm

I do not understand what you are saying should be changed in vgersh99's awk version.

Can you explain the difference between your sed version and my sed version please?

Sed ran both of these with no issues. Why didn't it complain I did something wrong?

sed 'N;s/\n-/;&/;P;D' file
sed 'N;s/\n\-/;&/;P;D' file

vgersh99 · July 1, 2024, 7:58pm

@cokedude, what @MadeInGermany is saying is that the $0 and NF variables may not be defined in the END block of awk according to the POSIX standard. Their use in the END block is therefore not defined as far as POSIX is concerned.
Hence an alternative implementation may be needed, which @MadeInGermany has provided.
Thanks @MadeInGermany !

MadeInGermany · July 1, 2024, 8:05pm

Your version does not allow a leading space and adds a semicolon even if there is already one.
Further, it does not work correctly in a Unix sed: a last dashed line won't be printed unless you use the $!N work-around. If it only needs to work in GNU sed, then you can use N.

Yes, \- is treated as a simple -
Like \a is treated as a simple a
But \b and \n have a special meaning, unlike simple b and n
What is not special today, can become special in a newer sed version. So keep simple things simple!
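
A small sketch of the unescaped form, to show that a plain hyphen in the pattern is enough. The helper name mark_before_dash and the sample input are made up for illustration; the sed script is the poster's own command with N replaced by $!N.

```shell
# mark_before_dash is a hypothetical helper name used only for this sketch.
mark_before_dash() {
  # Plain "-" in the regular expression: no backslash, no hex escape.
  sed '$!N; s/\n-/;&/; P; D' "$@"
}

# Sample run on made-up input; prints "select 1;" and then "---".
printf 'select 1\n---\n' | mark_before_dash
```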

cokedude · July 1, 2024, 10:29pm

I am using SunOS. I thought that was considered Unix?

: uname -a
SunOS ah5719006ub002 5.10 Generic_150400-64 sun4u sparc SUNW,SPARC-Enterprise

What does $!N do?

Can you explain what everything does please?

MadeInGermany · July 2, 2024, 6:03am

N command: append the next line to the input buffer with a \n (newline) in between.
$ address: the last line
! NOT
$!N if not the last line then do the N command.

/[^;]\n *---/ address: if the input buffer matches the RE (a character that is not a semicolon, followed by a newline, followed by any number of space characters, followed by three dashes), then do the
s/// substitution command.
s/\n/;&/ substitute a \n (newline) by a semicolon and the whole match (the newline).

P command: print up to a \n

D command: delete up to a \n, start next cycle (that will add a new line to the input buffer)
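
The same script can be laid out one command per line with comments, which may make the walkthrough above easier to follow. This is only a sketch: the function name add_semicolons_sed and the sample input are assumptions for illustration, and the commands themselves are unchanged from the one-liner.

```shell
# add_semicolons_sed is a hypothetical wrapper name used only for this sketch.
add_semicolons_sed() {
  sed '
    # Unless this is the last line, append the next input line
    # to the buffer, separated by a newline.
    $!N
    # If a non-semicolon character sits right before the newline and the
    # next line is optional spaces plus three dashes, put ";" before the newline.
    /[^;]\n *---/s/\n/;&/
    # Print the buffer up to the first newline.
    P
    # Delete up to the first newline and start the next cycle
    # with whatever remains in the buffer.
    D
  ' "$@"
}

# Sample run on made-up input: the first statement gains a semicolon,
# the second already has one and is left alone.
printf 'select 1\n--------\nselect 2;\n  ---\n' | add_semicolons_sed
```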

cokedude · July 2, 2024, 6:33pm

Thank you :).