Merging two lines into one (awk)

Hi,

I am trying to use awk to merge log entries that have been wrapped across two lines.

INITIAL OUTPUT

2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
ce Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
ce Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

EXPECTED OUTPUT

2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

Attempts:

$ awk 'length==80{ORS=$0?"":RS}1' unwrap.txt
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/45 is down (Interface removed)2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/46 is down (Interface removed)2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

$ awk 'length==80{ORS=/$0/?"":RS}1' unwrap.txt
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
ce Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
ce Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

Can I please get some assistance around why this is not working?

Thanks.

Hello sand1234,

Could you please try the following?

awk '{printf("%s%s",$0~/^[0-9]+/ && FNR>1?ORS:FNR==1?"":OFS,$0)}'   Input_file

Output will be as follows.

2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa ce Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa ce Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

Thanks,
R. Singh

Hi RavinderSingh13,

The code works, although the last line is not terminated with '\n'.

root@localhost# awk '{printf("%s%s",$0~/^[0-9]+/ && FNR>1?ORS:FNR==1?"":OFS,$0)}' unwrap.txt
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time root@localhost#

Can you please explain why my code didn't work, and the logic from your code?

Thanks.

Hello sand1234,

When I ran the code on your samples it worked fine for me. If it is still not printing a newline at the end, change it to:

awk '{printf("%s%s",$0~/^[0-9]+/ && FNR>1?ORS:FNR==1?"":OFS,$0)} END{print ""}'   Input_file

Explanation of code:
$0~/^[0-9]+/ && FNR>1 If the line starts with a digit and is NOT the first line, print ORS (a newline) before it.
:FNR==1?"":OFS Otherwise, print nothing before the first line, or OFS (a space) before any other line.
,$0 Finally, print the line itself.
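If the space that OFS puts between the two halves is not wanted, one possible variation is to print an empty string before continuation lines instead. This is only a sketch run against a shortened copy of the sample from the first post. It assumes, as the original command does, that every complete record starts with a digit and no continuation line does.

```shell
# Shortened sample: one wrapped record (lines 2-3) between two complete ones.
out=$(awk '{
    # A line starting with a digit begins a new record: emit a newline first
    # (except before the very first line). Any other line is a continuation
    # and is appended directly, with no separator.
    printf("%s%s", ($0 ~ /^[0-9]/ && FNR > 1) ? ORS : "", $0)
}
END { print "" }   # terminate the last record with a newline
' <<'EOF'
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
ce Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
EOF
)
printf '%s\n' "$out"
```

Because the test looks at how a line starts rather than at its length, a complete record that happens to be exactly 80 characters long is left alone.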

Thanks,
R. Singh

Try also (your code simplified):

awk 'length==80 {ORS=""} 1; {ORS=RS}' file

or

awk '{ORS=length==80?"":RS} 1' file

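Your first attempt doesn't work as expected because ORS is never reset to RS: the assignment runs only for 80-character lines, where $0 is never empty, so the ternary always yields "".
Your second attempt never sets ORS to "" because /$0/ is a regular expression constant, not a reference to the record $0, and that pattern does not match any of these lines, so the ternary always yields RS.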
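To see the two conditions in isolation, something like the following can be run. It is a small sketch, not part of either attempt: it feeds one of the 80-character sample lines to each condition and prints which branch of the ternary is taken. The labels "empty ORS" and "RS" are just text chosen for this demonstration.

```shell
line='2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa'

# First attempt: $0 is a non-empty string, so it is always true
# and the ternary always picks the empty string.
first=$(printf '%s\n' "$line" | awk '{ print ($0 ? "empty ORS" : "RS") }')

# Second attempt: /$0/ is a regex constant tested against the line.
# It does not match, so the ternary always picks RS.
second=$(printf '%s\n' "$line" | awk '{ print (/$0/ ? "empty ORS" : "RS") }')

echo "first attempt picks:  $first"
echo "second attempt picks: $second"
```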

Hi RavinderSingh13,

Thanks for the explanation and solution.

However in this case we need to match on lines which have 80 character length.

I changed your solution to the one below.

However, I just realized that although the code works, the logic is incorrect. If len(line)==80, add newline, else add "". This in fact is the opposite of what we are trying to achieve!

$ awk '{printf("%s%s",$0~length==80?ORS:"",$0)} END {print ""}' unwrap2.txt

2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

Further to this point, the matching criterion does not select the 80-character lines we want, and instead matches everything.

$ awk '$0~length==80' unwrap2.txt
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

How can I modify the logic of my code to correctly match on 80 char lines and achieve the desired result?

Thanks.

--- Post updated at 12:42 PM ---

Hi RudiC,

Thanks for the explanation. Both of your solutions work well.

I've just updated the sample file. I would like the solution to cater for lines which are meant to be 80 characters long and do not need to be condensed.

Interestingly enough, the following seems to work, although the logic doesn't make much sense (as explained in my previous post). Perhaps you can shed some light on this?

awk '{printf("%s%s",$0~length==80?ORS:"",$0)} END {print ""}' unwrap2.txt

2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/46 is down (Interface removed)
2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

Thanks.

Hello sand1234,

Could you please try the following (not tested, though)?

awk '{printf("%s%s",length($0)==80?ORS:"",$0)} END {print ""}' unwrap2.txt
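As for why the earlier $0~length==80 condition matched nearly every line: as far as I can tell it comes down to operator precedence. In awk, == binds more tightly than ~, so the expression is read as $0 ~ (length==80). The comparison yields 1 or 0, and that digit is then used as the regular expression, so any line containing the digit matches. The sketch below uses two lines from the sample, neither of which is 80 characters long. The variable names are made up for the demonstration.

```shell
sample='2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
ce Ethernet1/45 is down (Interface removed)'

# Without parentheses: parsed as $0 ~ (length==80). Neither line is 80
# characters long, so the regex is "0", and only lines containing a 0 match.
implicit=$(printf '%s\n' "$sample" | awk '$0 ~ length==80')

# The same expression with the grouping written out.
explicit=$(printf '%s\n' "$sample" | awk '$0 ~ (length==80)')

# A plain length test: selects nothing here, as neither line is 80 long.
by_length=$(printf '%s\n' "$sample" | awk 'length($0)==80')

printf 'implicit:  %s\nexplicit:  %s\nby_length: %s\n' "$implicit" "$explicit" "$by_length"
```

The first line contains a 0 (in 2019) and is selected by both regex forms. The continuation line contains no 0 and is not, which would explain why those lines were missing from the earlier output.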

Thanks,
R. Singh

Hi RavinderSingh13,

Nice find, the following matches the correct lines.

$ awk 'length($0)==80' unwrap2.txt
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa

And the full statement works correctly.

awk '{printf("%s%s",length($0)==80?ORS:"",$0)} END {print ""}' unwrap2.txt
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/45 is down (Interface removed)2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
2019 Sep 28 10:47:24.699 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interface Ethernet1/46 is down (Interface removed)2019 Sep 28 10:47:24.702 hkaet9612 last message repeated 1 time

Can you please clarify the logic?

From what I understand, the following statement reads as follows:

length($0)==80?ORS:"",$0
  • If length of string = 80 chars, add ORS (newline), else add ""

In this case, it should read - if length of str=80, add "", else add newline.

Thanks.

Hello sand1234,

Here is the explanation of it:

As you can see, printf has two %s placeholders, which are filled as follows.

%s -----> length($0)==80?ORS:""
%s -----> $0

The first checks whether the current line is 80 characters long and, if so, prints a newline before it; otherwise it prints nothing.
The second simply prints $0. The first placeholder has already decided whether a newline precedes this line, so nothing else is needed here.
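Since the first %s is printed before the current line, a printf version that joins wrapped lines would probably need to look at the length of the previous line rather than the current one. The following is a rough sketch of that idea. The variable prev_len is a name invented for this example, and the input is a shortened copy of the sample.

```shell
out=$(awk '{
    # The separator printed here goes BEFORE the current line, so it has to
    # depend on the length of the previous line, not the current one.
    printf("%s%s", (NR > 1 && prev_len != 80) ? ORS : "", $0)
    prev_len = length($0)   # prev_len is a name made up for this sketch
}
END { print "" }   # terminate the last line with a newline
' <<'EOF'
2019 Sep 28 10:47:24.695 hkaet9612 last message repeated 1 time
2019 Sep 28 10:47:24.695 hkaet9612 %ETHPORT-5-IF_DOWN_INTERFACE_REMOVED: Interfa
ce Ethernet1/45 is down (Interface removed)
2019 Sep 28 10:47:24.699 hkaet9612 last message repeated 1 time
EOF
)
printf '%s\n' "$out"
```

Like any test based purely on length, this joins every 80-character line to whatever follows it, so it would also join a complete record that happens to be exactly 80 characters long.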

Thanks,
R. Singh

Hi RavinderSingh13,

That part is clear to me.

The part which is confusing is the logic after the ?

I believe it should follow RudiC's logic.

Your logic

awk '{printf("%s%s",length($0)==80?ORS:"",$0)} END {print ""}' unwrap2.txt

RudiC's logic

awk '{ORS=length==80?"":RS} 1' file
awk 'length==80 {ORS=""} 1; {ORS=RS}' file

Thanks.

Hello sand1234,

Here is the complete explanation of that code.

awk '                      ## Start of the awk program.
{
  ORS=length==80?"":RS     ## If the current line is 80 characters long, set ORS to the empty string; otherwise set it to RS (a newline).
                           ## ORS and RS both default to a newline, so this decides per line whether a newline follows it in the output.
}
1                          ## Always-true pattern: print the current line followed by ORS.
'  Input_file

Thanks,
R. Singh
