Combining lines in one line

Hi,
below is a snippet of the input file.
I want every line that follows a line starting with 1 to be joined onto that line.
For example, if a line starting with 1 is followed by two lines starting with 2, all three should be combined into one line.
Input file content:

1,8091012,BATCH_1430903_01,21,T,2,808738,,,,21121:87:01,
2,A,79020:04:25,IC,,
2,D,89014:03:40,u,7239021674,
1,5390021,BATCH_1330903_02,21,T,2,306738,,,,21121:15:01,
2,A,6190:05:09,IC,,
1,9650123,BATCH_1110903_02,21,T,3,904428,,,,21121:34:11,
2,A,9270:00:22,IC,,
1,9992013,BATCH_1000903_06,21,T,7,50128,,,,21121:99:15,
2,A,7790:05:24,IC,,

Required O/P

1,8091012,BATCH_1430903_01,21,T,2,808738,,,,21121:87:01,|2,A,79020:04:25,IC,,|2,D,89014:03:40,u,7239021674,
1,5390021,BATCH_1330903_02,21,T,2,306738,,,,21121:15:01,|2,A,6190:05:09,IC,,
1,9650123,BATCH_1110903_02,21,T,3,904428,,,,21121:34:11,|2,A,9270:00:22,IC,,
1,9992013,BATCH_1000903_06,21,T,7,50128,,,,21121:99:15,|2,A,7790:05:24,IC,,

Please show some learning curve and present an attempt of your own - with 170 posts in here some basic understanding could be expected, no?

Hi RudiC,

earlier I tried

paste -sd "|" 

but this combined the whole file into a single line, as below.

 
 1,8091012,BATCH_1430903_01,21,T,2,808738,,,,21121:87:01,|2,A,79020:04:25,IC,,|2,D,89014:03:40,u,7239021674,|1,5390021,BATCH_1330903_02,21,T,2,306738,,,,21121:15:01,|2,A,6190:05:09,IC,,|1,9650123,BATCH_1110903_02,21,T,3,904428,,,,21121:34:11,|2,A,9270:00:22,IC,,|1,9992013,BATCH_1000903_06,21,T,7,50128,,,,21121:99:15,|2,A,7790:05:24,IC,,
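
A small sketch of why this happens, using made-up three-line sample data rather than the original file: the -s option serialises all input lines into one, and -d only chooses the character that replaces each newline. paste has no notion of "start a new output line when a record begins with 1".

```shell
# Hypothetical sample data (not the original file): two records.
sample=$(printf '1,x\n2,y\n1,z')

# -s joins ALL lines into a single one; -d '|' only selects the glue character.
all_in_one=$(printf '%s\n' "$sample" | paste -sd '|' -)
printf '%s\n' "$all_in_one"
```

Every newline is replaced, including the one in front of the second record.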
 

One possible solution:

sed -n '/^1/ {1h; 1! {x; s/\n//g; p; }}; /^2/ {s/^/|/; H; }; $ {x; s/\n//g; p; } ' file
1,8091012,BATCH_1430903_01,21,T,2,808738,,,,21121:87:01,|2,A,79020:04:25,IC,,|2,D,89014:03:40,u,7239021674,
1,5390021,BATCH_1330903_02,21,T,2,306738,,,,21121:15:01,|2,A,6190:05:09,IC,,
1,9650123,BATCH_1110903_02,21,T,3,904428,,,,21121:34:11,|2,A,9270:00:22,IC,,
1,9992013,BATCH_1000903_06,21,T,7,50128,,,,21121:99:15,|2,A,7790:05:24,IC,,

Hello scriptor,

Could you please try the following and let me know if it helps.

awk '{printf("%s%s",$0~/^1/ && FNR>1?ORS:"",$0)} END{print ""}' Input_file

Thanks,
R. Singh

Hi Ravinder,

yes, it works.
Can you please explain this command to me?

Hello scriptor,

The following explanation may help.

 printf("%s%s",$0~/^1/ && FNR>1?ORS:"",$0)

 The format %s%s tells printf that two strings will be passed to it.
 The first string is chosen by a condition: $0~/^1/ is true if the line starts with 1, and FNR>1 is true if its line number is greater than 1.
 If both are true, the expression after ? is used (ORS, the output record separator, a newline by default); if not, the expression after : is used (the empty string).
 The second string is simply $0, the current line.
 END{print ""} prints the newline that terminates the last output line.
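
The same logic can be spelled out in a longer form. This is only a sketch: the function name join_awk and the sample data are made up for illustration. In awk, ~ is the regex match operator, so $0 ~ /^1/ asks whether the whole current line ($0) matches the regex ^1.

```shell
# join_awk is a hypothetical helper name; it reads records from standard input.
join_awk() {
  awk '{
    # $0 ~ /^1/ : true if the current line matches the regex ^1
    # cond ? x : y : yields x if cond is true, otherwise y
    sep = ($0 ~ /^1/ && FNR > 1) ? ORS : ""
    printf("%s%s", sep, $0)
  }
  END { print "" }'
}

# Made-up sample data: two records.
result=$(printf '1,a,\n2,b,\n2,c,\n1,d,\n2,e,\n' | join_awk)
printf '%s\n' "$result"
```

A bare /^1/ in awk is shorthand for $0 ~ /^1/, which is why both spellings appear in this thread.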
 

Thanks,
R. Singh

This minor extension of Ravindersingh13's proposal also prints the requested | separator:

awk '{printf("%s%s%s",/^2/?"|":"", /^1/ && FNR>1?ORS:"",$0)} END{print ""}' file

EDIT: or even

awk '{printf("%s%s",/^2/?"|":/^1/ && FNR>1?ORS:"",$0)} END{print ""}' file

EDIT: or even

awk '{printf("%s%s",/^1/?DL:"|",$0); DL=ORS} END{print ""}' file

Hi RudiC,

your suggestion works

 
 sed -n '/^1/ {1h; 1! {x; s/\n//g; p; }}; /^2/ {s/^/|/; H; }; $ {x; s/\n//g; p; } ' file
 

Can you please explain how the parts below work?

/^1/ 

--> this will print lines starting with 1; this is the only part I understand

1h;
1!
{x; s/\n//g; p; }
/^2/ 
{s/^/|/; H; };
{x; s/\n//g; p;

However, I googled and found the following, but I am still scratching my head trying to understand it.

(h) function copies the contents of the pattern space into a holding area 
(g) function copies the contents of the holding area into the pattern space, 
destroying the previous contents of the pattern space
(H) function appends the contents of the pattern space to the contents of the holding area.
(x) function interchanges the contents of the pattern space and the holding area.
 

I would also be grateful if you could suggest how I should learn this, so that I too can build similar one-liners.


Hi Ravinder,

in your syntax

awk '{printf("%s%s",$0~/^1/ && FNR>1?ORS:"",$0)} END{print ""}'

what does the following do?

$0~

On the net I found that it

represents your home folder

but I guess that doesn't fit in this case.

Another one-liner with sed (all versions):

sed -e ':L' -e '$!N;/\n1,/{P;D;}' -e 's/\n/|/;tL' file

More readable as a multi-liner:

sed '
  :L
  $!N
  /\n1,/{
    P;D
  }
  s/\n/|/
  tL
' file

Thanks everyone.
Could you guys help me understand this?
I have been scratching my head since morning trying to understand RudiC's one-liner,
but without success.

Thank you.

Explaining sed's intricate operation in depth exceeds my language capabilities and probably the space provided here as well; on top of that, there are many texts on the topic on the internet, once you have finished reading sed's man and/or info pages as the principal sources of information.
In short, sed has a pattern space and a hold space; all commands operate on the former, while the latter can only be copied to, appended to, copied from, or exchanged with the pattern space. Commands can be restricted by addresses (or ranges of addresses), which can be regex matches (important: man regex!) like /pattern/ (/^1/ matches a "1" as the first character of a line) or line numbers (1 is the first line of the input stream). Don't mix up the two! The most powerful sed command is s(ubstitute): s/\n//g globally substitutes the regex pattern \n (an escape sequence interpreted as a <newline> char) with the empty string, effectively removing it.
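
To make the "don't mix up the two" point concrete, here is a small sketch on made-up sample data whose first line deliberately does not start with 1:

```shell
# Made-up sample data: the first line starts with 2, the others with 1.
input=$(printf '2,x\n1,y\n1,z')

# Line-number address: 1 selects the first input line, whatever it contains.
by_number=$(printf '%s\n' "$input" | sed -n '1p')

# Regex address: /^1/ selects every line whose first character is "1".
by_regex=$(printf '%s\n' "$input" | sed -n '/^1/p')

printf 'by number: %s\n' "$by_number"
printf 'by regex:\n%s\n' "$by_regex"
```

So 1h means "on input line number 1, run h", while /^1/ tests the content of the line.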

L1 - reading - exercise - reading - exercise - goto L1

Thanks a lot, RudiC, for the valuable suggestion.

However, I would be very thankful if you could explain your syntax in detail:

sed -n '/^1/ {1h; 1! {x; s/\n//g; p; }}; /^2/ {s/^/|/; H; }; $ {x; s/\n//g; p; } ' file

How about you try to explain it as far as you get, and I / we will jump in and fill in the gaps and / or correct misperceptions. It might be worthwhile to use a paper slip and sketch pattern and hold space and their contents?

As you found out, it uses the hold space with the commands h, H and x.
The hold space is a bit cumbersome because line 1 and the last line ($) need special treatment. (Indeed, my solution without the hold space does not need to handle these border cases.)
A bit odd is /^2/{ ... }, which selects lines that start with a 2; shouldn't it rather select every line that does not start with 1, i.e. /^1,/!{ ... }?

RudiC's idea (with hold space), optimized, as multi-liner with comments

# sed -n : no default print
sed -n '
# if line starts with 1, then
  /^1,/ {
# exchange with hold space
    x
# substitute embedded newlines with | and print if successful (in line 1 will not print)
    s/\n/|/gp
# jump to :L
    bL
  }
# else append to hold space
  H
  :L
# if last line then
  $ {
# exchange with hold space (or copy from hold space with g)
    x
# substitute embedded newlines with | and print
    s/\n/|/g; p
  }
' file
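
For anyone who would rather watch the hold space than sketch it on paper, one option is to add a trace to the script above. This is only a sketch on made-up sample data: the three commands x, l, x swap the spaces, list the former hold space with l (which shows an embedded newline as \n and marks the end with $), and swap back.

```shell
# Made-up sample data: two records. The trace lines end in "$".
trace=$(printf '1,a\n2,b\n2,c\n1,d\n2,e\n' | sed -n '
  /^1,/ {
    x
    s/\n/|/gp
    bL
  }
  H
  :L
# trace only: swap, list what was the hold space, swap back
  x
  l
  x
  $ {
    x
    s/\n/|/g; p
  }
')
printf '%s\n' "$trace"
```

In the output, lines ending in $ are snapshots of the hold space after each input line; the lines containing | are the joined records the script actually prints.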

And my idea (pattern space only), as multi-liner with comments

sed '
  :L
# append next line
# (Unix sed: N in the last line exits without default print, needs $b;N or $!N)
  $!N
# if the next line begins with 1, then
  /\n1,/ {
# print and delete the current line; D jumps to next cycle
    P;D
  }
# substitute embedded newline with |
  s/\n/|/
# if successful then jump to :L
  tL
# default print happens here
' file

Hi RudiC,

from your syntax

sed -n '/^1/ {1h; 1! {x; s/\n//g; p; }}; /^2/ {s/^/|/; H; }; $ {x; s/\n//g; p; } ' file

this is what I still do not understand:

{1h; 1!  ---->

I am not able to understand this part.
What do

1h

and

1!

mean here, and how do they work?
Below is what I do understand.

 
 /^2/ {s/^/|/; H; ----> this part selects lines starting with 2 and prepends a pipe "|" to them.
H puts the result in the holding area.
 
 
 s/\n//g ---> this part replaces the newline char with the empty string
 

Now my confusion is this: if the file is as below

 
 cat v
a\nb\nc\nd 
a b c d 
\n
xyx 
 xyz
 

then why does the sed command not replace the newline chars when I run

sed 's/\n//g' v

Maybe my question seems silly to you, but this is the only way I can clear my doubt, as I have no other means.


A \n is an embedded newline character (not the two consecutive characters \ and n).
sed loops over the lines of the input file, so the sed code has only one line at a time in its "pattern space".
A \n character only appears after an append command like N or G (or, in the hold space, H).
Normally an RE handles only one line, where ^ and $ mark the beginning and the end of the line.
For example, given the line

line1

the RE sees

^line1$

After an append there is a \n between the two lines, like this

line1\nline2

and the RE sees it as

^line1\nline2$
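
A short sketch of the difference, on made-up input rather than the file v from the question:

```shell
# One line per cycle: the pattern space never contains \n, so nothing is replaced.
plain=$(printf 'line1\nline2\n' | sed 's/\n//g')

# N appends the next input line, separated by an embedded newline that s can match.
merged=$(printf 'line1\nline2\n' | sed 'N;s/\n//')

# The two literal characters backslash and n are something else entirely;
# the regex needs an escaped backslash to match them.
literal=$(printf '%s\n' 'a\nb' | sed 's/\\n//g')

printf '%s\n' "$plain" "$merged" "$literal"
```

The first command leaves both lines untouched, the second joins them into line1line2, and the third turns the text a\nb into ab.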

Post#10 shows a very interesting approach by MadeInGermany, both in method and especially in script length. Here's a revised version of mine, stripped down to minimum length:

sed -n '/^1/bL; {s/^/|/;H;}; ${:L;x;s/\n//gp;}' file