Program to combine lines in a file based on the first character of each line

Hi,

I need to combine lines in a file based on the first character of each line: a line that starts with a number begins a record, and the lines that follow it should be appended to that record.

Please find the sample content of the file below:

_______________________
5, jaya, male, 4-5-90, single
smart
6, prakash, male, 5-4-84, married
fair
7, raghavi, female, 12-10-85, married
calm
talented
9, bhaskar, male, 29-12-92, single
studios
________________________

I want the output for this file to be:
_______________________

5, jaya, male, 4-5-90, single-smart
6, prakash, male, 5-4-84, married-fair
7, raghavi, female, 12-10-85, married-calm-talented
9, bhaskar, male, 29-12-92, single-studios

________________________

Can you please help me with a shell program that achieves this?

What have you tried so far? If you have some code we can help you get it working.


Hi Jim,

I'm new to shell scripting and not sure how to start. I tried a piece of code, but I'm sure that it's completely wrong.
_______________
#!/bin/ksh

for i in '/data/informatica/HQ/APSSR01/SrcFiles/tgt_detail_import.dat| grep -o ^.'
do

case $i in
[0-9]*) var= 1;;
if [$var= 1] then 'sed 'N;s/\n//''
fi
esac
done
________________
Please help me by creating a new program.

Regards,
Jaya

Hello jayaP,

Please use code tags, as the forum rules require, for any commands, inputs, or code in your posts.
The following may help you.

awk '($0 !~ /^[0-9]/){A=A?A OFS $0:$0;next} ($0 ~ /^[0-9]/ && A){print A;A=""} {A=$0} END{print A}' OFS="-"  Input_file

Output will be as follows.

5, jaya, male, 4-5-90, single-smart
6, prakash, male, 5-4-84, married-fair
7, raghavi, female, 12-10-85, married-calm-talented
9, bhaskar, male, 29-12-92, single-studios
 

On a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk .
EDIT: Adding a multi-line form of the solution here too.

awk '($0 !~ /^[0-9]/)           {
                                        A=A?A OFS $0:$0;
                                        next
                                }
     ($0 ~ /^[0-9]/ && A)       {
                                        print A;
                                        A=""
                                }
                                {
                                        A=$0
                                }
     END                        {
                                        print A
                                }
    ' OFS="-"   Input_file
 

Thanks,
R. Singh


Hi Ravinder,

Thanks for your solution. I'm testing it at my end. Can you please let me know if we can write the output of the awk program to a new file?

Regards,
Jaya

Hello Jaya,

Yes, you can redirect the output of the command to a new file as follows.

awk '($0 !~ /^[0-9]/){A=A?A OFS $0:$0;next} ($0 ~ /^[0-9]/ && A){print A;A=""} {A=$0} END{print A}' OFS="-"  Input_file > Output_file

The following is a multi-line form of the same solution.

awk '($0 !~ /^[0-9]/)           {
                                        A=A?A OFS $0:$0;
                                        next
                                }
     ($0 ~ /^[0-9]/ && A)       {
                                        print A;
                                        A=""
                                }
                                {
                                        A=$0
                                }
     END                        {
                                        print A
                                }
    ' OFS="-"   Input_file  > Output_file

In both of the commands above, the output is stored in a file named Output_file.
Hope this helps you.
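
If the goal is to replace the original file rather than keep a second copy, note that redirecting straight back onto the input (awk ... Input_file > Input_file) truncates the file before awk reads it. A minimal sketch of the usual workaround follows, wrapping the same awk command from this post. The function name merge_in_place is made up for illustration, and the sketch assumes mktemp is available on your system.

```shell
#!/bin/sh
# merge_in_place FILE: hypothetical helper, not part of the original posts.
# Writes the merged output to a temporary file first, then copies it back
# over FILE only if awk succeeded.
merge_in_place() {
    file=$1
    tmp=$(mktemp) || return 1          # assumption: mktemp exists on this system
    if awk '($0 !~ /^[0-9]/){A=A?A OFS $0:$0;next} ($0 ~ /^[0-9]/ && A){print A;A=""} {A=$0} END{print A}' OFS="-" "$file" > "$tmp"
    then
        # Copy the contents back so FILE keeps its permissions and owner.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

It would be called as merge_in_place Input_file. Keeping a backup copy of the file before the first run is a sensible precaution.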

Thanks,
R. Singh

How about

awk '{printf "%s%s", /^[0-9]+/?XS:"-", $0; XS = RS} END {printf RS}' file
5, jaya, male, 4-5-90, single-smart
6, prakash, male, 5-4-84, married-fair
7, raghavi, female, 12-10-85, married-calm-talented
9, bhaskar, male, 29-12-92, single-studios
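
For anyone who finds the one-liner above hard to read, here is a commented sketch of the same idea: print each line without a trailing newline, and decide what to print in front of it. The function name join_continuations and the variable started are made up for illustration. Unlike the one-liner, this version does not put a "-" in front of the very first line if that line happens not to start with a digit.

```shell
#!/bin/sh
# join_continuations: hypothetical helper name, not from the thread.
# Reads lines on stdin. A line starting with a digit begins a new record;
# any other line is appended to the current record with "-" in between.
join_continuations() {
    awk '
        /^[0-9]/ {                          # record start
            if (started) printf "\n"        # close the previous record first
            printf "%s", $0
            started = 1
            next
        }
        {                                   # continuation line
            printf "%s%s", (started ? "-" : ""), $0
            started = 1
        }
        END { if (started) printf "\n" }    # terminate the last record, if any
    '
}
```

It reads standard input, so it would be run as join_continuations < file.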

Hi Ravinder/Rudi,

What if my file content looks like this

5, jaya, male, 4-5-90, single
-smart
6, prakash, male, 5-4-84, married
-fair
7, raghavi, female, 12-10-85, married
calm
-talented
9, bhaskar, male, 29-12-92, single
-studios
___________

Please suggest and also can you please explain your code?

Regards,
Jaya

Hello Jaya,

Could you please try following and let me know if this helps you.

awk '($0 !~ /^[0-9]/){A=A?A OFS $0:$0;next} ($0 ~ /^[0-9]/ && A){print A;A=""} {A=$0} END{print A}' OFS=""  Input_file

Output will be as follows.

5, jaya, male, 4-5-90, single-smart
6, prakash, male, 5-4-84, married-fair
7, raghavi, female, 12-10-85, marriedcalm-talented
9, bhaskar, male, 29-12-92, single-studios
 

An explanation of the above code follows.

awk '($0 !~ /^[0-9]/){  ##### The line does NOT start with a digit, so it is a continuation line.
A=A?A OFS $0:$0;        ##### Append the line to A, separated by OFS; if A is still empty, A becomes the line itself.
next}                   ##### Skip the remaining rules and read the next line.
($0 ~ /^[0-9]/ && A){   ##### The line starts with a digit and A already holds a record.
print A;                ##### Print the completed record stored in A.
A=""}                   ##### Reset A to the empty string.
{A=$0}                  ##### Start a new record: store the current line in A.
END{                    ##### The END block runs after the last line has been read.
print A}                ##### Print the final record still held in A.
' OFS=""  Input_file    ##### Set OFS (output field separator) to the empty string and name the input file.
 

Thanks,
R. Singh

What if you explained what you then need/expect?

Hi Ravinder/Rudi,

Neither of the commands changes the file; both copy it as is.

5, jaya, male, 4-5-90, single
-smart
6, prakash, male, 5-4-84, married
-fair
7, raghavi, female, 12-10-85, married
calm
-talented
9, bhaskar, male, 29-12-92, single
-studios
5, jaya, male, 4-5-90, single-smart
6, prakash, male, 5-4-84, married-fair
7, raghavi, female, 12-10-85, marriedcalm-talented
9, bhaskar, male, 29-12-92, single-studios

I'm pasting my sample original file below

5,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
6,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
7,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
The operative account
10,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

It is OK if the commas at the end of the second line are carried over into the joined line.

---------- Post updated at 07:32 PM ---------- Previous update was at 07:16 PM ----------

Guys-

Ravinder and Rudi. You are simply AWESOME!! Hats OFF

The script you provided is working great. Thanks a lot.

Regards,
Jaya

---------- Post updated at 07:42 PM ---------- Previous update was at 07:32 PM ----------

Ravinder,

It is working fine with your script. One more requirement: while concatenating, I want to remove the run of commas at the end of the lines that follow a line starting with a number.

Sample input file

5,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
6,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
7,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
The operative account
10,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,Validate
Class [TGWFTRNBO]],,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

output should be like

5,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass[TGWFTRNBO]]
6,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass[TGWFTRNBO]]
7,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass[TGWFTRNBO]]The operative account
10,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]

Regards,
Jaya

Hello jayaP,

Please use code tags for the sample inputs in your posts as well.
Could you please try the following and let me know if it helps?

awk '($0 !~ /^[0-9]/){sub(/,+$/,X,A);A=A?A OFS $0:$0;next} ($0 ~ /^[0-9]/ && A){sub(/,+$/,X,A);print A;A=""}{A=$0}END{sub(/,+$/,X,A);print A}' OFS=""  Input_file

Output will be as follows.

5,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]
6,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]
7,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]The operative account
10,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]
 

Thanks,
R. Singh

Try also

awk '{sub (/,+$/,_); printf "%s%s", /^[0-9]+/?XS:"", $0; XS = RS} END {printf RS}' file
5,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]
6,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]
7,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]The operative account
10,1,32784506,6/3/2015,6/3/2015,9:15:22,6/3/2015,9:15:34,0,16,0,ValidateClass [TGWFTRNBO]]
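
Since the thread ended up with two variants (joining with "-" and joining with nothing, with or without trailing commas), here is a sketch that folds both into one function with the separator as an optional argument. The name merge_records and the variables sep, rec and have are made up for illustration. It assumes an awk that supports -v, so on Solaris use nawk or /usr/xpg4/bin/awk as noted earlier in the thread.

```shell
#!/bin/sh
# merge_records [SEP]: hypothetical helper, not from the thread.
# Reads stdin, strips trailing commas from every line, and appends each line
# that does not start with a digit to the record before it, joined by SEP
# (empty string when SEP is omitted).
merge_records() {
    awk -v sep="${1-}" '
        { sub(/,+$/, "") }              # drop the run of commas at end of line
        /^[0-9]/ {                      # a digit starts a new record
            if (have) print rec         # flush the previous record
            rec = $0; have = 1
            next
        }
        { rec = have ? rec sep $0 : $0; have = 1 }   # continuation line
        END { if (have) print rec }     # flush the last record, if any
    '
}
```

It would be run as merge_records < file for the comma-stripping case with no separator, or merge_records - < file to join with a hyphen.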