Joining multiple lines that are backslashed

I have the following input:

1 2 3 \
4 5

1 2 3 4 5 6 7 \
8 \
9 10

And I want to end up with the following:

1 2 3 4 5
1 2 3 4 5 6 7 8 9 10

In other words, how can I join multiple records/lines (2 or more) where the backslash character has been used to extend across multiple lines, using standard UNIX tools such as sed, awk, grep, sh, etc.?

The actual application in this instance is joining backslashed multiple lines from the sudoers file.

  • CDM
perl -e '
while (<>) {
    chomp;
    if (s/\\$//) {        # line ends in a backslash: drop it, keep joining
        print $_;
    } else {
        print "$_\n";     # last piece of the record: end the line
    }
}' abc.txt

HTH,
PL

Using awk:

$ awk '
{
    # keep appending lines while the current record ends in a backslash
    while ($0 ~ /\\$/ && (getline record) > 0) {
        $0 = $0 record
    }
    print $0
}
' input.txt  | awk '{gsub(/\\/,"")};1'

awk '{if ($0~/\\$/) {gsub(/\\/,"") ;printf "%s", $0} else {print $0}}' abc.txt

Technically there should be a space where the join took place, so IMO the \ and \n should be replaced with a space instead of just deleted:

sed ':a;N;s/\\\n/ /;ba' infile

This seems to work fine (although only with gsed and not the sed that comes with Solaris 10).
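
For hosts where only a non-GNU sed is available, the same idea (replace the backslash and the line break with one space) can be sketched in awk. This is a minimal sketch, not something posted in the thread: join_backslashed is a hypothetical helper name, and it also trims spaces and tabs around the join.

```shell
# join_backslashed: hypothetical helper, reads stdin and writes joined lines.
join_backslashed() {
    awk '{
        if (cont) sub(/^[ \t]+/, "")        # drop indentation of a continuation line
        cont = sub(/[ \t]*\\$/, "")         # 1 if this line ended in a backslash
        printf "%s%s", $0, (cont ? " " : "\n")
    }'
}

printf '1 2 3 \\\n4 5\n' | join_backslashed    # prints: 1 2 3 4 5
```

Usage would be along the lines of join_backslashed < infile.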

Some of the lines that follow a backslash contain leading spaces/tabs, and the backslashes are also preceded by a space. This is what I've found to work, although it doesn't look particularly pretty:

# sed 's/^[ 	]//g' /etc/sudoers |   # space & tab between brackets
gsed ':a;N;s/\\\n/ /;ba' |
sed 's/,[ 	]/,/g'                  # space & tab between brackets

Can I get all of this into a single sed statement?

Also, could you explain what's happening in the gsed statement you pointed out?

  • CDM

This will fare better on Solaris I think -

sed -e :a -e '/\\$/N; s/\\\n//; ta'

a\
b
becomes ab - no space.
a \
b
becomes a b - one space

This should get rid of spaces around the backslash:

sed ':a;N;s/[ \t]*\\\n[ \t]*/ /;ba' infile

It creates a label a ( :a ).
It then appends the next input line to the pattern space ( N ).
After that, any spaces or tabs before and after a \ followed by a linefeed (\n) are replaced by a single space.
Then it branches ( b ) back to label a.
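
The loop described above can also be written with one -e per step, which avoids relying on semicolons after a label (not every sed accepts that). This is a sketch rather than the exact command from the thread: it uses a conditional N and t in place of N and b, so the next line is only pulled in when the current one ends in a backslash. join_continued is a hypothetical helper name.

```shell
join_continued() {
    # :a                 define label a
    # /\\$/N             if the pattern space ends in a backslash, append the next line
    # s/...\\\n.../ /    replace blanks + backslash + newline + blanks with one space
    # ta                 if that substitution happened, go back to label a
    sed -e ':a' \
        -e '/\\$/N' \
        -e 's/[[:blank:]]*\\\n[[:blank:]]*/ /' \
        -e 'ta'
}

printf 'a \\\n   b \\\nc\nd\n' | join_continued
# prints:
# a b c
# d
```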

This works great. Thanks :)

OK, you've given me pretty much exactly what I asked for but this has only highlighted the fact that I mis-stated exactly what I wanted. Ultimately, I'm trying to generate a list of hosts that are represented in the sudoers file. This sed statement (which actually only works if I use gsed on my Solaris 10 host) now leaves me with something like this:

Host_Alias ALIAS1=host1,host2,host3

Host_Alias ALIAS2=host1,host3,host4,host5
Host_Alias ALIAS3=host1,host5,host6

etc.

OK, I can remove blank lines with grep or awk, I can chop off everything up to the = character with another sed statement, and I might use tr, for example, to replace all the , characters with a newline character. I'm sure this would all get me what I want.

But ... could it ALL be done in a single command? List every host that's represented in the sudoers file?

  • CDM

---------- Post updated at 11:22 AM ---------- Previous update was at 11:11 AM ----------

This is the shortest pipeline I can come up with:

# gsed ':a;N;s/[ \t]*\\\n[ \t]*//;ba' /etc/sudoers | \
awk -F= '/^Host_Alias/ {print $NF}' | \
tr ',' '\n' | \
sort | \
uniq

Alternatively:

sed ':a;N;s/[ \t]*\\\n[ \t]*//;ba' /etc/sudoers|sed -n 's/^Host_Alias.*=//;s/,/\n/gp'|sort -u

This doesn't filter out all the non-Host_Alias entries in the file (sorry, that's me not stating the problem correctly again).

What's the significance of the -n and the p after the g?

  • CDM

How about this?

sed ':a;N;s/[ \t]*\\\n[ \t]*//;ba' /etc/sudoers|sed -n '/^Host_Alias/{s/.*=//;s/,/\n/gp}'|sort -u

-n means do not print pattern space automatically
p means print the pattern space
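
A small sketch of how the two interact, using a made-up two-line input (the sample function is hypothetical, purely for illustration):

```shell
sample() { printf 'a,b\nc\n'; }   # hypothetical input: one line with a comma, one without

sample | sed 's/,/ /g'       # default: both lines are printed, changed or not
sample | sed -n 's/,/ /g'    # -n alone: nothing is printed
sample | sed -n 's/,/ /gp'   # -n plus p flag: prints only "a b"
```

With -n, the p flag on s prints a line only when that substitution actually replaced something, so the line "c" never appears in the last case.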

Hmmm. OK, this works fine (but, again, only if I use gsed) but it does not capture entries that have only a single host listed like this:

Host_Alias ALIAA8=host12

I suspect because of the assumption of a comma being present on the line?

  • CDM
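
One way to sidestep the comma assumption is to print every Host_Alias line unconditionally and leave the splitting to tr. A minimal sketch, assuming the continuation lines have already been joined and that there is no whitespace around the commas; list_hosts is a hypothetical helper name:

```shell
list_hosts() {
    # Keep only Host_Alias lines, strip everything up to the last "=",
    # print the remainder whether or not it contains a comma,
    # then put one host per line and de-duplicate.
    sed -n '/^Host_Alias/{
        s/.*=//
        p
    }' | tr ',' '\n' | sort -u
}

printf 'Host_Alias A1=host1,host2\nHost_Alias A8=host12\nUser_Alias U=bob\n' | list_hosts
# prints:
# host1
# host12
# host2
```

Using tr for the split also avoids \n in the replacement part of s, which is a GNU sed extension.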

I can do this:

sed ':a;N;s/[ \t]*\\\n[ \t]*//;ba' /etc/sudoers |
sed '/^Host_Alias/!d;{s/.*=//;s/,/\n/g;s/$/\n/p}'|sort -u

But that does not make it any easier, and it prints a blank line.

Or this:

awk -F '[, \t \\\]*' '/Host_Alias/,!/\\/{sub(/^Host_Alias.*=/,"");print}' /etc/sudoers |
grep -o '\w\+' |sort -u

But I don't see how this is an improvement really on what you created yourself (yawn).
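
For comparison, the whole job (join the continuation lines, keep the Host_Alias entries, split on commas, drop duplicates) can probably be sketched as one awk program. This is an illustration under stated assumptions, not a tested sudoers parser: list_sudoers_hosts is a hypothetical name, it assumes one alias definition per Host_Alias line (no ":"-separated definitions), and hosts come out in first-seen order rather than sorted.

```shell
list_sudoers_hosts() {
    awk '
    {
        if (cont) sub(/^[ \t]+/, "")           # drop indentation of a continuation line
        buf = buf $0                           # accumulate the logical line
        cont = sub(/[ \t]*\\$/, "", buf)       # 1 if it still ends in a backslash
        if (cont) next                         # wait for the rest of the record

        line = buf; buf = ""
        if (line !~ /^Host_Alias[ \t]/) next   # ignore everything else
        sub(/^[^=]*=/, "", line)               # keep only the host list
        n = split(line, hosts, /[ \t]*,[ \t]*/)
        for (i = 1; i <= n; i++) {
            h = hosts[i]
            gsub(/^[ \t]+|[ \t]+$/, "", h)     # trim stray blanks
            if (h != "" && !seen[h]++) print h # print each host once
        }
    }'
}
```

It would be run as list_sudoers_hosts < /etc/sudoers; piping the result through sort gives the same ordering as the sort -u pipelines above.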