awk or sed - Convert 2 lines to 1 line

guinch · April 27, 2012, 6:52pm

Hi,

Just trying to get to grips with sed and awk for some reporting for work and I need some assistance:

I have a file that lists policy names on the first line and then on the second line whether the policy is active or not.

Policy Name:       Policy1
Active:            yes
Policy Name:       Policy2
Active:            yes
Policy Name:       Policy3
Active:            no
Policy Name:       Policy4
Active:            yes
Policy Name:       Policy5
Active:            no

What I'm trying to get to is a list of policy names and whether or not they are active on the same line:

Policy1 yes
Policy2 yes
Policy3 no
Policy4 yes
Policy5 no

Would appreciate any pointers.

Thanks.

jim_mcnamara · April 27, 2012, 7:18pm

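awk 'NR%2    {keep=$NF; next}     # odd line: remember the policy name (last field)
     !(NR%2) {print keep, $NF}    # even line: print the name and the active flag
    ' inputfile > outputfile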

jgt · April 27, 2012, 7:43pm

Or, if you are having problems with awk, a plain shell loop:

while read -r a b c
do
   read -r d e
   echo "$c $e"
done <inputfile

drl · April 27, 2012, 8:44pm

Hi.

An alternative with sed, cut, paste:

#!/usr/bin/env bash

# @(#) s1	Demonstrate combining lines for a specific column (field).

pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C sed cut paste

FILE=${1-data1}
pl " Input data file $FILE:"
cat -A $FILE

pl " Results of sed, cut, paste:"
sed 's/   */\t/g' $FILE |
cut -f2 |
paste - -

pl " Same thing, compressed with \"Process Substitution\":"
cut -f2 <( sed 's/   */\t/g' $FILE ) |
paste - -

exit 0

producing:

% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
sed GNU sed version 4.1.5
cut (GNU coreutils) 6.10
paste (GNU coreutils) 6.10

-----
 Input data file data1:
Policy Name:       Policy1$
Active:            yes$
Policy Name:       Policy2$
Active:            yes$
Policy Name:       Policy3$
Active:            no$
Policy Name:       Policy4$
Active:            yes$
Policy Name:       Policy5$
Active:            no$

-----
 Results of sed, cut, paste:
Policy1	yes
Policy2	yes
Policy3	no
Policy4	yes
Policy5	no

-----
 Same thing, compressed with "Process Substitution":
Policy1	yes
Policy2	yes
Policy3	no
Policy4	yes
Policy5	no

The sed converts each run of 2 or more blanks to a TAB (the \t escape in the replacement is a GNU sed feature), the cut extracts column (field) 2, and the paste combines 2 lines into one.

If the whitespace in the results does not show up on your display, rest assured that the tokens are separated by a TAB. You can copy and paste them to see it.
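To see the paste step on its own, here is a minimal sketch. The function name pair_lines is made up for illustration, and the tr at the end is only there to turn the TAB into a visible comma:

```shell
# Hypothetical helper (not part of the script above): join every two
# input lines with a TAB, then replace the TAB with a visible comma.
pair_lines() {
    paste - - | tr '\t' ','
}

printf '%s\n' Policy1 yes Policy2 no | pair_lines
# Policy1,yes
# Policy2,no
```

Each "-" tells paste to take one line from standard input, so two of them consume the input in pairs.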

See man pages for details.

Best wishes ... cheers, drl

( edit 1: correct minor spelling errors )

Scrutinizer · April 28, 2012, 4:05am

awk '{ORS=(ORS==RS)?FS:RS; print $NF}' infile

elixir_sinari · April 28, 2012, 4:51am

Using cat, cut, tr and paste:

cat filename|cut -d':' -f2|tr -d ' '|paste -s -d" \n" -
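The part of this pipeline that does the joining is paste -s with a two-character delimiter list: it cycles through the list, so lines are joined alternately with a space and a newline. A small sketch of just that step, with a made-up function name and toy input:

```shell
# Hypothetical helper: join input lines, alternating between a space
# and a newline as the separator.
join_alternating() {
    paste -s -d ' \n' -
}

printf '%s\n' Policy1 yes Policy2 no | join_alternating
# Policy1 yes
# Policy2 no
```

Note that the tr -d ' ' in the one-liner removes every space, so it assumes the policy names themselves contain no spaces.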

Klashxx · April 28, 2012, 6:20am

A perl one-liner:

perl -ne '/Policy Name:\s*(\S+)/;$pol=$1;print $pol." ".$1."\n" if /Active:\s*(\S+)/' infile
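The perl one-liner keys on the line labels rather than on line position, so it does not depend on the lines strictly alternating. The same idea can be sketched in awk. This is only an illustration: the function name policy_status is made up, and it assumes policy names contain no spaces and that lines do not end in a carriage return:

```shell
# Hypothetical helper: match on the labels instead of counting lines.
policy_status() {
    awk '
        /^Policy Name:/ { name = $NF }        # remember the most recent policy name
        /^Active:/      { print name, $NF }   # print it together with the active flag
    ' "$@"
}

# Blank lines and unrelated lines between the pairs are simply ignored.
printf '%s\n' 'Policy Name:       Policy1' '' 'Active:            yes' \
              'Other:             x' \
              'Policy Name:       Policy2' 'Active:            no' | policy_status
# Policy1 yes
# Policy2 no
```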

guinch · April 28, 2012, 6:32am

Thanks to everyone for their help.

The perl solution is the only one I can get to work with my actual data. I think it's more to do with the format of the data I have and not your code, as I can get most of them working with the sample data I provided. I've spent hours trying to figure out what it is about the real data that the code doesn't like.

One thing that would be really helpful, if anyone has the time, is a brief breakdown of what the code is doing so I can learn a bit more about it. Especially the awk variable stuff like NR, FS, ORS etc. I spent a while googling and found loads of examples but not much explaining them. The one that really has me is in Jim's code: there is "NR%2". What is the "%" doing?

Again thanks for your help.

Scrutinizer · April 28, 2012, 6:40am

Sure:

awk '
{
  # ORS is the output record separator, RS the input record separator ("\n")
  # and FS the field separator (" "). If ORS currently equals RS, set it to FS,
  # otherwise set it back to RS, so it alternates between space and newline.
  ORS=(ORS==RS)?FS:RS
  # Print the last field of the line; ORS decides whether a space or a
  # newline gets printed after it.
  print $NF
}' infile
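On the NR%2 question: NR is the number of the current input line (record), and % is the remainder (modulo) operator, so NR%2 is 1 on odd lines and 0 on even lines. A small sketch that prints both values next to the last field; the function name show_nr is made up and the input is taken from the sample data:

```shell
# Hypothetical helper: show the line number, its remainder modulo 2,
# and the last field of each line.
show_nr() {
    awk '{ print NR, NR % 2, $NF }' "$@"
}

printf '%s\n' 'Policy Name:       Policy1' 'Active:            yes' \
              'Policy Name:       Policy2' 'Active:            no' | show_nr
# 1 1 Policy1
# 2 0 yes
# 3 1 Policy2
# 4 0 no
```

Used as a pattern, NR%2 is true on odd lines only. In awk the ! operator binds tighter than %, so a test for even lines needs parentheses: !(NR%2), or NR%2==0.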