Capturing the string and the string below it

bha148 · August 3, 2010, 5:11am

Hi,
I want to read the following input and produce the output shown below.

Input:

  CAR                DESIGN          COLOUR        SERVICE 
  MERZ              APPLE             RED              2 YEARS
                       ORANGE
                       GRAPE            
   
  VW                                      WHITE          0 YEAR
   
  AUDI               MANGO             BLUE              1 YEAR

output:

MERZ , APPLE , ORANGE ,  GRAPE , RED , 2 YEARS
VW , WHITE , 0 YEAR
AUDI , MANGO , BLUE , 1 YEAR

Kindly help me with this.
Thanks in advance.

Yogesh_Sawant · August 3, 2010, 5:38am

how about:

awk '{print $1 ", " $2 ", " $3 ", " $4 " " $5 }' file.txt | grep -v "CAR"

bha148 · August 3, 2010, 5:47am

Thanks for the reply. There is one problem with the script: APPLE, ORANGE and GRAPE should all appear on the first output line, but the script handles the input line by line. It should group the various designs under their car brand on a single line.

pravin27 · August 3, 2010, 6:59am

Hi,
try this,

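#!/usr/bin/perl
use strict;
use warnings;

open(my $fh, '<', '/dir/cardetails') or die "Cannot open /dir/cardetails: $!";

my @tail;    # colour and service of the current car, printed after its designs

while (<$fh>) {
    next if $. == 1;      # skip the header line
    next if /^\s*$/;      # skip blank lines
    s/^\s+|\s+$//g;       # trim surrounding whitespace, including the newline
    # Assumption: columns are separated by two or more spaces,
    # so a value such as "2 YEARS" stays in one field.
    my @f = split /\s{2,}/;
    if (@f >= 3) {
        # A new car starts here: finish the previous record first.
        print join(',', @tail), "\n" if @tail;
        @tail = splice(@f, -2);      # last two fields: colour, service
        print join(',', @f), ',';    # car and, if present, its first design
    }
    else {
        print "$_,";                 # continuation line: one more design
    }
}
print join(',', @tail), "\n" if @tail;
close($fh);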

Franklin52 · August 3, 2010, 7:27am

Or:

awk '/YEAR/{$1=$1;print}' OFS=", " file |sed 's/\(.*\),\(.*\)/\1\2/'

pravin27 · August 3, 2010, 7:42am

Hi Franklin,

Could you please explain the code ?

bha148 · August 3, 2010, 8:20am

Hi Franklin,
Thanks for the reply. Your code gives the output like this...

MERZ, APPLE, RED, 2 YEARS
VW, WHITE, 0 YEAR
AUDI, GREEN, BLUE, 1 YEAR

script is not capturing the 'orange' and 'grape' fields in the first line output...

Franklin52 · August 3, 2010, 8:55am

awk '/YEAR/{$1=$1;print}' OFS=", " file | sed 's/\(.*\),\(.*\)/\1\2/'

/YEAR/ -> select lines with "YEAR"
$1=$1 -> reset the field separators (remove double spaces)..
print -> ..and print the line..
OFS=", " -> .. with the new output fieldseparator
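To see what the awk part does on its own, here is a small demonstration. The sample lines are typed in by hand and their exact spacing is an assumption, and rebuild is just a hypothetical wrapper name:

```shell
# Assigning $1 to itself makes awk rebuild the line with OFS between
# the fields, so leading and repeated blanks disappear.
rebuild() { awk '/YEAR/{$1=$1;print}' OFS=", "; }

printf '%s\n' '  CAR     DESIGN   COLOUR   SERVICE' \
              '  MERZ    APPLE    RED      2 YEARS' \
              '          ORANGE' | rebuild
# prints: MERZ, APPLE, RED, 2, YEARS
```

Note the extra comma between "2" and "YEARS": awk splits on blanks, so the service column becomes two fields.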

sed 's/\(.*\),\(.*\)/\1\2/'

The sed command removes the last comma of the output of the awk command.

\(.*\),\(.*\) -> selects two patterns \(.*\) of the string: the 1st pattern is the line up to the last comma (greedy match) and the 2nd pattern is the rest of the line after the last comma.
\1\2 -> prints the 2 patterns one after the other, without the comma between them.
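The greedy match can be tried on a single line. This is only an illustration, with drop_last_comma as a made-up wrapper name:

```shell
# The first \(.*\) is greedy: it swallows everything up to the LAST comma,
# so only that final comma is dropped from the line.
drop_last_comma() { sed 's/\(.*\),\(.*\)/\1\2/'; }

echo 'MERZ, APPLE, RED, 2, YEARS' | drop_last_comma
# prints: MERZ, APPLE, RED, 2 YEARS
```

A line without any comma does not match the pattern and passes through unchanged.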

---------- Post updated at 14:55 ---------- Previous update was at 14:25 ----------

bha148:

Hi Franklin,
Thanks for the reply. Your code gives the output like this...
MERZ, APPLE, RED, 2 YEARS
VW, WHITE, 0 YEAR
AUDI, GREEN, BLUE, 1 YEAR
script is not capturing the 'orange' and 'grape' fields in the first line output...

Sorry, I misread the question... in that case my code won't work.

Did you try the other solutions above?

It is difficult to automate this with a badly structured file; there could be more pitfalls in your file...
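
If the file can be assumed to follow a few rules (a header on the first line, columns separated by at least two spaces, and continuation lines that carry only one extra design), the grouping could be sketched in awk roughly as below. The function name group_cars and the " , " separator are only illustrative choices, and this has not been checked against the real file:

```shell
# Sketch only. Assumptions: header on line 1, columns separated by 2+ spaces,
# continuation lines hold a single extra design for the car above them.
group_cars() {
    awk '
    BEGIN { sep = " , " }
    # Print the record collected so far: car, designs, colour, service.
    function emit() {
        if (car != "") print car designs sep tail
        car = ""; designs = ""; tail = ""
    }
    NR == 1    { next }                   # skip the header line
    /^[ \t]*$/ { next }                   # skip blank lines
    {
        gsub(/^[ \t]+|[ \t]+$/, "")       # trim the line
        n = split($0, f, /  +/)           # split on two or more spaces
        if (n >= 3) {                     # a new car starts here
            emit()
            car = f[1]
            for (i = 2; i <= n - 2; i++) designs = designs sep f[i]
            tail = f[n-1] sep f[n]        # colour and service go last
        } else {
            designs = designs sep $0      # extra design for the current car
        }
    }
    END { emit() }
    ' "$@"
}

group_cars <<'EOF'
  CAR                DESIGN          COLOUR        SERVICE
  MERZ              APPLE             RED              2 YEARS
                       ORANGE
                       GRAPE

  VW                                      WHITE          0 YEAR

  AUDI               MANGO             BLUE              1 YEAR
EOF
```

Splitting on two or more spaces keeps "2 YEARS" together as one field, and a row with only three fields (like VW) is treated as having no design. Any row that breaks these assumptions would need extra handling.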