Filter and sort the file using awk

I have a file that I need to process to produce clean output.

Input file:

Device Symmetrix Name    : 000A4
Device Symmetrix Name    : 000A5
Device Symmetrix Name    : 000A6
Device Symmetrix Name    : 000A7
Device Symmetrix Name    : 000A8
Device Symmetrix Name    : 000A9
Device Symmetrix Name    : 000AA
Device Symmetrix Name    : 000AB
    Device Symmetrix Name                  : 000AB
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A1
    Remote Symmetrix ID                    : frame1
Device Symmetrix Name    : 000AC
    Device Symmetrix Name                  : 000AC
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A2
    Remote Symmetrix ID                    : frame1
Device Symmetrix Name    : 000AD
    Device Symmetrix Name                  : 000AD
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A3
    Remote Symmetrix ID                    : frame1

I need output as below.

First, I want to remove all the devices that don't have an R1 device.
Second, I need to join each device's lines into one line and remove the extra blank space.

The output should look like this:

Device Symmetrix Name : 000AB,RDF Type : R1,Remote Device Symmetrix Name : 000A1,Remote Symmetrix ID : frame1
Device Symmetrix Name : 000AC,RDF Type : R1,Remote Device Symmetrix Name : 000A2,Remote Symmetrix ID : frame1
Device Symmetrix Name : 000AD,RDF Type : R1,Remote Device Symmetrix Name : 000A3,Remote Symmetrix ID : frame1

What have you tried?

I have no idea how to get this output other than doing it manually.

Hello ranjancom2000,

I have a few questions to pose in response first:

  • What have you tried so far?
  • What output/errors do you get?
  • What OS and version are you using?
  • What are your preferred tools? (C, shell, perl, awk, etc.)
  • What logical process have you considered? (to help steer us to follow what you are trying to achieve)

Most importantly, what have you tried so far?

There are probably many ways to achieve most tasks, so giving us an idea of your style and thoughts will help us guide you to an answer most suitable to you so you can adjust it to suit your needs in future.

Just saying "I don't know" or similar doesn't help us to help you much. If you were to do this as a human, what steps would you follow to process the data?

We're all here to learn and getting the relevant information will help us all.

Thanks, in advance,
Robin

Thanks for the support. I found a way to filter. I'd like to know whether there is a simpler way.

 grep -i -v "    Device Symmetrix Name    :"|tr -d "[:blank:]"|paste -d, - - - -
  

Try also

awk '
!$1     {gsub (/[ \t]+/, " ")
         printf "%s,", $0
         TRS = ORS
        }
$1      {printf TRS
        }
END     {printf TRS
        }
' FS="[ 	]*" file

Hello ranjancom2000,

Could you please try the following and let me know if it helps you.

awk '/^Device Symmetrix Name/{next} /^ +Device Symmetrix Name/{flag=1;if(ac_flag){print ""};sub(/^ +/,"");sub(/ +:/,":");val=$0;ac_flag="";next} /RDF Type.*R1/ && flag{sub(/^ +/,"");sub(/ +:/,":");printf("%s %s ",val,$0);flag="";ac_flag=1;next} ac_flag{sub(/^ +/,"");sub(/ +:/,":");printf(",%s",$0)}END{print ""}'    Input_file

Output will be as follows.

Device Symmetrix Name: 000AB RDF Type: R1 ,Remote Device Symmetrix Name: 000A1,Remote Symmetrix ID: frame1
Device Symmetrix Name: 000AC RDF Type: R1 ,Remote Device Symmetrix Name: 000A2,Remote Symmetrix ID: frame1
Device Symmetrix Name: 000AD RDF Type: R1 ,Remote Device Symmetrix Name: 000A3,Remote Symmetrix ID: frame1

EDIT: Adding a non-one-liner form of the solution too.

awk '
/^Device Symmetrix Name/{
  next
}
/^ +Device Symmetrix Name/{
  flag=1;
  if(ac_flag){
    print ""
  };
  sub(/^ +/,"");
  sub(/ +:/,":");
  val=$0;
  ac_flag="";
  next
}
/RDF Type.*R1/ && flag{
  sub(/^ +/,"");
  sub(/ +:/,":");
  printf("%s %s ",val,$0);
  flag="";
  ac_flag=1;
  next
}
ac_flag{
  sub(/^ +/,"");
  sub(/ +:/,":");
  printf(",%s",$0)
}
END{
  print ""
}'    Input_file
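
Note that the output above puts a space rather than a comma between the first two fields. If the exact comma-separated layout from post #1 is wanted, a shorter variant along the same lines might look like the sketch below. It assumes that every detail block is indented and follows an unindented line, as in the sample. The file name sample.txt and the function name join_r1 are made up for illustration.

```shell
# Hypothetical cut-down sample: one device without details, two R1 devices.
cat > sample.txt <<'EOF'
Device Symmetrix Name    : 000AA
Device Symmetrix Name    : 000AB
    Device Symmetrix Name                  : 000AB
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A1
    Remote Symmetrix ID                    : frame1
Device Symmetrix Name    : 000AC
    Device Symmetrix Name                  : 000AC
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A2
    Remote Symmetrix ID                    : frame1
EOF

# join_r1 FILE: print one comma-separated line per indented block of type R1
join_r1() {
  awk '
    function emit() {             # print the collected block if it was an R1 block
      if (rec != "" && r1) print rec
      rec = ""; r1 = 0
    }
    /^[^ \t]/ { emit(); next }    # unindented line: closes the previous block, is dropped
    NF == 0   { next }            # skip empty lines
    {
      $1 = $1                     # rebuild $0 with single spaces, no leading blanks
      if ($0 ~ /^RDF Type : R1$/) r1 = 1
      rec = (rec == "" ? $0 : rec "," $0)
    }
    END { emit() }                # flush the last block
  ' "$1"
}

join_r1 sample.txt
```

Collecting the block first and printing it only once it is complete avoids the need for separate flags for the first, second and remaining lines.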
 

Thanks,
R. Singh


Thanks, both. I got this working by using this filter:

grep -i -v "    Device Symmetrix Name    :"|tr -d "[:blank:]"|paste -d, - - - -

Sure? When I run your command above, the output is nowhere near what you requested in post #1.

Hi RudiC,

Yes, you are right. I get different output, but it has all the details I need. Below is the output I got:

0005C RDFType:R1,RemoteDeviceSymmetrixName:00025,RemoteSymmetrixID: Frame1
0005D RDFType:R1,RemoteDeviceSymmetrixName:00026,RemoteSymmetrixID: Frame1
0005E RDFType:R1,RemoteDeviceSymmetrixName:00028,RemoteSymmetrixID: Frame1

But, the output you get is nothing like the output you expected. You can't arbitrarily join four lines together, out of context, and get the output you expect!

But if you're happy... :)
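
A small illustration of why position-based joining is fragile, using made-up line labels rather than the real data (join4 is just a hypothetical wrapper name): paste groups lines purely by count, so a single extra line ahead of the blocks shifts every group.

```shell
# join4: glue every four consecutive input lines together with commas
join4() { paste -d, - - - -; }

# Aligned input: two complete four-line blocks, joined as intended
printf '%s\n' a1 a2 a3 a4 b1 b2 b3 b4 | join4

# One stray line in front: every group now straddles two blocks
printf '%s\n' stray a1 a2 a3 a4 b1 b2 b3 b4 | join4
```

In the second call the first output line starts with the stray line and the last one is left incomplete, which is why the lines need to be selected by context (indentation, or the R1 line itself) before they are joined.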

Another one:

$ grep --no-group-separator -B1 -A2 "^[[:space:]].*R1$" file | paste - - - - | awk '$1=$1'
Device Symmetrix Name : 000AB RDF Type : R1 Remote Device Symmetrix Name : 000A1 Remote Symmetrix ID : frame1
Device Symmetrix Name : 000AC RDF Type : R1 Remote Device Symmetrix Name : 000A2 Remote Symmetrix ID : frame1
Device Symmetrix Name : 000AD RDF Type : R1 Remote Device Symmetrix Name : 000A3 Remote Symmetrix ID : frame1
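
If the comma-separated layout from post #1 is needed, the same idea can probably be rearranged so that the whitespace is squeezed first and paste inserts the commas. This is a sketch only: it relies on GNU grep for --no-group-separator, assumes each R1 line has exactly one line before it and two after it in its block, and uses a made-up file name (sample.txt) and function name (r1_csv).

```shell
# Hypothetical cut-down sample: one device without details, one R1 device.
cat > sample.txt <<'EOF'
Device Symmetrix Name    : 000AA
Device Symmetrix Name    : 000AB
    Device Symmetrix Name                  : 000AB
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A1
    Remote Symmetrix ID                    : frame1
EOF

# r1_csv FILE: select each R1 block by context, squeeze blanks, join with commas
r1_csv() {
  grep --no-group-separator -B1 -A2 '^[[:space:]].*R1$' "$1" |
    awk '{ $1 = $1; print }' |   # reassigning $1 rebuilds the line with single spaces
    paste -d, - - - -            # four cleaned lines per output line
}

r1_csv sample.txt
```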

The grep pipeline above is roughly three to five times faster than the pure awk solutions here, because grep does the filtering more efficiently before the results reach paste or awk. For example, with your input repeated 1,000,000 times:

This:

real	0m5.986s
user	0m4.988s
sys	0m0.818s

Respective awk solutions:

(Ravinder)
real	0m17.374s
user	0m16.626s
sys	0m0.547s

(RudiC)
real	0m28.287s
user	0m27.426s
sys	0m0.597s
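
For anyone who wants to repeat the comparison, one possible way to build a larger test file is sketched below. How the original test file was produced is not stated, so the helper repeat_file, the file names and the repeat count are all assumptions, and the count is kept small so the example finishes quickly.

```shell
# Small stand-in for the sample input (one R1 block).
cat > sample.txt <<'EOF'
Device Symmetrix Name    : 000AA
Device Symmetrix Name    : 000AB
    Device Symmetrix Name                  : 000AB
    RDF Type                               : R1
    Remote Device Symmetrix Name           : 000A1
    Remote Symmetrix ID                    : frame1
EOF

# repeat_file FILE N: write N back-to-back copies of FILE to stdout
repeat_file() {
  awk -v n="$2" '
    { line[NR] = $0 }             # remember every input line
    END {
      for (i = 1; i <= n; i++)    # replay the whole file n times
        for (j = 1; j <= NR; j++)
          print line[j]
    }
  ' "$1"
}

repeat_file sample.txt 1000 > big.txt   # raise the count for a longer run

# In bash, "time" is a keyword and covers a whole pipeline, for example:
#   time grep --no-group-separator -B1 -A2 '^[[:space:]].*R1$' big.txt | paste - - - - | awk '$1=$1' > /dev/null
```

Timings will vary with the machine, the awk implementation and the size of the file, so only the relative order is worth comparing.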