Filter table of different length

Dear Forum,

I have to filter (e.g. PF=0.8) a text file according to some measured and recorded values for different fields (sensor). The files can be large and the recorded data points (ID) could differ in length. I have worked out a solution but it is very messy and not flexible. Does anybody have a better, cleaner and more robust idea to solve this problem?

Input File:

ID_AB1\tsensor01(1.0),sensor02(0.6),sensor03(0.5),sensor04(0.45)
ID_AB2\tsensor01(1.0),sensor02(0.95),sensor03(0.90),sensor04(0.80)
ID_AC3\tsensor01(1.0),sensor02(1.0),sensor03(1.0),sensor04(1.0)
ID_AD1\tsensor01(0.9),sensor02(0.6)
ID_BA2

Output File needed:

ID_AB1;sensor01;low;low;low
ID_AB2;sensor01;sensor02;sensor03;low
ID_AC3;sensor01;sensor02;sensor03;sensor04
ID_AD1;sensor01;low;na;na
ID_BA2;na;na;na;na

I used to following commands but it does not works for records of shorter length.

sed 's/(/,/g' out.singnal |\
 sed 's/)//g' |\
 awk -F"\t|," -v PF=0.8 '{
   printf "%s;", $1
  if($3>=PF && $5>=PF && $7>=PF && $9>=PF)
          printf " %s; %s; %s; %s\n", $2,$4,$6,$8
  else if($3>=PF && $5>=PF && $7>=PF && $9<PF)
          printf " %s; %s; %s; %s\n", $2,$4,$6,"low"
  else if($3>=PF && $5>=PF && $7<PF && $9<PF)
          printf " %s; %s; %s; %s\n", $2,$4,"low","low"
  else if($3>=PF && $5<PF && $7<PF && $9<PF)
          printf " %s; %s; %s; %s\n", $2,"low","low","low"
  else if($3<PF && $5<PF && $7<PF && $9<PF)
          printf " %s; %s; %s; %s\n", "low","low","low","low"
 }'

I add a "length control" to it but to cover all cases it would be messy :eek:.

sed 's/(/,/g' out.singnal |\
 sed 's/)//g' |\
 awk -F"\t|," -v PF=0.8 '{
  printf "%s;", $1
  if(NF==9 && $3>=PF && $5>=PF && $7>=PF && $9>=PF)
          printf " %s; %s; %s; %s\n", $2,$4,$6,$8
  else if(NF==9 && $3>=PF && $5>=PF && $7>=PF && $9<PF)
          printf " %s; %s; %s; %s\n", $2,$4,$6,"low"
  else if(NF==9 && $3>=PF && $5>=PF && $7<PF && $9<PF)
          printf " %s; %s; %s; %s\n", $2,$4,"low","low"
  else if(NF==9 && $3>=PF && $5<PF && $7<PF && $9<PF)
          printf " %s; %s; %s; %s\n", $2,"low","low","low"
  else if(NF==9 && $3<PF && $5<PF && $7<PF && $9<PF)
          printf " %s; %s; %s; %s\n", "low","low","low","low"
  else if(NF==7 && $3>=PF && $5>=PF && $7>=PF)
          printf " %s; %s; %s; %s\n", $2,$4,$6,"na"
  else if(NF==7 && $3>=PF && $5>=PF && $7<PF)
          printf " %s; %s; %s; %s\n", $2,$4,"low","na"
  else if(NF==7 && $3>=PF && $5<PF && $7<PF)
          printf " %s; %s; %s; %s\n", $2,"low","low","na"
  else if(NF==7 && $3<PF && $5<PF && $7<PF)
          printf " %s; %s; %s; %s\n","low","low","low","na"
 }'

What output do you want?

Hi Corona688,

the output should be a text file

ID_AB1;sensor01;low;low;low
ID_AB2;sensor01;sensor02;sensor03;low
ID_AC3;sensor01;sensor02;sensor03;sensor04
ID_AD1;sensor01;low;na;na
ID_BA2;na;na;na;na

Thank for considering my problem

awk '
{lines[NR]=$0; if (NF > columns) columns=NF}
END {
   for (j=1; j<=NR; j++) {
      split(lines[j], sensors, FS);
      $0=sensors[1];
      for (i=2; i<=columns; i++) {
         split(sensors, fields, "[()]");
         $0=$0 OFS ((fields[2] <= PF) ? (length(sensors) ? "low" : "na") : fields[1]);
      }
      print $0;
   }
}
' FS="[,\t]" OFS=";" PF=0.80 input_file
2 Likes

Dear rdrtx1,

thank you for your help.