Filtering multiple variables from a single column

Hi,

I am currently filtering a file, "BUILD_TIMES", that has multiple columns of information in it. An example of the data is as follows:

Fri Nov 5 15:31:33 2010 00:28:17 R7_BCGNOFJ_70.68
Fri Nov 5 20:57:41 2010 00:07:21 R7_ADJCEL_80.6
Wed Nov 10 17:33:21 2010 00:01:13 R7_BCTTEST3_80.1X

I am using the "awk" command to filter the columns so that I get the following output:

Fri Nov 5 2010 00:28:17 BCGNOFJ
Fri Nov 5 2010 00:07:21 ADJCEL
Wed Nov 10 2010 00:01:13 BCTTEST3

The awk command I am using is as follows:

awk '{split($NF,a,"_");print $1,$2,$3,$5,$6,a[2]}' "$BUILD_TIMES" >> "$FINAL_LIST"

Is there any way to pick out the "X" in the "R7_BCTTEST3_80.1X" column so that I would get the output in this format:

Wed Nov 10 2010 00:01:13 BCTTEST3 X

or in this format

Wed Nov 10 2010 00:01:13 BCTTEST3X

Any help would be greatly appreciated.

Thanks in advance.

sed method:

# sed 's/[0-9]*:[0-9]*:[0-9]* //;s/\(.*\) .*_\(.*\)_.*\(.\)$/\1 \2 \3/' infile 
Fri Nov 5 2010 00:28:17 BCGNOFJ 8
Fri Nov 5 2010 00:07:21 ADJCEL 6
Wed Nov 10 2010 00:01:13 BCTTEST3 X
sed 's/[^ \t]*[ \t]//4;s/[^ \t_]*_//;s/_.*\(.\)$/ \1/;s/[^X]$//' infile
Fri Nov 5 2010 00:28:17 BCGNOFJ
Fri Nov 5 2010 00:07:21 ADJCEL
Wed Nov 10 2010 00:01:13 BCTTEST3 X

awk -F'[ \t_]*' '{print $1,$2,$3,$5,$6,$8($9~/X$/?" X":x)}' infile
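
In case an older awk rejects the ?: conditional in the one-liner above, here is a more conservative sketch that only uses split() and a plain if. The function name filter_builds is made up for illustration, and I am assuming the last field always looks like PREFIX_NAME_VERSION. Because awk's default field splitting treats runs of blanks as a single separator, a day of the month padded with an extra space should not shift the columns.

```shell
# filter_builds is a hypothetical helper; it reads BUILD_TIMES-style lines on stdin.
filter_builds() {
    awk '{
        # Split the last field, e.g. R7_BCTTEST3_80.1X -> a[1]="R7", a[2]="BCTTEST3", a[3]="80.1X"
        n = split($NF, a, "_")
        flag = ""
        # Flag only versions whose last piece ends in X
        if (a[n] ~ /X$/) flag = " X"
        # Skip $4 (time of day); append the flag directly after the name
        print $1, $2, $3, $5, $6, a[2] flag
    }'
}

# Sample lines taken from the question
filter_builds <<'EOF'
Fri Nov 5 15:31:33 2010 00:28:17 R7_BCGNOFJ_70.68
Wed Nov 10 17:33:21 2010 00:01:13 R7_BCTTEST3_80.1X
EOF
```

To get "BCTTEST3X" instead of "BCTTEST3 X", change " X" to "X" in the if line.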

Hi Scrutinizer,

I used your sed command and it works very well, except that I still have one problem. The three lines

Fri Nov 5 15:31:33 2010 00:28:17 R7_BCGNOFJ_70.68
Fri Nov 5 20:57:41 2010 00:07:21 R7_ADJCEL_80.6
Wed Nov 10 17:33:21 2010 00:01:13 R7_BCTTEST3_80.1X

I showed earlier are part of a 1001-line file. The format of each line is exactly the same. However, when I run the sed command, the first 180 or so lines come out like the following:

Sat Oct  06:37:31 2010    00:30:21 CMS
Sat Oct  06:38:48 2010    00:30:24 WRANMOMFJ
Sat Oct  06:40:30 2010    00:33:14 WRANMOMFJ
...

i.e. the time of day is being printed instead of the date. The rest of the lines come out perfectly. Any suggestions on how to fix this? Could I use a "for" loop or a "do" loop to go through each individual line?

Any help would be greatly appreciated.

Thanks in advance.

Try:

sed 's/[^ \t][^ \t]*[ \t]//4;s/[^ \t_]*_//;s/_.*\(.\)$/ \1/;s/[^X]$//' infile

Does the awk work?
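
For what it is worth, the broken output looks like what happens when a single-digit day is padded with a second space. That is an assumption on my part, since the original lines were not posted. With [^ ]*, the pattern can match zero characters, so the extra space counts as a "word" of its own and the 4th match becomes the day instead of the time. The sketch below uses a made-up input line and matches spaces only (no tabs) to keep it portable.

```shell
# Hypothetical input line; the two spaces before the day are an assumption.
line='Sat Oct  2 06:37:31 2010 00:30:21 R7_CMS_70.1'

# Original pattern: [^ ]* may match nothing, so the second space is counted as match 3
# and the day ("2 ") is deleted as match 4.
old=$(printf '%s\n' "$line" | sed 's/[^ ]*[ ]//4')

# Corrected pattern: at least one non-space character is required,
# so match 4 is the time of day again.
new=$(printf '%s\n' "$line" | sed 's/[^ ][^ ]*[ ]//4')

printf 'old: %s\nnew: %s\n' "$old" "$new"
```

No loop is needed; sed already applies the command to every line.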

That is perfect, Scrutinizer. Is there a site where I could find out how each part of that command works?

The awk command did not work; I kept getting a syntax error.

Thank you so much again though for the help.
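
On how each part works: here is the same command written out as a sed script with one substitution per line and a comment above each. This is only my reading of it, not an official explanation. The wrapper name clean_builds is made up, and I match spaces only where the original also allows tabs.

```shell
# clean_builds is a hypothetical wrapper; it reads BUILD_TIMES-style lines on stdin.
clean_builds() {
    sed '
        # 1. Delete the 4th run of non-blanks plus the space after it (the time of day).
        s/[^ ][^ ]*[ ]//4
        # 2. Delete the first run of characters that ends in "_" (the "R7_" prefix).
        s/[^ _]*_//
        # 3. Replace everything from the next "_" to the end of the line
        #    with a space followed by the last character of the line.
        s/_.*\(.\)$/ \1/
        # 4. Remove that last character again unless it is an X.
        s/[^X]$//
    '
}

# Sample lines taken from the question
clean_builds <<'EOF'
Fri Nov 5 15:31:33 2010 00:28:17 R7_BCGNOFJ_70.68
Wed Nov 10 17:33:21 2010 00:01:13 R7_BCTTEST3_80.1X
EOF
```

Note that step 4 leaves a trailing space on lines that do not end in X.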

Though I did not try your command, I know there are some issues with
calling awk on some platforms. Therefore, if you are porting your code
to different platforms, you may want to call this function and then replace awk with $AWK:

 
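which_awk()
{
   # Optionally evaluate SCRIPT_PATH first (presumably it adjusts PATH)
   if [ -n "$SCRIPT_PATH" ]
   then
         eval $SCRIPT_PATH
   fi
   # whence is a ksh/zsh builtin; prefer nawk if it can be found
   whence nawk > /dev/null
   if [ $? -eq 0 ]
   then
      AWK=nawk
   else
      # Otherwise use plain awk, and fall back to gawk as a last resort
      whence awk > /dev/null
      if [ $? -eq 0 ]
      then
         AWK=awk
      else
         AWK=gawk
      fi
   fi
}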
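
Since whence only exists in ksh and zsh, a variant using the POSIX "command -v" may travel better to plain sh or bash. This is a sketch under that assumption, not a drop-in replacement: the name pick_awk is made up, it keeps the same nawk, awk, gawk preference order, and it returns non-zero instead of blindly assuming gawk when nothing is found.

```shell
# pick_awk is a hypothetical name; it sets AWK to the first awk variant found in PATH.
pick_awk() {
    for candidate in nawk awk gawk; do
        # command -v succeeds only if the name resolves to something runnable
        if command -v "$candidate" >/dev/null 2>&1; then
            AWK=$candidate
            return 0
        fi
    done
    # Nothing found: report failure and leave AWK untouched
    return 1
}

if pick_awk; then
    "$AWK" 'BEGIN { print "awk found" }'
else
    echo "no awk found in PATH" >&2
fi
```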

 
sed 's/[0-9]*:[0-9]*:[0-9]* //;/.*X$/s/\(.*\) .*_\(.*\)_.*\(.\)$/\1 \2 \3/;s/ [^ ]*_\([^ ]*\)_.*\..*$/ \1/' infile
Fri Nov 5 2010 00:28:17 BCGNOFJ
Fri Nov 5 2010 00:07:21 ADJCEL
Wed Nov 10 2010 00:01:13 BCTTEST3 X