Have a situation while extracting info

I have a text file which I am generating from a few SQL queries.
The format is like this:

col1   col2      col3     col4           col5
1001     DONE     ABC     17-sep-14   12:02:05
1001     DONE     ABC     17-sep-14   12:02:05
1001     DONE     ABC      17-sep-14  12:02:55
1001     REDONE  ABC      17-sep-14   14:00:12
1001     REDONE  ABC      17-sep-14   14:00:15
1001     DONE     ABC      17-sep-14   19:02:10 
1001     DONE     ABC      17-sep-14   19:10:02
1001     REDONE  ABC      18-sep-14   13:00:00
1001     REDONE  ABC      18-sep-14   13:00:00
1001     REDONE  ABC      18-sep-14   13:00:00
1001     DONE     ABC      18-sep-14   16:03:16

I want to capture when DONE and REDONE happened: a report of all occurrences, with one line for each time. Output as below:

1001   DONE   ABC   17-sep-14    12:02:05
1001 REDONE  ABC   17-sep-14    14:00:12
1001 DONE     ABC    17-sep-14    19:10:02
1001 REDONE  ABC    18-sep-14    13:00:00
1001 DONE     ABC    18-sep-14    16:03:16

IMP note: the date and time fields can have any value, there is no rule. The 2nd column, DONE and REDONE, can come in any combination; we just need to find out what is present in the 2nd column and fetch one occurrence for each time DONE or REDONE appears.

Yes?
What have you tried so far?

something along these lines?

awk 'a!=$2{print;a=$2}' myFile

Not sure exactly what cols you are trying to match on.
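
For reference, a minimal sketch of what the one-liner above does, run on a cut-down copy of the sample data held in a shell variable (the names sample, prev and first_of_each_run are made up for the illustration). awk compares column 2 with the value remembered from the previous line and prints only when it changes, so each consecutive run of the same status collapses to its first line:

```shell
# Abbreviated copy of the sample rows from the question.
sample='1001 DONE   ABC 17-sep-14 12:02:05
1001 DONE   ABC 17-sep-14 12:02:55
1001 REDONE ABC 17-sep-14 14:00:12
1001 REDONE ABC 17-sep-14 14:00:15
1001 DONE   ABC 17-sep-14 19:02:10
1001 REDONE ABC 18-sep-14 13:00:00
1001 DONE   ABC 18-sep-14 16:03:16'

# Print a line only when column 2 differs from the previous line's column 2.
first_of_each_run=$(printf '%s\n' "$sample" | awk 'prev != $2 { print; prev = $2 }')
printf '%s\n' "$first_of_each_run"
```

Because only the previous line is remembered, a status that comes back later (DONE, REDONE, DONE) is reported again, which is what the requested output needs.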

Below is matching on columns 2, 4 and 5 (adjust the columns used as you need):

$ awk '!a[$2,$4,5]++' infile
col1   col2      col3     col4           col5
1001     DONE     ABC     17-sep-14   12:02:05
1001     REDONE  ABC      17-sep-14   14:00:12
1001     REDONE  ABC      18-sep-14   13:00:00
1001     DONE     ABC      18-sep-14   16:03:16
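
A short sketch of how the `!a[...]++` idiom behaves, again on a cut-down copy of the sample data (the variable and array names are made up for the illustration). The array counts how often each key has been seen, and a line is printed only the first time its key occurs, no matter where in the file the repeats are. Note that a subscript written as a bare `5` is a constant, not the field `$5`, so it should not change which lines are kept:

```shell
# Abbreviated copy of the sample rows from the question.
sample='1001 DONE   ABC 17-sep-14 12:02:05
1001 DONE   ABC 17-sep-14 12:02:55
1001 REDONE ABC 17-sep-14 14:00:12
1001 DONE   ABC 17-sep-14 19:02:10
1001 REDONE ABC 18-sep-14 13:00:00
1001 DONE   ABC 18-sep-14 16:03:16'

# Key on status and date: first line for each distinct ($2, $4) pair.
by_status_date=$(printf '%s\n' "$sample" | awk '!seen[$2,$4]++')

# Same key plus the literal number 5: the constant adds nothing to the key.
by_literal_5=$(printf '%s\n' "$sample" | awk '!seen[$2,$4,5]++')

# Key on status, date and time: first line for each distinct ($2, $4, $5).
by_status_date_time=$(printf '%s\n' "$sample" | awk '!seen[$2,$4,$5]++')

printf '%s\n---\n%s\n' "$by_status_date" "$by_status_date_time"
```

Unlike the previous-line comparison, this never reports a key a second time, so a DONE that returns on the same date after a REDONE is dropped.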

Thanks guys, I am able to handle this situation. Actually I used both methods, from chubler_XL and vgersh99.

Let me explain in detail what exactly I have and why I used both; maybe we can still use one method instead of two.

  1. I am extracting some data from 2 tables; it ends up in a text file.
  2. Now I want to sort/extract data from that text file based on my requirement.
  3. The date may vary, and so may the status.
  4. For each NUM, Status can be any value, either DONE or REDONE. For the same NUM, the sequence can be DONE, DONE, REDONE, DONE.
    If I use awk 'a!=$2{print;a=$2}' myFile, it returns only DONE, REDONE, DONE for the same NAME column.
  5. If the ENV column values are different but the STATUS column value is the same (DONE or REDONE), the above logic will not work; it returns one row only. For that case I used awk '!a[$2,$4,5]++' infile

How can I get the desired output in one command? Right now I am using both pieces of logic in an if/else inside a loop.

NUM      Status      NAME                                ENV                        DATE       Time
100050596 DONE        PAB                              scotau1@csomrc1           23-SEP-14 06:49:33
 100050596 DONE        PAB                             scotau1@csomrc1           23-SEP-14 06:49:36
 100050596 DONE        PAB                              scotau1@csomrc1           23-SEP-14 06:49:38
 100050596 DONE        PAB                              scotau2@csomrc1           23-SEP-14 06:51:51
 100050596 DONE        PAB                              scotau2@csomrc1           23-SEP-14 06:51:53
 100050596 DONE        PAB                              scotau2@csomrc1           23-SEP-14 06:51:56
 100050596 DONE        PAB                              scotau3@csomrc1           23-SEP-14 06:53:05
 100050596 DONE        PAB                              scotau3@csomrc1           23-SEP-14 06:53:08
 100050596 DONE        PAB                              scotau3@csomrc1           23-SEP-14 06:53:10
100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 20:58:42
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 20:58:56
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 20:59:09
 100051482 UNDONE      SMO                              occbap1@hppabc1          24-SEP-14 21:04:25
 100051482 UNDONE      SMO                              occbap1@hppabc1          24-SEP-14 21:04:38
 100051482 UNDONE      SMO                              occbap1@hppabc1          24-SEP-14 21:04:51
 100051482 DONE        SMO                             occbap1@hppabc1          24-SEP-14 21:05:52
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:06:06
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:06:19
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:06:32
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:07:25
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:07:38
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:07:51
 100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 21:08:04

It sounds like you only want to match on columns 2 and 4, correct?

Then for your new infile you can use:

$ awk '!a[$2,$4]++' infile
NUM  Status      NAME                             ENV                              DATE       Time
100050596 DONE        PAB                              scotau1@csomrc1           23-SEP-14 06:49:33
 100050596 DONE        PAB                              scotau2@csomrc1           23-SEP-14 06:51:51
 100050596 DONE        PAB                              scotau3@csomrc1           23-SEP-14 06:53:05
100051482 DONE        SMO                              occbap1@hppabc1          24-SEP-14 20:58:42
 100051482 UNDONE      SMO                              occbap1@hppabc1          24-SEP-14 21:04:25

My earlier logic in the IF loop was missing some info which is coming through now with your new command, but a few rows are still missing. Let me show you.
The logic I am using (file.lst has all the data extracted from the DB tables):

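# Loop over each distinct value in column 1 (NUM).
for i in $(awk '{print $1}' file.lst | sort -u)
do
    # If this NUM has any UNDONE status, report each change of status;
    # otherwise keep the first row for each distinct key.
    if grep "$i" file.lst | awk '{print $2}' | grep -q UNDONE
    then
        grep "$i" file.lst | awk 'a!=$2{print;a=$2}'
    else
        grep "$i" file.lst | awk '!a[$2,$4,5]++'
    fi
done >> file.tmp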

Using this code, I was not getting the kind of rows below, which I am getting now with your latest command.
Earlier:

 100050910 DONE        MRC                              taumrc1@taumrct1          08-SEP-14 09:42:16
 100050910 UNDONE      MRC                              taumrc1@taumrct1          09-SEP-14 18:05:26

Now (which is correct, because the 4th column, ENV, has different env names):

 100050910 DONE        MRC                              taumrc1@taumrct1          08-SEP-14 09:42:16
 100050910 DONE        MRC                              taumrc2@taumrct1          08-SEP-14 09:44:06
 100050910 DONE        MRC                              taumrc3@taumrct1          08-SEP-14 09:46:15
 100050910 UNDONE      MRC                              taumrc1@taumrct1          09-SEP-14 18:05:26

But some info is missed by the new logic.
Earlier (which is correct):

 100051336 DEPLOYED        SMO                              occbap1@hppabc1          19-SEP-14 12:35:31
 100051336 UNDEPLOYED      SMO                              occbap1@hppabc1          19-SEP-14 12:40:01
 100051336 DEPLOYED        SMO                              occbap1@hppabc1          19-SEP-14 12:42:24
 100051336 UNDEPLOYED      SMO                              occbap1@hppabc1          19-SEP-14 12:46:15
 100051336 DEPLOYED        SMO                              occbap1@hppabc1          19-SEP-14 12:50:32
 100051336 UNDEPLOYED      SMO                              occbap1@hppabc1          19-SEP-14 12:55:53
 100051336 DEPLOYED        SMO                              occbap1@hppabc1          19-SEP-14 12:58:32
 100051336 UNDEPLOYED      SMO                              occbap1@hppabc1          19-SEP-14 13:13:15

Now:

 100051336 DEPLOYED        SMO                              occbap1@hppabc1          19-SEP-14 12:35:31
 100051336 UNDEPLOYED      SMO                              occbap1@hppabc1          19-SEP-14 12:40:01

---------- Post updated at 09:56 PM ---------- Previous update was at 01:32 PM ----------

Anyone?
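
One possible way to get both behaviours from a single awk call, offered as a sketch and not checked against the real file: remember the last status seen for each NUM and ENV pair, and print a line only when the status for that pair changes. This assumes the columns are laid out as in the listings above ($1 NUM, $2 Status, $4 ENV) and that the rows are already in time order; the array name last and the variable names are picked for the illustration, and the rows are a subset of the ones posted above.

```shell
# Subset of the rows posted above (two ENVs with repeats, then alternating statuses).
sample=' 100050596 DONE        PAB   scotau1@csomrc1   23-SEP-14 06:49:33
 100050596 DONE        PAB   scotau1@csomrc1   23-SEP-14 06:49:36
 100050596 DONE        PAB   scotau2@csomrc1   23-SEP-14 06:51:51
 100050596 DONE        PAB   scotau2@csomrc1   23-SEP-14 06:51:53
 100051336 DEPLOYED    SMO   occbap1@hppabc1   19-SEP-14 12:35:31
 100051336 UNDEPLOYED  SMO   occbap1@hppabc1   19-SEP-14 12:40:01
 100051336 DEPLOYED    SMO   occbap1@hppabc1   19-SEP-14 12:42:24
 100051336 UNDEPLOYED  SMO   occbap1@hppabc1   19-SEP-14 12:46:15'

# last[NUM, ENV] holds the most recent status for that pair.
# A line is printed the first time a pair appears and whenever its status changes.
status_changes=$(printf '%s\n' "$sample" |
    awk 'last[$1,$4] != $2 { print; last[$1,$4] = $2 }')
printf '%s\n' "$status_changes"
```

On the real data the same awk program could be pointed at file.lst in place of the for loop. The header line would be printed too, since its key has not been seen before.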