Need help in arranging data

I have a file with user activity and need to display only the start and end timestamps of each user's activity. I don't know how to write the logic for this. Please suggest a good way to approach it.

 
User                         Activity_log
 -----------------------------------
 Auto_Generated      2014-08-17 15:25:03.333
 Auto_Generated      2014-08-17 15:25:03.333
 Auto_Generated      2014-08-17 15:25:03.348
 Auto_Generated      2014-08-17 15:25:03.348
 Auto_Generated      2014-08-17 15:25:03.365
 Auto_Generated      2014-08-17 15:25:03.365
 Auto_Generated      2014-08-17 15:25:03.379
 Auto_Generated      2014-08-17 15:25:03.379
 Jack   2014-08-17 15:25:50.273
 Jack   2014-08-17 15:25:50.313
 Jack   2014-08-17 15:25:54.433
 Jack   2014-08-17 15:25:54.91
 Jack   2014-08-17 15:25:54.922
 Jack   2014-08-17 15:25:54.938
 Jack   2014-08-17 15:25:54.952
 Jack   2014-08-17 15:25:54.982
 Jack   2014-08-17 15:25:55.022
 Jack   2014-08-17 15:26:02.26
 Jack   2014-08-17 15:26:02.28
 Kate   2014-08-19 11:12:31.206
 Kate   2014-08-19 11:12:31.246
 Kate   2014-08-19 11:12:31.337
 Kate   2014-08-19 11:12:31.386
 Kate   2014-08-19 11:12:31.446
 Kate   2014-08-19 11:43:43.795
 Kate   2014-08-19 11:43:59.876
 Kate   2014-08-19 11:43:59.888
 Kate   2014-08-19 11:43:59.916
 Kate   2014-08-19 11:43:59.941
 Tom    2014-08-19 12:56:31.306
 Tom    2014-08-19 12:56:37.892
 Tom    2014-08-19 12:56:38.779
 Tom    2014-08-19 12:56:38.798
 Tom    2014-08-19 12:56:38.82
 System      2014-08-24 07:00:35.574
 System      2014-08-24 07:00:35.577
 System      2014-08-24 07:00:35.578
 System      2014-08-24 07:00:35.587
 System      2014-08-24 07:00:35.595
 System      2014-08-24 07:00:35.618
 Kate   2014-12-23 08:22:17.949
 Kate   2014-12-23 08:22:17.958
 Kate   2014-12-23 08:22:17.989
 Kate   2014-12-23 08:22:18.013
Rita 2014-08-24 11:06:38.593
Rita 2014-08-24 11:06:44.816
Rita 2014-08-24 11:06:44.848
Rita 2014-08-24 11:06:44.862
Rita 2014-08-24 11:06:44.888
 
Required Output 
Jack   2014-08-17 15:25:50.273   2014-08-17 15:26:02.28
Kate   2014-08-19 11:12:31.206   2014-08-19 11:43:59.941 
Tom    2014-08-19 12:56:31.306   2014-08-19 12:56:38.82
Kate    2014-12-23 08:22:17.949   2014-12-23 08:22:18.013
Rita   2014-08-24 11:06:38.593   2014-08-24 11:06:44.888 

Try

awk 'NF==3 && $1!=p{if(p)print pl; printf $0 OFS}NF==3{p=$1; pl=$2 FS $3}END{print pl}' infile

To skip lines containing the string Auto_Generated, try this:

awk '/Auto_Generated/{next}NF==3 && $1!=p{if(p)print pl; printf $0 OFS}NF==3{p=$1; pl=$2 FS $3}END{print pl}' infile

I tried it with a simple file, but it didn't return any data:

 
> cat sam.out
Avinash 2015-01-02-15.25.28.993000
Avinash 2015-01-02-15.25.29.018000
Avinash 2015-01-02-15.25.29.376000
Avinash 2015-01-02-15.25.29.388000
 
> awk 'NF==3 && $1!=p{if(p)print pl; printf $0 OFS}NF==3{p=$1; pl=$2 FS $3}END{print pl}' sam.out

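No surprise: the data formats in post #1 and post #3 are not the same. In post #3 the date and time form a single field, so each line has only two fields and the NF==3 condition never matches.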

Try this

 awk '$1!=p{if(p)print pl; printf $0 OFS}{p=$1; pl=$2}END{print pl}' infile
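For reference, a sketch of the same one-liner run against the four-line sam.out sample from the previous post. It is spread over several lines for readability, and printf is given an explicit format string; the here-document and the variable name result are only there for illustration.

```shell
# Feed the sam.out sample (timestamp is a single field, $2) through awk.
result=$(awk '
  $1 != p { if (p) print pl            # user changed: finish the previous line
            printf "%s%s", $0, OFS }   # start a new line: user and first timestamp
          { p = $1; pl = $2 }          # remember current user and latest timestamp
  END     { print pl }                 # finish the line of the final user
' <<'EOF'
Avinash 2015-01-02-15.25.28.993000
Avinash 2015-01-02-15.25.29.018000
Avinash 2015-01-02-15.25.29.376000
Avinash 2015-01-02-15.25.29.388000
EOF
)
printf '%s\n' "$result"
# prints: Avinash 2015-01-02-15.25.28.993000 2015-01-02-15.25.29.388000
```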

A slight variation on Akshay's approach:

awk 'NR<3{next} $1!=p{if(p)print s, r; s=$0; p=$1} {r=$2 OFS $3} END{print s, r}'  file

--
Note: when using printf, it is best not to leave the format field open to the data, because any % in the input would then be interpreted as a format specifier. So use printf "%s ", $0 or printf "%s", $0 OFS rather than printf $0 OFS
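A minimal sketch of the safe form, using a made-up input line that contains a percent sign (the line and the variable names are hypothetical, not from the poster's data):

```shell
# Hypothetical input line containing a percent sign.
line='disk 100% full'

# Safe: the data is passed as an argument, so the % is printed literally.
safe=$(printf '%s\n' "$line" | awk '{ printf "%s%s", $0, OFS }')
printf '[%s]\n' "$safe"
# prints: [disk 100% full ]

# Unsafe (deliberately not run here): awk '{ printf $0 OFS }' would read
# "% f" as a format specifier with no matching argument, and what happens
# next depends on the awk implementation.
```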

Lazydev, is the data ordered by user and by time? Or was that just the sample you provided?

A variation on Scrutinizer's approach, for data that is ordered by time but not by user (i.e. a logfile). Here is an awk script (with whitespace!) for your consideration:

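BEGIN {
  skip["Auto_Generated"];   # users to leave out of the report
  skip["System"];
}

$1 in skip { next; }
           { et[$1] = $2; }              # latest timestamp seen for this user
$1 in user { next; }                     # user already has a start time
           { user[$1]; st[$1] = $2; }    # first sighting: record the start time

END        { for (u in user) print u, st[u], et[u]; }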

If the data has the date and time in two fields, change $2 to $2 FS $3 or to $2 OFS $3.
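Putting that together, here is a sketch of the per-user approach for two-field timestamps. The interleaved sample below is made up for illustration, and the order array and the NF != 3 guard are additions of this sketch: the first keeps the output in first-seen order (for (u in user) visits users in an unspecified order), the second drops the header and blank lines.

```shell
result=$(awk '
  BEGIN { skip["Auto_Generated"]; skip["System"] }     # users to ignore
  NF != 3 || ($1 in skip) { next }                     # drop headers, blanks, skipped users
  { et[$1] = $2 FS $3 }                                # latest timestamp seen so far
  !($1 in st) { st[$1] = $2 FS $3; order[++n] = $1 }   # first timestamp, first-seen order
  END { for (i = 1; i <= n; i++) print order[i], st[order[i]], et[order[i]] }
' <<'EOF'
User Activity_log
-----------------------------------
Jack 2014-08-17 15:25:50.273
Auto_Generated 2014-08-17 15:25:51.000
Kate 2014-08-17 15:25:52.100
Jack 2014-08-17 15:26:02.28
Kate 2014-08-17 15:27:10.500
EOF
)
printf '%s\n' "$result"
# prints:
# Jack 2014-08-17 15:25:50.273 2014-08-17 15:26:02.28
# Kate 2014-08-17 15:25:52.100 2014-08-17 15:27:10.500
```

Note that this keeps one start and one end per user over the whole file, so a user with two separate sessions (such as Kate in post #1) would be reported on a single line rather than two.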