Finding the Latest record

Dear All,

I have getting data as follows, the second field signifies table name and last one is time stamp. I have return always latest record based on time stamp. Could you please help me ?

I/P

5460-DataAcqu-MF-Extract|bill_cycle_code|SW25|?|?|?|FEDWM708|SALES|MF-Extract/ftp|?|Refresh|M|1st00:15|acqxmain|sw25_20060901_000521|fedex5|184360|2006-09-0100:10:51
5460-DataAcqu-MF-Extract|bill_cycle_code|SW25|?|?|?|FEDWM708|BILCYCLE|MF-Extract/ftp|?|Refresh|M|1st00:15|acqxmain|SW25_20140401_000520|fedex5|184360|2014-04-0105:32:58
5460-DataAcqu-MF-Extract|cash_only_reason_code|SW21|?|?|?|FEDWM704|SALES|MF-Extract/ftp|?|Refresh|D|1st00:15|acqxmain|sw21_20060901_000559|fedex5|184360|2006-09-0100:11:17
5460-DataAcqu-MF-Extract|cash_only_reason_code|SW21|?|?|?|FEDWM704|CASHONLY|MF-Extract/ftp|?|Refresh|D|1st00:15|acqxmain|sw21_20140401_000549|fedex5|184360|2014-04-0105:33:09
5460-DataAcqu-MF-Extract|ctry_x_lang|CTLG|?|?|?|FEDWM056|FEDW056|MF-Extract/ftp|?|Refresh|M|1stat09:00|acqxmain|CTLG_20140401_090032|fedex5|490350|2014-04-0114:06:13
5460-DataAcqu-MF-Extract|edw_domain|DOMN|?|?|?|FEDWD041|FEDW041|MF-Extract/ftp|?|Refresh|W|Sun9:00|acqxmain|DOMN_fix_20031005|fedex5;fedex7|473901|2006-09-1709:08:36
5460-DataAcqu-MF-Extract|edw_domain|DOMN|?|?|?|FEDWW120|FEDW078|MF-Extract/ftp|?|Refresh|W|Sun9:00|acqxmain|domn_20140415_134516|fedex5;fedex7|473901|2014-04-1519:49:19
5460-DataAcqu-MF-Extract|sic_code|SW23|?|?|?|FEDWM706|FEDW255|MF-Extract/ftp|?|Refresh|M|1st00:15|acqxmain|SW23_20140401_000515|fedex5|184360|2014-04-0105:32:07
5460-DataAcqu-MF-Extract|sic_code|SW23|?|?|?|FEDWM706|SALES|MF-Extract/ftp|?|Refresh|M|1st00:15|acqxmain|SW23_20060901_000521|fedex5|184360|2006-09-0100:10:11

O/P

5460-DataAcqu-MF-Extract|bill_cycle_code|SW25|?|?|?|FEDWM708|BILCYCLE|MF-Extract/ftp|?|Refresh|M|1st00:15|acqxmain|SW25_20140401_000520|fedex5|184360|2014-04-0105:32:58
5460-DataAcqu-MF-Extract|cash_only_reason_code|SW21|?|?|?|FEDWM704|CASHONLY|MF-Extract/ftp|?|Refresh|D|1st00:15|acqxmain|sw21_20140401_000549|fedex5|184360|2014-04-0105:33:09
5460-DataAcqu-MF-Extract|ctry_x_lang|CTLG|?|?|?|FEDWM056|FEDW056|MF-Extract/ftp|?|Refresh|M|1stat09:00|acqxmain|CTLG_20140401_090032|fedex5|490350|2014-04-0114:06:13
5460-DataAcqu-MF-Extract|edw_domain|DOMN|?|?|?|FEDWW120|FEDW078|MF-Extract/ftp|?|Refresh|W|Sun9:00|acqxmain|domn_20140415_134516|fedex5;fedex7|473901|2014-04-1519:49:19
5460-DataAcqu-MF-Extract|sic_code|SW23|?|?|?|FEDWM706|FEDW255|MF-Extract/ftp|?|Refresh|M|1st00:15|acqxmain|SW23_20140401_000515|fedex5|184360|2014-04-0105:32:07

Assuming that all of the records for a given table are on adjacent lines in your input file (as in your sample input), the following seems to do what you want:

awk -F"|" '
function prtable() {
	if(table_name != "") print latest
	latest = $0
	table_name = $2
	time_stamp = $NF
}
$2 != table_name {
	prtable()
	next
}
$NF > time_stamp {
	latest = $0
	time_stamp = $NF
}
END {	prtable()
}' input

If you want to use this on a Solaris/SunOS system, use /usr/xpg4/bin/awk , /usr/xpg6/bin/awk , or nawk instead of /usr/bin/awk .

1 Like