Hi,
I have a file with a few columns; the first column holds file names. I would like to identify the missing file sequence numbers from this file and write them to another file. My file has data in the format below.
APKRISPSIN320131231201319_0983,1,54,125,
APKRISPSIN320131231233522_1879,1,27,118,
APKRISPSIN320131231233721_1887,1,0,6,
APKRISPSRM320140101025823_2478,1,0,0,
APKRISPSRM320140101025919_2479,4,0,0,
APKRISPSRM320140101025924_2480,6,7370,13981,
ASKRISPSIN120131231235048_8870,17,10661,72542,
ASKRISPSIN120131231235056_8871,11,7336,46301,
ASKRISPSIN120131231235150_8872,8,214,608,
ASKRISPSIN120131231235246_8873,4,104,398,
ASKRISPSIN120131231235251_8874,13,8139,34388,
ASKRISPSRM120140101095451_9545,4,5220,2344,
ASKRISPSRM120140101095457_9546,2,512,0,
ASKRISPSRM120140101095554_9547,2,39911,4343,
ASKRISPSRM120140101095652_9548,1,9341,5407,
DLKRISPSIN120131231235048_8870,32,194,180,
DLKRISPSIN120131231235056_8871,36,57,17,
DLKRISPSIN120131231235150_8872,25,157,151,
DLKRISPSIN120131231235246_8873,43,251,225,
DLKRISPSIN120131231235251_8874,48,381,212,
DLKRISPSIN120131231235347_8875,48,2035,568,
DLKRISPSIN120131231235354_8876,34,310,301,
DLKRISPSRM120140101132256_9879,28,188146,447781,
DLKRISPSRM120140101132353_9880,52,1191351,4888375,
DLKRISPSRM120140101132359_9881,41,494989,3373621,
DLKRISPSRM120140101132455_9882,47,1187497,2552643,
Let me explain the file name format.
The first 2 characters are the system code, i.e. DL, AP or AS. Next comes KRIS as a constant. The next 4 characters are also constant and mark input or output: PSIN or PSRM. Then comes the time stamp, YYYYMMDDHHMMSS, then the literal string "_", and finally the file sequence number, 0000 to 9999.
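To make the layout concrete, here is a rough Python sketch of how such a name could be split into its parts. The function name parse_name and the field names are only illustrative. Note that the sample rows show one extra digit between PSIN/PSRM and the time stamp which the description above does not cover; the pattern below simply captures it as "extra", and that is an assumption on my side.

```python
import re

# Assumed layout: 2-letter system code, constant KRIS, PSIN or PSRM,
# one extra digit (seen in the samples, meaning unknown), a 14-digit
# time stamp, an underscore, and a 4-digit sequence number.
NAME_RE = re.compile(
    r"(?P<system>[A-Z]{2})KRIS(?P<direction>PSIN|PSRM)"
    r"(?P<extra>\d)(?P<timestamp>\d{14})_(?P<sequence>\d{4})"
)

def parse_name(name):
    """Return the parts of a file name as a dict, or None if it does not match."""
    m = NAME_RE.fullmatch(name.strip())
    if m is None:
        return None
    parts = m.groupdict()
    parts["sequence"] = int(parts["sequence"])  # "0983" -> 983
    return parts
```

Calling parse_name on the first column of a row gives the system code, the PSIN/PSRM marker, the time stamp and the sequence number as separate fields.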
Each PSIN and PSRM has its own file sequence, starting at 0000 and ending at 9999. If a sequence number appears more than once, the time stamps will differ.
Now I want to list the missing sequence numbers in the format below. My file may contain more than a million records.
DLKRISPSRM_Missing Sequence
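A minimal Python sketch of the kind of result I am after, under some assumptions of mine: rows are grouped by the first 10 characters of the name (system code + KRIS + PSIN/PSRM), a number counts as "missing" if it lies between the lowest and highest sequence seen for that group but never appears, and wrap-around from 9999 back to 0000 is not handled. The names find_missing and write_report are only illustrative.

```python
from collections import defaultdict

def find_missing(lines):
    """Map each 10-character group prefix to its sorted missing sequence numbers."""
    seen = defaultdict(set)
    for line in lines:
        name = line.split(",", 1)[0].strip()      # first column is the file name
        prefix, sep, seq = name.rpartition("_")   # sequence follows the last "_"
        if not sep or not seq.isdigit():
            continue                              # skip rows that do not fit the format
        seen[prefix[:10]].add(int(seq))           # e.g. "DLKRISPSRM"
    missing = {}
    for group, seqs in seen.items():
        # Assumption: only gaps between the lowest and highest seen value count.
        gaps = [n for n in range(min(seqs), max(seqs) + 1) if n not in seqs]
        if gaps:
            missing[group] = gaps
    return missing

def write_report(missing, out):
    """Write one line per missing number, e.g. DLKRISPSRM_9880."""
    for group in sorted(missing):
        for n in missing[group]:
            out.write(f"{group}_{n:04d}\n")
```

Since the input is read line by line and each group keeps at most 10000 numbers, this should stay small in memory even for more than a million rows. Passing an open input file to find_missing and an open output file to write_report would produce the second file.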
I would appreciate your help with this.