Hi
I have the following columns in my input file which is sorted on ColA
colA colB colC colD colE
The data is such that for a single colA it can have multiple values of other columns, for eg the data can be
colA colB colC colD colE
1 1 1 1 1
1 2 1 1 1
1 3 1 1 1
1 4 1 1 1
1 5 1 1 1
2 1 1 1 1
2 2 1 1 1
2 3 1 1 1
2 4 1 1 1
2 5 1 1 1
I need to check the records based on ColA, for value of ColA(1) I need to check the other columns for the next value of colA(2) and so on. If other columns (col B,C,D,E) match exactly for all the colA values then I need to pass only on record to output file.
I need to check whether all the records with value of ColA as 1 matched with the all the records with Value of ColA as value other than 1
Basically, I need to compare the current and previous complete records (consisting of 5 columns). If they match, then I have to pass only one such occurrence of the record.
Consider the second scenario
colA colB colC colD colE
1 1 1 1 1
1 2 1 1 1
1 3 1 1 1
1 4 1 1 1
1 5 1 1 1
2 1 1 1 1
2 2 1 1 1
2 3 1 1 1
In this case even though the columns B, C, D, E match for different values of ColA but not all the rows of ColA with value 1 matched with the colA with Value 2, so in this case I need to output file 2 records each for different values of ColA.
Any idea how do I go about this