I have a file that looks roughly like this:
MMIT
MMIT
VAR_1D_DATA_TYPE
MMIT
VAR_1D_DATA_TYPE
19-03-2012
MMIT
VAR_1D_DATA_TYPE
16-03-2012
MMIT
VAR_1D_DATA_TYPE
15-03-2012 MMIT
VAR_10D_DATA_TYPE
MMIT
VAR_10D_DATA_TYPE
19-03-2012
MMIT
VAR_10D_DATA_TYPE
16-03-2012
MMIT
VAR_10D_DATA_TYPE
15-03-2012
MMIT
VAR_10D_DATA_TYPE
14-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
MMIT
STRESSED_VAR_1D_DATA_TYPE
19-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
16-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
15-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
14-03-2012
ISS
ISS
CB_VAR_DATA_TYPE
ISS
CB_VAR_DATA_TYPE
19-03-2012
ISS
CB_VAR_DATA_TYPE
16-03-2012
First, the script should look for places where the same word appears on two consecutive lines. In this example those words are MMIT and ISS, so one copy of each should be removed. The file then looks like this (the removed lines are marked):
MMIT --removed
MMIT
VAR_1D_DATA_TYPE
MMIT
VAR_1D_DATA_TYPE
19-03-2012
MMIT
VAR_1D_DATA_TYPE
16-03-2012
MMIT
VAR_1D_DATA_TYPE
15-03-2012 MMIT
VAR_10D_DATA_TYPE
MMIT
VAR_10D_DATA_TYPE
19-03-2012
MMIT
VAR_10D_DATA_TYPE
16-03-2012
MMIT
VAR_10D_DATA_TYPE
15-03-2012
MMIT
VAR_10D_DATA_TYPE
14-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
MMIT
STRESSED_VAR_1D_DATA_TYPE
19-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
16-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
15-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
14-03-2012
ISS --removed
ISS
CB_VAR_DATA_TYPE
ISS
CB_VAR_DATA_TYPE
19-03-2012
ISS
CB_VAR_DATA_TYPE
16-03-2012
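A minimal sketch of this first step in Python, assuming "same word twice" means two identical consecutive lines and that lines are compared exactly after stripping the trailing newline. The function name drop_adjacent_duplicates is only illustrative. Note that it would also collapse any other identical consecutive lines (for example a repeated date), which may or may not be wanted.

```python
def drop_adjacent_duplicates(lines):
    """Keep a line only if it differs from the line kept just before it."""
    result = []
    for line in lines:
        if result and result[-1] == line:
            # Same text as the previous kept line: skip the repeat.
            continue
        result.append(line)
    return result


sample = ["MMIT", "MMIT", "VAR_1D_DATA_TYPE", "MMIT", "VAR_1D_DATA_TYPE", "19-03-2012"]
print("\n".join(drop_adjacent_duplicates(sample)))
```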
Next, it should look for a pair of lines (a first and a second word) that is immediately repeated, as in our example:
MMIT
VAR_1D_DATA_TYPE
MMIT
VAR_1D_DATA_TYPE
and
MMIT
STRESSED_VAR_1D_DATA_TYPE
MMIT
STRESSED_VAR_1D_DATA_TYPE
and
ISS
CB_VAR_DATA_TYPE
ISS
CB_VAR_DATA_TYPE
so it should remove one pair and keep the other. After removing these lines, the final output should look like this:
MMIT
VAR_1D_DATA_TYPE
19-03-2012
MMIT
VAR_1D_DATA_TYPE
16-03-2012
MMIT
VAR_1D_DATA_TYPE
15-03-2012 MMIT
VAR_10D_DATA_TYPE
MMIT
VAR_10D_DATA_TYPE
19-03-2012
MMIT
VAR_10D_DATA_TYPE
16-03-2012
MMIT
VAR_10D_DATA_TYPE
15-03-2012
MMIT
VAR_10D_DATA_TYPE
14-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
19-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
16-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
15-03-2012
MMIT
STRESSED_VAR_1D_DATA_TYPE
14-03-2012
ISS
CB_VAR_DATA_TYPE
19-03-2012
ISS
CB_VAR_DATA_TYPE
16-03-2012
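Both steps described above could be sketched in Python roughly as follows. This is only one possible reading of the requirement: it assumes a "repeated word" means two identical consecutive lines, and a "repeated pair" means two lines that are immediately followed by the same two lines. The names collapse_repeats, drop_repeated_pairs and clean are made up for illustration.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Step 1: reduce each run of identical consecutive lines to one line."""
    return [key for key, _group in groupby(lines)]


def drop_repeated_pairs(lines):
    """Step 2: when lines (a, b) are directly followed by (a, b) again,
    drop the first pair and keep the second."""
    result = []
    i = 0
    while i < len(lines):
        pair = lines[i:i + 2]
        following = lines[i + 2:i + 4]
        if len(pair) == 2 and pair == following:
            i += 2  # skip the first copy of the pair
        else:
            result.append(lines[i])
            i += 1
    return result


def clean(lines):
    """Apply step 1, then step 2."""
    return drop_repeated_pairs(collapse_repeats(lines))


sample = [
    "ISS", "ISS", "CB_VAR_DATA_TYPE", "ISS", "CB_VAR_DATA_TYPE",
    "19-03-2012", "ISS", "CB_VAR_DATA_TYPE", "16-03-2012",
]
print("\n".join(clean(sample)))
```

To run it on a real file, the lines could be read with something like `open("input.txt").read().splitlines()`, where input.txt is a placeholder name. A line such as "15-03-2012 MMIT" is treated as a single line that differs from "MMIT", so the pair that follows it is left untouched.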