I have tried the following code, but I couldn't achieve what I want with it.
#!/usr/bin/env bash
# Build a sorted list of every .xml file under the current directory.
find ./ -type f \( -iname "*.xml" \) | sort > fileList
# Make sure neither the list file nor this script ends up in the list.
sed -i '/\.\/fileList/d' fileList
NAMEOFTHISFILE=$(printf '%s\n' "$0" | sed -e 's/[]\/()$*.^|[]/\\&/g')
sed -i "/$NAMEOFTHISFILE/d" fileList
cp fileList auxFileList
while IFS= read -r FILENAME
do
    # Drop the current file from the aux list so each pair is compared only once.
    sed -i '1d' auxFileList
    #echo "Comparing $FILENAME with :"
    # Read the aux file and compare the current file with every remaining entry.
    while IFS= read -r COMPFILENAME
    do
        # cmp -s exits with status 0 when the two files are byte-for-byte identical.
        if cmp -s -- "$FILENAME" "$COMPFILENAME"
        then
            # Note: this reads the contents of $FILENAME, not its name.
            awk 'BEGIN { FS="_" } { printf("%03d\n", $2) }' "$FILENAME" | sort | awk '{ printf("data_%d_box\n", $1) }'
            #echo "$FILENAME AND $COMPFILENAME are identical"
            #rm -r "$FILENAME"
        fi
        #echo " $COMPFILENAME"
    done < auxFileList
done < fileList
rm -f fileList auxFileList
printf '\n\n'
This code selects all the files initially. I need to amend it so that only the most recently modified file for each filename pattern is picked. For example:
File 1: AAAA_555_0000
File 2: AAAA_123_123
File 3: AAAA_452_452 [latest]
File 4: BBB_555_0000
File 5: BBB_555_555
File 6: BBB_999_999 [latest]
File 7: CCC_555_0000
File 8: CCC_000_000
File 9: CCC_000_111 [latest]
The script has to pick the latest file for each filename pattern in the folder, then compare those files with each other and delete the duplicates.
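To make the requirement concrete, this is roughly the logic I am after, as a sketch rather than a finished script. It rests on several assumptions of mine: the "pattern" is the part of the file name before the first underscore, "latest" means the newest modification time, GNU find and sort are available (for -printf), file names contain no tabs or newlines, and when two of the latest files are identical the older one is the one to delete. The function names latest_per_prefix and remove_duplicates and the DRY_RUN variable are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only. Assumes GNU find/sort and file names without tabs or newlines.

# Print the most recently modified .xml file for each prefix under directory $1.
# Assumption: the prefix is everything before the first "_" in the file name.
latest_per_prefix() {
    find "$1" -type f -iname '*.xml' -printf '%T@\t%p\n' |
        sort -t $'\t' -k1,1nr |
        awk -F'\t' '{
            name = $2; sub(/.*\//, "", name)      # strip the directory part
            prefix = name; sub(/_.*/, "", prefix) # keep text before first "_"
            if (!(prefix in seen)) {              # first hit is the newest one
                seen[prefix] = 1
                print $2
            }
        }'
}

# Read paths from stdin (newest first) and report every file whose contents
# match an earlier, already kept file. Files are only deleted when DRY_RUN=0.
remove_duplicates() {
    local kept=() path other dup
    while IFS= read -r path; do
        dup=
        for other in "${kept[@]}"; do
            if cmp -s -- "$path" "$other"; then
                dup=$other
                break
            fi
        done
        if [ -n "$dup" ]; then
            echo "duplicate of $dup: $path"
            if [ "${DRY_RUN:-1}" = 0 ]; then
                rm -- "$path"
            fi
        else
            kept+=("$path")
        fi
    done
}
```

With these two functions defined, `latest_per_prefix . | remove_duplicates` would only list the duplicates it finds, and `DRY_RUN=0` would have to be set before anything is actually removed.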
I would appreciate it if you could help me with this logic.
Thanks much!