I need some help with the logic and syntax for a shell script (ksh) that searches a directory for similar files and keeps only the last two versions of each. The version number is part of the file name. However, the file names vary in length, and each name may have one or many versions, with no limit on the number. I am not sure that using the find command with the date/timestamp is a good idea, because these are ad hoc files that get created.
The result should be:
apps_V01.xml
betarelease_V02.xml
betarelease_V03.xml
test_V03.xml
test_V04.xml
testing_V99.xml
testing_V100.xml
I thought about writing the listing to a text file and then taking substrings of the names with awk, but I don't know how I would handle the varying number of similar files. My thought is to output the listing to a file, read the file until it reaches a new file name while building an array of files, and then save the last two in the array. Then read on for the next set of files. But again, I am not sure how to do that. A problem also occurs when I have only 1 version of a file. I welcome any sed, awk or ksh commands. I don't know enough about Perl or any other language to do this. Some help would be greatly appreciated. I have searched more than 300 postings and have not come up with anything close to what I need to accomplish.
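The grouping idea described above can be sketched without arrays. This is only a rough outline, assuming every name looks like prefix_V<number>.xml and contains no spaces; the function name list_old_versions is made up for illustration. It prints the files that are not among the two highest versions of their prefix, so a prefix with only one or two versions prints nothing.

```shell
# list_old_versions DIR (hypothetical helper name)
# Prints every file in DIR that is NOT one of the two highest versions
# of its prefix. Assumes names of the form prefix_V<number>.xml, no spaces.
list_old_versions() {
    ls "$1" |
    # Turn "test_V03.xml" into "test 03 test_V03.xml"; other names are dropped
    sed -n 's/^\(.*\)_V\([0-9][0-9]*\)\.xml$/\1 \2 &/p' |
    # Group by prefix, highest version first (numeric, so 100 sorts above 99)
    sort -k1,1 -k2,2nr |
    # Skip the first two lines seen for each prefix, print the rest
    awk '{ if (++seen[$1] > 2) print $3 }'
}
```

Calling it as list_old_versions verfiles > old_versions.txt would give the text file of deletion candidates; both of those names are placeholders.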
That is correct. I want to keep only the last two versions of each file and delete all others. My sample output shows the results I would get if the script works correctly. The deletion itself would not be hard. If I could get the names of all the other files into a text file, then I could run a "for loop" that deletes every file listed in that text file from the directory. That I can do.
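For the deletion step, a while-read loop over the text file is a little safer than a "for loop" over its contents, because it takes one name per line as-is. A minimal sketch, assuming the list holds bare file names, one per line; the function name delete_listed and the DRY_RUN switch are invented here for illustration.

```shell
# delete_listed DIR LISTFILE (hypothetical helper name)
# Removes from DIR every file named in LISTFILE, one name per line.
# Run with DRY_RUN=1 to print what would be removed without deleting.
delete_listed() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue      # skip blank lines
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "would remove $1/$name"
        else
            rm -f -- "$1/$name"
        fi
    done < "$2"
}
```

Running it once with DRY_RUN=1 first shows what would go before anything is actually removed.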
Hi, I thought about it, and even though I may be doing it the hard way, I think this gets the job done. To be on the safe side, make working copies first. This script moves the wanted files into their own directory instead of deleting anything.
I tried it and it works on my machine.
#!/bin/bash
#Assuming Your current working directory is ok to work in and the files
#are located in the directory verfiles (create working copies there)
#First sort, if the naming is consistent, that should be a good start
ls -1 verfiles | sort > verfiles.srt
#split the filenames on each "firstname" so they will have their own file
prefix=""
while read filename; do
#Determine the "prefix"
curprefix=$(expr "$filename" : '\([A-Za-z]*_\)')
#Has it been read before?
if [ "x$curprefix" != "x$prefix" ] ; then
#We have a new "firstname"
prefix=$curprefix
fi
echo "$filename" >> "$prefix.lst"
done < verfiles.srt
#cleanup
rm verfiles.srt
#Where to keep the "last two" of each...
mkdir saved-verfiles
#Now for each .lst file, manipulate it a little for easier numerical sorting
for x in *.lst; do
#And this next bit is VERY lazy of me, but as I said,
#I ASSUME that the naming is consistent ;)
#Sort numerically on the digits after "_V" and move the last two
#(or the only one, if a name has just a single version)
sort -t_ -k2.2n "$x" | tail -2 | while read -r keep; do
mv "verfiles/$keep" saved-verfiles
done
done
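The reason for sorting numerically rather than relying on the plain listing order is names like testing_V99.xml and testing_V100.xml, where a character-by-character sort puts V100 before V99. A small sketch of a numeric sort on the digits after "_V", assuming the prefix itself contains no underscore; the function name sort_by_version is only illustrative.

```shell
# sort_by_version (hypothetical helper name)
# Reads file names on stdin and prints them ordered by version number.
# -t_ splits each name at "_"; -k2.2n starts the key at the 2nd character
# of field 2 (just past the "V") and compares it as a number.
sort_by_version() {
    sort -t_ -k2.2n
}

printf '%s\n' testing_V100.xml testing_V99.xml testing_V9.xml | sort_by_version
# prints:
# testing_V9.xml
# testing_V99.xml
# testing_V100.xml
```

Piping that through tail -2 leaves the two highest versions, and still works when only one name comes in.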
Thanks for the responses. All were great ideas. Summer_cherry had the most compact code and easiest to manipulate for an inexperienced shell programmer like me. However, I did like all your responses and will try to learn from each of your approaches. Thank you all very much.