Finding & Moving Oldest File by Parsing/Sorting Date Info in File Names

I'm trying to write a script that will look in an /exports folder for the oldest export file and move it to a /staging folder. "Oldest" in this case is actually determined by date information embedded in the file names themselves.

Also, the script should only move a file from /exports to /staging if staging is empty; if there's already a file in /staging then the script would exit without doing anything.

Here's an example of what the folders and files might look like:

/exports
job1_Friday_05302008.xml
job1_Monday_05262008.xml
job1_Monday_06022008.xml
job1_Thursday_05292008.xml
job1_Tuesday_06032008.xml
job1_Wednesday_05282008.xml

/staging

In this case, because /staging is empty, the script should move /exports/job1_Monday_05262008.xml to /staging. The next time the script runs, it should exit if /staging still contains job1_Monday_05262008.xml; otherwise it should move job1_Wednesday_05282008.xml.

Also, the "job1" prefix is static.

I've got a script that uses ls and cat to loop through the file names and parse out the date information, but I'm getting bogged down with how to handle the results and I'm starting to think I might be going down the wrong path...

Any suggestions would be appreciated.

Let's assume the file list is already in a file. (You could pipe the output of ls directly into the rest of the commands if you were so inclined.)

> cat exp_contents 
job1_Friday_05302008.xml
job1_Monday_05262008.xml
job1_Monday_06022008.xml
job1_Thursday_05292008.xml
job1_Tuesday_06032008.xml
job1_Wednesday_05282008.xml

Now, try out the following command:

> cat exp_contents | tr "_" " " | awk '{print substr($3,5,4)"."substr($3,1,4)"|"$1"_"$2"_"$3}' | sort
2008.0526|job1_Monday_05262008.xml
2008.0528|job1_Wednesday_05282008.xml
2008.0529|job1_Thursday_05292008.xml
2008.0530|job1_Friday_05302008.xml
2008.0602|job1_Monday_06022008.xml
2008.0603|job1_Tuesday_06032008.xml

You can append head -1 to this to get the top line, which is the oldest entry.
You can then add cut -d"|" -f2 to strip the sort key and get the original file name back.

If this works, the rest of the logic should flow much more smoothly.

Looks like homework.
Please read the Simple rules of the UNIX.COM forums before posting, especially rules 5 and 6.

No, it's a real-world problem :)

It's just written out like a homework assignment because I've done enough technical work to know that when you ask for help, clarity is king.

I'm working on an interface that gets daily xml exports from an ftp site and synchronizes them with Siebel CRM. I've got Oracle Fusion Middleware to:

A) transfer files from the FTP site to an /exports folder

C) route files from a /staging folder to a Siebel inbound web service.

But I need a Step B) to meter out the appropriate files from /exports to /staging in the right order (first in, first out). I'm trying to do this with shell scripting but this is a bit outside my expertise, as you can tell... :)

From my earlier post, you should be able to determine the oldest file. Are you there yet?

You can set a variable like this:

old_file=$(cat exp_contents | tr "_" " " | awk '{print substr($3,5,4)"."substr($3,1,4)"|"$1"_"$2"_"$3}' | sort | head -1 | cut -d"|" -f2)
echo $old_file
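
A possible variation, offered only as a sketch: sort can key on character positions inside a field, so the year and the month/day can be compared directly without rewriting each line in awk first. This assumes every name follows the job1_Day_MMDDYYYY.xml pattern from the example; the function name oldest_export is made up for illustration.

```shell
# Hypothetical helper: reads file names on stdin and prints the one with the
# earliest embedded date. Assumes names look like job1_Day_MMDDYYYY.xml.
oldest_export() {
    # -t_ splits each name on underscores, so field 3 is MMDDYYYY.xml.
    # First key: characters 5-8 of field 3 (YYYY).
    # Second key: characters 1-4 of field 3 (MMDD).
    sort -t_ -k3.5,3.8 -k3.1,3.4 | head -1
}

# Example with three of the names from the original post.
printf '%s\n' job1_Friday_05302008.xml job1_Monday_05262008.xml \
    job1_Tuesday_06032008.xml | oldest_export
```

With those three names, the line that sorts first is job1_Monday_05262008.xml. Because the year is the first key, files from different years should also come out in the right order.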

If you are that far, then on to the 'easier' part of the requirements.

Yes, it's working perfectly!

Here's the full script I'm using in case it's helpful for anyone else.

It has an extra line at the top to purge files in an /archive folder that are more than 7 days old.

And the folder names and locations are changed a bit from what they were in my original post.

But it seems to be working nicely.

Thanks again for the help!

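#!/bin/ksh
# Purge archived exports that are more than 7 days old.
# The pattern is quoted so the shell does not expand it before find sees it.
find ../archive -name '*.xml' -type f -mtime +7 -exec rm {} \;
# Only stage a new file when ../processing is empty.
PROCESSING=$(ls ../processing | head -1)
if [[ -z $PROCESSING ]]; then
   # Prefix each path with a sortable YYYY.MMDD key, sort, keep the oldest, then strip the key.
   INCOMING=$(ls ../incoming/*.xml | tr "_" " " | awk '{print substr($3,5,4)"."substr($3,1,4)"|"$1"_"$2"_"$3}' | sort | head -1 | cut -d"|" -f2)
   if [[ -n $INCOMING ]]; then
      mv "$INCOMING" ../processing
   fi
fi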
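
For anyone who wants to try the same flow against scratch directories, here is a rough sketch that wraps it in a function taking the two folders as arguments. The function name stage_oldest and its parameters are invented for this example, and it assumes file names of the form job1_Day_MMDDYYYY.xml with no spaces or extra underscores.

```shell
#!/bin/sh
# Hypothetical function: move the export with the earliest embedded date from
# directory $1 to directory $2, but only when $2 is empty.
stage_oldest() {
    src=$1
    dst=$2

    # If anything is already waiting in the destination, do nothing.
    if [ -n "$(ls -A "$dst")" ]; then
        return 0
    fi

    # List bare file names (cd inside the command substitution so that no
    # path prefix is included), prefix each with a sortable YYYY.MMDD key,
    # sort, keep the first line, then strip the key off again.
    oldest=$(cd "$src" && ls | grep '^job1_.*\.xml$' \
        | tr "_" " " \
        | awk '{print substr($3,5,4)"."substr($3,1,4)"|"$1"_"$2"_"$3}' \
        | sort | head -1 | cut -d"|" -f2)

    # Nothing to move if the source has no matching files.
    if [ -n "$oldest" ]; then
        mv "$src/$oldest" "$dst/"
    fi
}
```

With the layout from the script above, it could be called as stage_oldest ../incoming ../processing. Listing bare names rather than full paths keeps the underscore splitting from being thrown off by an underscore somewhere in the directory path.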

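Alternatively, you can go by the file modification time instead of the date in the name: if /staging is empty, move the oldest .xml file from /exports to /staging. Keep in mind that this picks the file with the oldest timestamp, which is not necessarily the one with the earliest date in its name.

[ "$(ls -A /staging)" ] || mv "$(ls -t /exports/*.xml | tail -1)" /staging/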