Find and Tar a Folder

Hi all,

I have created a function that looks for a folder in a particular directory, checks the date it was last modified and, if it is old, compresses it.
This works fine for files using gzip. For folders, however, I had to use tar. This is my function:

 
compressOldFolder()
{
  # $1 is the directory to search
  # $2 is the number of days a file must be old by, before it is compressed
 
  # Compress folders that:
  #    Are in/under the specified ($1) directory
  #    Were modified more than ($2) days ago
 
  find $1 -mtime +$2 -type d \( \! -name '*\.gz' \) -exec tar -cvpf {} $1 \;
}

The equivalent function for compressing old files uses

 
find $1 -mtime +$2 -type f \( \! -name '*\.gz' \) -exec gzip -f {} \;

which works fine...

The problem with the tar one is that I get an error message saying "Cannot write to a directory".

If I replace "{}" with, for example, "Archive1" then it works fine. However, what I want it to do is find the folder and compress it, not create a new separate archive.

for example if I have the following directories
Folder1
--Nov2010
--Dec2010

I want the function to compress those two folders to
Folder1
--Nov2010.tar.gz
--Dec2010.tar.gz

(Nov2010 and Dec2010 are located inside Folder1 hence the "--")

any ideas?

thanks for your time!

It's simply a matter of parameters: gzip takes a filename and compresses it, replacing the original file with the compressed file and adding the .gz extension.

The tar command, on the other hand, needs to be told what to save the archive as. The parameter supplied after -f is the archive name.
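
To make the difference concrete, here is a throwaway sketch. The scratch directory and the names notes.txt and Nov2010 are made up for the demo; nothing here comes from the original function.

```shell
#!/bin/sh
# Demo only: the scratch directory and all names below are made up.
demo=${TMPDIR:-/tmp}/tar_vs_gzip.$$
mkdir -p "$demo/Nov2010" && cd "$demo" || exit 1

echo hello > notes.txt
echo data > Nov2010/report.txt

# gzip needs only the file name: notes.txt is replaced by notes.txt.gz.
gzip -f notes.txt

# tar needs the archive name after -f, then what to put into it.
# The directory Nov2010 is left in place next to Nov2010.tar.
tar -cpf Nov2010.tar Nov2010

ls
```

So with gzip the output name is implied, while with tar you have to spell it out yourself.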


Your problem is that you are trying to create a file (the archive) with the same name as the directory that you are compressing. Remember, though, that directories in UNIX are just files, and you cannot create two files with the same name. So basically, you have a name conflict there...

Workaround would be to name archives with some extension like:

find . -type d -name .... -exec tar -cvpf {}.zipped {} \;

and then delete the dirs and rename the archives. The original dirs can be removed in the same tar command with --remove-files switch, if your version of tar supports it, like:

find . -type d -name .... -exec tar --remove-files -cvpf {}.zipped {} \;

Then you just need to rename the .zipped archives.

=== EDIT: ===

After rereading your post, I realize all you need is to set the extension. If you invoke your function as
compressOldFolder Folder1

in your example; the find command should be:

find $1 -mtime +$2 ...<other find options>... -exec tar --remove-files -cvpzf {}.tar.gz {} \;

Because doing

-exec tar -cvpf {} $1

will try to archive Folder1 and name the archive Nov2010 (but a file named Nov2010 already exists -- it's the original dir)

You also want the -z option of tar to create a gzipped archive


That's because you're telling it to write to a tar file named 'Nov2010' (tar -f <filename>) - tar doesn't automatically add a suffix.

tar -cvpzf {}.tar.gz

might work


Hi all thank you for your replies

I understand what the problem is
However, both ways create an archive literally named "{}.tar.gz" or "{}.zipped".
Also, -z is not recognized as a parameter (I use AIX).
I want it to keep the same name so I can have for example

Nov2010 and Nov2010.zipped

the problem is how can I get the function to use the foldername for creating the archive?
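
One possible way around the literal "{}" name, as a sketch only: some find implementations substitute {} only when it stands alone as an argument, so you can pass it alone to a small inline sh script and build the archive name there. The function name archive_old_dirs is made up for illustration, and this has not been tried on AIX. It also deals only with the naming: the starting directory and nested subdirectories are still candidates.

```shell
#!/bin/sh
# Sketch only: archive_old_dirs is a made-up name.
#   $1 is the directory to search
#   $2 is the minimum age in days
archive_old_dirs()
{
  # {} is handed to sh as a separate argument, so find substitutes it.
  # Inside the inline script the directory name is available as "$1"
  # (the word "sh" before {} only fills in $0).
  find "$1" -type d -mtime +"$2" -exec sh -c '
    tar -cpf "$1.tar" "$1" && gzip -f "$1.tar"
  ' sh {} \;
}
```

Called as archive_old_dirs Folder1 30, this should leave Nov2010.tar.gz next to Nov2010 without needing tar's -z option.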

iirc AIX tar has a zip option, but I forget what it was...

A simple way could be:

#!/bin/ksh

for i in $( find $1 -type d -mtime +$2 -print )
do
        tar cpf "$i.tar"  "$i"
        gzip "$i.tar"
done

Thanks for your reply Carlom

it seems to work fine, though it now causes a new problem

-print (or even just find) returns the folders as well as the parent path
for example if I have Oct2010 and Nov2010 in Folder1 the find will return

 
home/stuff/Folder1/
home/stuff/Folder1/Oct2010
home/stuff/Folder1/Nov2010

as a result I will then have 3 compressed files

 
.tar.gz
Oct2010.tar.gz
Nov2010.tar.gz

now I can easily remove the .tar.gz
but if I try to remove the original folders Oct2010 and Nov2010 and keep only the gz files by modifying my code as below

 
compressOldFolder()
{
  # $1 is the directory to search
  # $2 is the number of days a file must be old by, before it is compressed
 
  # Compress folders that:
  #    Are in/under the specified ($1) directory
  #    Were modified more than ($2) days ago
 
  for i in $( find $1 -type d \( \! -name '*\.gz' \) -print )
  do
          tar cpf "$i.tar"  "$i"
          gzip "$i.tar"
          rm -r $i
 
  done  
 
  mv $1.tar.gz temp
  rm $1temp
 
}

that will also remove the parent folder.

I know it is relatively easy to do an ls, grep the folder names and then remove them, but I want to keep everything inside the function and keep it simple. Is there a way to prevent find from returning the starting path? (I could not find anything in the man pages.)

Try

find $1 -type d \( \! -name '*\.gz' \) \( \! -samefile $1 \) -print

sadly -samefile is not recognized on AIX

*&^$%& IBM! :)

How about:

find "$1" -type d \( \! -name '*\.gz' \) -print | grep -v "^$1\$"

:D that did the trick... thank you very much CarloM
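
For anyone finding this thread later, here is one way the pieces above might be combined. Treat it as a sketch under assumptions rather than the final function from this thread. It swaps the grep -v filter for the "$1/." plus "! -name . -prune" idiom, so only the directories directly inside $1 are considered and the starting directory is never returned. It reads names with while/read so that spaces survive, and it removes the original only if tar and gzip both succeed. It has not been tested on AIX.

```shell
#!/bin/sh
# Sketch only; same arguments as the original function:
#   $1 is the directory to search
#   $2 is the number of days a folder must be old by, before it is compressed
compressOldFolder()
{
  # "$1/." with "! -name . -prune" limits find to the entries directly
  # inside $1: the starting point is skipped, and find does not descend
  # into anything below it.
  find "$1/." ! -name . -prune -type d -mtime +"$2" -print |
  while IFS= read -r dir
  do
    # Remove the original folder only if both tar and gzip succeeded.
    tar -cpf "$dir.tar" "$dir" &&
    gzip -f "$dir.tar" &&
    rm -r "$dir"
  done
}
```

With Folder1 containing Oct2010 and Nov2010, calling compressOldFolder Folder1 30 should leave Oct2010.tar.gz and Nov2010.tar.gz inside Folder1, provided both folders were last modified more than 30 days ago.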