Limiting file size of a directory

Hi There,
Can anyone help me with commands for the problem below?
In a directory, no file should be larger than 30MB. Any file larger than 30MB should be automatically removed from the folder.

Hello,

In order to provide a potentially useful answer, we'd first have to know more about the environment in question. Please let us know, at a minimum, the following information:

  • The operating system (e.g. Linux, Solaris, etc)
  • The version and/or distribution (e.g. Solaris 11, Ubuntu 20.04 LTS, etc)
  • The hardware architecture (e.g. x86_64, SPARC, etc)
  • The shell you're using (bash, ksh, etc)
  • The nature of the storage on which the directory in question is located (e.g. local storage on an ext4 filesystem, remote storage on an NFS filesystem provided by a SAN or NAS appliance, etc)

Once we have the above information, we will have a better idea of the types of solution that may be available to you in this particular situation.


Hi There,
Here is the requested information:

  • O.S: Linux
  • Version: RHEL 7.9
  • Hardware architecture: x86_64
  • Shell: bash
  • Nature of storage: local

Hello,

OK, thanks. Now, some follow-up questions. Are all the users who will be writing to this directory members of the same UNIX group, or indeed will it always be the same UNIX user? And should this 30MB quota be the total limit of all data that can be stored in the directory, or is it rather that you want no single file to possibly be larger than 30MB? And lastly, how are the files being stored (e.g. directly created by an application, transferred via FTP or SCP, etc)?

Hi,
My comments are as follows:

  • It will be the same UNIX user.
  • The 30MB quota is the total limit on all data that can be stored in the directory.
  • The files are created by an application in that folder.

Which files should be deleted? Oldest first? Largest first? Other criteria?

The largest files should be removed first from this directory.

It is probably not practical to detect the moment when the total size of the files in a directory exceeds a given size. Even if you observed a file growing excessively, the process writing it should not just be terminated, because it might be doing something significant apart from writing the problem file.

I would suggest an independent script that executes periodically (under cron), the period depending on how often new files are created (or old ones appended to), and how long you can tolerate the directory being temporarily over-sized. The desired size and directory name should be arguments in the cron command entry, not embedded in the script.
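For example, a possible crontab entry (installed with crontab -e for the user owning the directory), assuming the cleanup logic lives in a hypothetical script /usr/local/bin/prune_dir.sh that takes the directory and the limit in KB as arguments (a sketch of such a script appears further down this post):

# check every 10 minutes; directory and size limit (in KB) are passed as arguments
*/10 * * * * /usr/local/bin/prune_dir.sh /path/to/dir 30720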

Assuming there are no sub-directories, running the command du -s -k in the target directory will return its total size in kilobytes.
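For example (the path and the figure in the output are purely illustrative):

$ cd /path/to/dir
$ du -s -k .
20480	.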

If that size exceeds your limit of 30720 KB, then this command will list the sizes (in bytes, largest first) and the names of all the files, colon-separated:

stat --format='%s:%n' * | sort -n -r

You can iterate through that list, removing the named file, subtracting each file size from the total, and terminating as soon as the residue falls below the limit (30720 KB = 31457280 bytes). Remember to scale the size from du by 1024 so that it is in the same units (bytes) as those given by stat.

Edit: not quite that smart. The du size counts whole blocks, not the actual file sizes, but it is still worth using as a fast initial check. For the countdown to the exact size, we need to read the stat results into an array, sum them exactly in one pass, and then remove files as needed in a second pass.

Sparse files would also be an issue (if present). The du check will count actual used blocks in total, but stat will tell you the apparent size of each file.
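To put that together, here is a minimal sketch of such a script. The name prune_dir.sh is hypothetical, and it assumes plain files only (no sub-directories), file names without embedded newlines, and the directory plus the limit in KB passed as arguments, as suggested above:

#!/bin/bash
# prune_dir.sh (hypothetical name): keep the total content of a directory
# under a size limit by deleting the largest files first.
# Usage: prune_dir.sh <directory> <limit-in-KB>   e.g. prune_dir.sh /data/app 30720

dir=$1
limit_kb=$2

cd "$dir" || exit 1

# Fast initial screen: du reports allocated blocks in KB (subject to the
# block-rounding and sparse-file caveats above), so treat it as approximate.
used_kb=$(du -s -k . | awk '{print $1}')
[ "$used_kb" -le "$limit_kb" ] && exit 0

limit_bytes=$(( limit_kb * 1024 ))

# Read "size:name" pairs, largest first, into an array.
mapfile -t entries < <(stat --format='%s:%n' * | sort -n -r)

# Pass 1: sum the exact apparent sizes reported by stat.
total=0
for entry in "${entries[@]}"; do
    total=$(( total + ${entry%%:*} ))
done

# Pass 2: remove the largest files until the exact total is within the limit.
for entry in "${entries[@]}"; do
    [ "$total" -le "$limit_bytes" ] && break
    size=${entry%%:*}
    name=${entry#*:}
    rm -f -- "$name"
    total=$(( total - size ))
done

Since it deletes files, run it by hand against a scratch directory first before handing it over to cron.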


@AnujKumar1

A bit late to this thread, but based on the poster's comments earlier in the conversation ...

Hi,
My comments are as follows:

  • It will be the same UNIX user.
  • The 30MB quota is the total limit on all data that can be stored in the directory.
  • The files are created by an application in that folder.

One [potential] alternative would be to create a virtual volume that the user can access. This gives the benefit of setting the maximum size of the volume, and hence a hard cap on the total data (and on any single file) that can be stored in it ....

for example (partial, but you get the idea) ....

$
$ dd if=/dev/zero of=user-X-disk count=100000
100000+0 records in
100000+0 records out
51200000 bytes (51 MB, 49 MiB) copied, 0.289933 s, 177 MB/s
$ ll
total 50008
drwxrwxr-x  2 er er     4096 Mar 10 23:38 ./
drwxrwxr-x 42 er er     4096 Mar 10 23:37 ../
-rw-rw-r--  1 er er 51200000 Mar 10 23:58 user-X-disk
$
$ /sbin/mkfs -t ext3 user-X-disk
mke2fs 1.44.1 (24-Mar-2018)
Discarding device blocks: done                            
Creating filesystem with 50000 1k blocks and 12544 inodes
Filesystem UUID: 982b1b07-54af-4bfb-bed6-30078840c59d
Superblock backups stored on blocks: 
	8193, 24577, 40961

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done
$
$ file user-X-disk 
user-X-disk: Linux rev 1.0 ext3 filesystem data, UUID=982b1b07-54af-4bfb-bed6-30078840c59d (large files)
$
$ # etc ...

the volume can then be mounted, have its permissions set, the user's profile modified, etc ...
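For example (a hedged sketch; the mount point /mnt/user-X and the user/group name are placeholders):

$ sudo mkdir /mnt/user-X
$ sudo mount -o loop user-X-disk /mnt/user-X
$ sudo chown user-X:user-X /mnt/user-X
$ # add an /etc/fstab entry with the loop option if the mount should survive a reboot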

That's probably faster than creating another volume with LVM.

ZFS is good for this task: define a sub-volume (dataset) with a fixed quota, and the size can be changed with one command at any time.
BTRFS might have a similar capability.
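For example, a minimal ZFS sketch (the pool name tank and the dataset name appdata are placeholders):

$ sudo zfs create -o quota=30M tank/appdata
$ sudo zfs set quota=50M tank/appdata     # enlarge later with a single command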