Find Large Files Recursively From Specific Directory

Hi.

I found many scripts on the web for achieving this.

But I like to use this one:

find /EDWH-DMT03 -xdev -size +10000 -exec ls -la {} \;|sort -n -k 5 > LARGE.rst

But the problem is: why does it still list files of 89 bytes in the output? Is there anything wrong with the command?

My UNIX environment is as below

HP-UX system1 B.11.31 U ia64 0189138652 unlimited-user license

Any help would be much appreciated.

Thank you.

Please show us the ls -la output for the file that has size 89 bytes. That find command shouldn't show you anything for files smaller than 512000 bytes. Unless you have a huge directory, it shouldn't make any difference, but please also try:

find /EDWH-DMT03 -xdev -size +10000 -exec ls -lad {} +|sort -n -k 5,5 > LARGE.rst

Thanks Don.

This is some of the output:

-rw-r--r--   1 ftphsbb    ftpuser         89 Apr 24 21:40 CASS_voice_NON_IMS_20160424_61486.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser         89 Apr 25 23:20 CASS_voice_NON_IMS_20160425_61701.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser         89 Apr 26 05:20 CASS_voice_NON_IMS_20160426_61746.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser        127 Nov 11  2014 CASS_voice_NON_IMS_20141022_59770.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser        174 Nov 11  2014 CASS_voice_NON_IMS_20141021_59453.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser        200 Nov 11  2014 CASS_voice_NON_IMS_20141022_59764.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser        221 Dec  5  2013 CASS_voice_NON_IMS_20131205_02067.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21832 Sep 15  2015 CASS_voice_NON_IMS_20150915_16466.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21833 Dec 15  2014 CASS_voice_NON_IMS_20141215_68321.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21834 Jun 17  2014 CASS_voice_NON_IMS_20140617_41284.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21834 Nov 11  2014 CASS_voice_NON_IMS_20141001_57072.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21836 Nov 25  2013 CASS_voice_NON_IMS_20131125_99511.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21860 Nov 27  2013 CASS_voice_NON_IMS_20131127_00075.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      21861 Jul 28  2014 CASS_voice_NON_IMS_20140728_47098.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37255 Jul  6  2015 CASS_voice_NON_IMS_20150706_01203.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37258 May 29  2015 CASS_voice_NON_IMS_20150529_94299.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37260 Nov 11  2014 CASS_voice_NON_IMS_20141021_59484.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37264 Nov  6 15:00 CASS_voice_NON_IMS_20151106_27007.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37272 Sep 15  2015 CASS_voice_NON_IMS_20150915_16414.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37279 Jan 31  2014 CASS_voice_NON_IMS_20140131_16810.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37280 Dec  8 10:00 CASS_voice_NON_IMS_20151208_33705.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37280 Feb 26  2014 CASS_voice_NON_IMS_20140226_23747.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      37281 Apr 12  2014 CASS_voice_NON_IMS_20140412_31888.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      38657 Feb 15 15:20 CASS_voice_NON_IMS_20160215_47855.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      38657 Mar  3 14:40 CASS_voice_NON_IMS_20160303_51340.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      38658 Apr  4  2014 CASS_voice_NON_IMS_20140404_30777.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      38658 Nov 23 11:40 CASS_voice_NON_IMS_20151123_30547.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      38862 Apr 14  2014 CASS_voice_NON_IMS_20140414_32198.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      38862 Jun  1  2015 CASS_voice_NON_IMS_20150601_94805.cdr.ext
-rw-r--r--   1 ftphsbb    ftpuser      39020 Sep 12  2014 CASS_voice_NON_IMS_20140912_54032.cdr.ext

All of them are under /EDWH-DMT03/stgdata/HSBB/USAGE/ directory.

Anyway, your modified script gives neat output indeed.

Thank you very much.

Hi aimy,
You're welcome.

Is the output you showed us from your original script, or from my suggested modification? I'm not clear as to whether the "neat output" my suggestion provided solved your problem.

  • Don

Sorry to have confused you, Sir.

Of course the neat output of your script solved my problem.

The output I showed you is from the original script. But what is wrong with the script I posted, if you can spot it?

And by the way, doesn't +10000 indicate the size in bytes? How can it be equivalent to 512000 bytes?

Thank you.

It isn't a problem; I just wanted to be sure your problem had been fixed. I'm glad my suggestion worked.

The size reported by the lstat() system call for a directory varies depending on filesystem type. Some file systems report the number of files contained in the directory; some report the space needed to hold the i-node numbers and the names of the files contained in the directory; some report the accumulated sizes of the files contained in the directory (which I would guess is what happened in your case); and some report other values. Furthermore, when a file is unlinked from a directory, the size of the directory might or might not shrink. Most likely find matched a directory whose reported size exceeded your limit, and ls -la on a directory lists everything inside it, which is where the 89-byte files came from. With the -d option added to ls, the output shows just the directories larger than the size you specified instead of those directories AND the contents of those directories.
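To see the effect of -d in isolation, here is a minimal sketch. The scratch directory and file names are made up for the demonstration and have nothing to do with your filesystem.

```shell
# Hypothetical scratch location; any empty directory works.
demo="${TMPDIR:-/tmp}/ls_d_demo.$$"
mkdir -p "$demo/dir"
printf 'x' > "$demo/dir/small.txt"

# Without -d, ls descends into the directory and lists its contents.
without_d=$(ls -la "$demo/dir")

# With -d, ls prints a single line describing the directory itself.
with_d=$(ls -lad "$demo/dir")

printf '%s\n' "--- ls -la ---" "$without_d" "--- ls -lad ---" "$with_d"

rm -rf "$demo"
```

If only regular files are wanted, putting the standard -type f primary before -size keeps directories out of the result altogether. That is an addition on my part, not something used in the commands above.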

Sorry. The 512000 was a typo. It should have been 5120000. If you look at the find man page's description of the -size primary, you'll see that the size specified is the number of 512-byte blocks, not the number of bytes. If you want files with sizes larger than 10,000 bytes (instead of larger than 10,000 512-byte blocks), use -size +10000c.
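The difference between the two units can be checked on a couple of throwaway files. This is only a sketch: the scratch directory, the file names and the added -type f are assumptions for the demonstration.

```shell
# Hypothetical scratch location with one 89-byte and one 20,000-byte file.
demo="${TMPDIR:-/tmp}/size_demo.$$"
mkdir -p "$demo"
dd if=/dev/zero of="$demo/tiny.dat" bs=1 count=89 2>/dev/null
dd if=/dev/zero of="$demo/medium.dat" bs=1000 count=20 2>/dev/null

# +10000c: more than 10,000 bytes, so only medium.dat matches.
by_bytes=$(find "$demo" -type f -size +10000c)

# +10000: more than 10,000 512-byte blocks (5,120,000 bytes), so nothing matches.
by_blocks=$(find "$demo" -type f -size +10000)

echo "matched by bytes:  ${by_bytes:-<none>}"
echo "matched by blocks: ${by_blocks:-<none>}"

rm -rf "$demo"
```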

The change from -exec ls -lad {} \; (which causes find to invoke ls once for each file found meeting your size limits) to -exec ls -lad {} + (which causes find to invoke ls with as many arguments as it can without overflowing ARG_MAX limits) just makes your script run a little faster (or, if there are a lot of files meeting your size limits, a lot faster).
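A small sketch makes the difference in invocation counts visible. The scratch directory and the three file names are hypothetical, and a tiny sh -c command stands in for ls so that each invocation reports how many arguments it received.

```shell
# Hypothetical scratch location with three small files.
demo="${TMPDIR:-/tmp}/exec_demo.$$"
mkdir -p "$demo"
for name in a b c; do
    printf 'x' > "$demo/$name.txt"
done

# \; runs the command once per matching file: three invocations here.
per_file=$(find "$demo" -type f -exec sh -c 'echo "invoked with $# argument(s)"' sh {} \;)

# + batches the matching files into as few invocations as possible: one here.
batched=$(find "$demo" -type f -exec sh -c 'echo "invoked with $# argument(s)"' sh {} +)

printf '%s\n' "--- with \\; ---" "$per_file" "--- with + ---" "$batched"

rm -rf "$demo"
```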


Thanks a lot again Mr. Don.

So, if I want a list of all files that exceed 500MB, this should be the command, right?

$ find 104857600 -xdev -size +104857600c -exec ls -lad {} +|sort -n -k 5,5 > LARGE.rst

But why did I get this error?

find: cannot stat 104857600

Thank you.

The 1st operand to find is the name of a file (usually of type directory) to be tested or operated on by the primaries specified by later operands. The diagnostic you got is saying that find was not able to find the file named 104857600 that you specified as a starting point for a list of files to be processed.

One would think that if you were looking for files larger than 500MB, you would either want -size +524288000c (500*1024*1024 bytes) or -size +1024000 (500*1024*1024/512 512-byte blocks). Using -size +104857600c would be looking for files larger than 100MB instead of 500MB.
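Rather than working the numbers out by hand, the shell can compute them. This is a sketch in which the variable names are my own; the last line only prints the resulting command, using the starting directory from earlier in the thread, without running it.

```shell
# Threshold in megabytes (1 MB taken as 1024*1024 bytes).
mb=500

# Equivalent operands for find's -size primary.
bytes=$((mb * 1024 * 1024))   # for use with the c suffix
blocks=$((bytes / 512))       # for use without a suffix (512-byte blocks)

echo "-size +${bytes}c"
echo "-size +${blocks}"

# Print (not run) the full command with the directory as the first operand.
echo "find /EDWH-DMT03 -xdev -size +${bytes}c -exec ls -lad {} + | sort -n -k 5,5 > LARGE.rst"
```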
