Calculating totals in AWK

Hille · August 28, 2008, 5:12am

Hello,

With the following small script I list the sizes of the files belonging to each user by selecting the bytes field ($7) of each file. Right now it writes one line for every file it finds, so in the end the output for some users contains up to 200,000 numbers. How can I calculate the total disk space used by each user with awk? That way the output would contain only one number per user instead of thousands, and I wouldn't need other tools to calculate the totals, which would save me a lot of time.

Here's the script:
find /tmp -type f -ls | awk '{
print $7 >> "UsedDiskSpace_" $5 ".txt"
}'

Thanks in advance!

vidyadhar85 · August 28, 2008, 5:22am

To calculate the total space using awk:
awk '{space+=$7} END{print space}'

Hille · August 28, 2008, 5:48am

Thanks for your quick response, but how would you implement it in the script itself? I've tried it but it didn't go well. I'm rather new to the 'awk' command, you see :).

Thanks

Franklin52 · August 28, 2008, 6:47am

The system I'm working on at the moment doesn't support the -ls option with find. Can you post some lines of the output of your find command?

Regards

Hille · August 28, 2008, 7:20am

Well, for example, when you perform a normal ls -l the output is:

-rwxr-xr-x 1 root exploit 325 Aug 9 2004 File1
-rwxr-xr-x 1 exp sys 384 Apr 12 2000 File2
-rwxrwxrwx 1 exp exploit 100 Mar 19 2007 File3
-rw-r--r-- 1 exp sys 597 Jun 24 1999 File4
-rwxr-xr-x 1 oracle system 242 Feb 5 2001 File5
-rwxr-xr-x 1 oracle system 184 Jul 5 2002 File 6
...

So from all those files I only need to know how big they are, grouped per user. In the script I write the output of the awk command to a separate file for each user. If I open the UsedDiskSpace_exp.txt file, for example (from user 'exp'), this is the output:

384
100
597

But I have to perform the find command on the whole server, so you can imagine how many lines I would eventually get: the output could run up to thousands of numbers for just a couple of users. So what I need is just the sum of those lines for each user. For user 'exp' this would be '1081' and for user 'oracle' '426'.

Thanks

Franklin52 · August 28, 2008, 7:41am

This absorbs a lot of memory but you can give it a try:

find /tmp -type f -ls | awk '{a[$5]+=$7} END{for(i in a) print i, a[i]}' > UsedDiskSpace

Regards
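
For readers following along, here is a minimal sketch of the per-user sum run on sample data. It assumes the `find -ls` layout, in which the owner is field 5 and the size in bytes is field 7 (the field numbers used in the original script). The function name `per_user_totals` and the inode and block numbers in the sample are made up for illustration; the owners and sizes are the ones from the listing earlier in the thread.

```shell
#!/bin/sh
# Sum field 7 (size in bytes) per field 5 (owner) from `find -ls`-style input.
# The array holds one entry per owner, not one per file.
per_user_totals() {
    awk '{ total[$5] += $7 }
         END { for (user in total) printf "%s %.0f\n", user, total[user] }' | sort
}

# Sample input: inode, blocks, mode, links, owner, group, size, date, name.
# Inode and block values are hypothetical.
sample='1001 4 -rwxr-xr-x 1 root exploit 325 Aug 9 2004 /tmp/File1
1002 4 -rwxr-xr-x 1 exp sys 384 Apr 12 2000 /tmp/File2
1003 4 -rwxrwxrwx 1 exp exploit 100 Mar 19 2007 /tmp/File3
1004 4 -rw-r--r-- 1 exp sys 597 Jun 24 1999 /tmp/File4
1005 4 -rwxr-xr-x 1 oracle system 242 Feb 5 2001 /tmp/File5
1006 4 -rwxr-xr-x 1 oracle system 184 Jul 5 2002 /tmp/File6'

printf '%s\n' "$sample" | per_user_totals
# prints:
# exp 1081
# oracle 426
# root 325
```

On a real system the input would come from `find /tmp -type f -ls` instead of the sample. To get one file per user, as in the original script, the `printf` inside the END block could be redirected within awk, for example `print total[user] > ("UsedDiskSpace_" user ".txt")`.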

Hille · August 28, 2008, 8:10am

Great! Thank you!

DJFX · October 3, 2008, 1:54am

I am new to Unix, currently using Ubuntu 8.04 (Hardy). I am doing some performance tests on my file system.

I am trying to use find | awk to traverse the file system getting the size of each file and directory and calculate how much space it would take for a given block size.

For example, using arbitrary values: a 19k file would take 19k with a blocking factor of 1k, while the same file would take 20k with a 2k blocking factor, and so on.

I would like to find the sum of the sizes of all the files and the total space allocated for them, so I can calculate the wasted space.

I tried the script posted above by a fellow forum member, but I am not certain it does what I described, since I am somewhat new to Unix.

Can anyone help me in implementing this script? It would be greatly appreciated.

Thanks in advance
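
One possible way to approach the block-size question is sketched below. It assumes that each file occupies a whole number of blocks, so its allocated size is its byte size rounded up to the next multiple of the block size. The function name `space_for_block_size` and the sample sizes are hypothetical. A real file system behaves differently in places (directories, sparse files, metadata), so this is only an estimate.

```shell
#!/bin/sh
# Read file sizes in bytes (one per line) and print three numbers:
# total actual bytes, total allocated bytes, and wasted bytes,
# for the block size (in bytes) given as the first argument.
space_for_block_size() {
    awk -v bs="$1" '
        {
            size = $1
            actual += size
            # Round the size up to the next multiple of the block size.
            allocated += int((size + bs - 1) / bs) * bs
        }
        END { printf "%.0f %.0f %.0f\n", actual, allocated, allocated - actual }'
}

# A 19k file (19456 bytes), a 100-byte file and a 4k file, with 2k blocks.
printf '%s\n' 19456 100 4096 | space_for_block_size 2048
# prints: 23652 26624 2972

# With GNU find (as shipped on Ubuntu), real sizes could be fed in like this:
#   find /tmp -type f -printf '%s\n' | space_for_block_size 4096
```

With 2k blocks the 19k file is counted as 20480 bytes (20k), which matches the example in the question. Running the function several times with different block sizes gives the wasted space for each.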