Hi,
Please tell me how to include parallel processing in the code below. Thanks in advance.
I have a list of user directories in the root directory. Each user has a directory named after his/her username.
I am finding the size of each directory using the du -g command and checking if the size exceeds a 3 GB limit.
The problem is that it takes around 30 minutes for around 1000 users.
for i in `ls -l | grep -i <username>`
do
du -g $i | awk '{if ($1 > 3) print $0}' >> size.txt
done
The script as posted does not work for several reasons. For example, where does "<username>" come from? What is "ls -l" for? What does the "list of users directories in root directory" look like, what created the file, and where is that file?
Please post the script you actually used.
If these are user home directories (the same ones listed in /etc/passwd) there are much easier ways of finding the totals. I wouldn't expect user home directories to be directly under the root directory, so maybe this is not what you are trying to do.
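For example, the home directories can be read straight out of /etc/passwd rather than guessed from a listing. A minimal sketch, assuming local accounts and that regular users have UID >= 1000 (that threshold is site-specific; adjust as needed):

```shell
# home_dirs: print the home directory of each regular-user entry in a
# passwd-format file ($1). UID >= 1000 is an assumption about this site.
home_dirs() {
    awk -F: '$3 >= 1000 { print $6 }' "$1"
}

# e.g. feed the list to a single du run and filter on 3 GB (in KB):
# home_dirs /etc/passwd | xargs du -sk | awk '$1 > 3*1024*1024' >> size.txt
```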
Is there any reason to believe doing the lookups in parallel will be faster? The performance limiter is probably going to be how fast the data can be retrieved from disk anyway.
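If you do want to measure whether parallelism helps, xargs -P (a GNU/BSD extension, not POSIX, so it may not exist on AIX) will run several du processes at once. A sketch only; the root path, job count, and limit here are assumptions:

```shell
# parallel_du: run up to $2 concurrent du processes over the directories
# under $1, printing the lines whose size (KB) exceeds $3.
parallel_du() {
    root=$1
    jobs=${2:-4}                         # concurrent du processes
    limit_kb=${3:-$((3 * 1024 * 1024))}  # default limit: 3 GB in KB
    printf '%s\n' "$root"/*/ |
        xargs -n 1 -P "$jobs" du -sk |
        awk -v limit="$limit_kb" '$1 > limit'
}

# Usage: parallel_du /home 4 >> size.txt
```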
It'd also probably be faster to pass all the directory names to one instance of du instead of running du once per user.
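Something along these lines, assuming the directories sit under /home (adjust to suit). du -g is AIX-specific; -k (kilobyte units) is portable, so the 3 GB limit becomes 3*1024*1024 KB:

```shell
# oversize: one du invocation for every directory under $1, instead of a
# du per user; print the lines whose size (KB) exceeds $2.
oversize() {
    du -sk "$1"/*/ | awk -v limit="${2:-$((3 * 1024 * 1024))}" '$1 > limit'
}

# Usage: oversize /home >> size.txt
```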
I note that matrixmadhan has used "ls -1" (number one) which makes more sense than the "ls -l" (letter ell) in the original post.
Because the original post contains "du -g" I wonder if this is an IBM AIX machine? i.e. one with a very limited command line length.