Finding directories with expression

Hi All,

I need your help finding directories that match a pattern.

I need to search the base directory for every directory named "mypattern".

example
-------

server1 - base directory

100 server1/ab_123456_1/mypattern
100 server1/ab_123456_2/mypattern
200 server1/ab_123457_1/mypattern
400 server1/ab_123457_3/mypattern

If "mypattern" is found in server1/ab_123456_1 and also in server1/ab_123456_2, then I want to run du -k on each mypattern directory, sum the sizes, and save the result as shown.

result.txt
---------
200k 123456/mypattern
600k 123457/mypattern

Appreciate your help in Advance.

Thanks
lxdorney

find * -type d -name mypattern | xargs -r du -sk | while read k n
  do
    $(( sum += k ))
  done
echo $sum 

Much appreciated, thanks.

find * -type d -name mypattern | xargs -r du -sk | while read k n 
do 
$(( sum += k ))  
done
echo $sum

I executed the above command and got these error messages:

./test.sh: line 3: 8: command not found
./test.sh: line 3: 12: command not found
./test.sh: line 3: 16: command not found
./test.sh: line 3: 24: command not found
./test.sh: line 3: 28: command not found
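
The "command not found" lines above are what bash prints when an arithmetic expansion is left alone on a command line: $(( sum += k )) expands to the new total (8, 12, 16, ...) and the shell then tries to run that number as a command. A minimal sketch of a corrected loop follows. sum_sizes is a hypothetical helper name, not something from this thread, and it assumes "size path" lines on standard input. Printing the total from inside the function also avoids losing sum when bash runs the right-hand side of a pipeline in a subshell.

```shell
# Hypothetical helper: add up the first column of "size path" lines.
sum_sizes() {
  sum=0
  while read -r k n
  do
    # Assign the result instead of leaving the bare expansion on the line.
    sum=$(( sum + k ))
  done
  # Print here, inside the same shell that ran the loop.
  echo "$sum"
}

# Possible usage (not run here):
#   find * -type d -name mypattern | xargs -r du -sk | sum_sizes
```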

But when I execute this:

find * -type d -name mypattern | xargs -r du -sk 

output:

8       server/ab_1234568_2/mypattern
4       server/ab_1234567_2/mypattern
4       server/ab_1234569_2/mypattern
8       server/ab_1234567_1/mypattern
4       server/ab_1234569_3/mypattern

Is there any way to group the directories that share the same number in their names?
8 server/ab_1234567_1/mypattern
4 server/ab_1234567_2/mypattern

4 server/ab_1234569_2/mypattern
4 server/ab_1234569_3/mypattern

the output would be

8   ab_1234569/mypattern    
12 ab_1234567/mypattern
awk -F"[ _]" '$0~p {a[$3]+=$1} END {for (i in a) print a[i]"k "i"/"p}' p="mypattern" server1 > result.txt
cat result.txt
200k 123456/mypattern
600k 123457/mypattern
awk -F"[ _]" '$0~p {a[$3]+=$1} END {for (i in a) print a[i]"k "i"/"p}' p="mypattern" server1 > result.txt
awk: cmd. line:1: fatal: cannot open file `server1' for reading (Inappropriate ioctl for device)

my apologies, server1 is a directory not a file

my process goes like this:

mkdir -p server1/ab_1234568_2/mypattern
mkdir -p server1/ab_1234567_2/mypattern
mkdir -p server1/ab_1234569_2/mypattern
mkdir -p server1/ab_1234567_1/mypattern
mkdir -p server1/ab_1234569_3/mypattern

create some files

cd server1
ls /etc/ -l > ab_1234567_1/mypattern/etc.txt
ls /etc/ -l > ab_1234567_2/mypattern/etc.txt
ls /etc/ -l > ab_1234569_3/mypattern/etc.txt
ls /etc/ -l > ab_1234569_2/mypattern/etc.txt
cd server1
du -k


4       ./ab_1234568_2/mypattern
8       ./ab_1234568_2
16      ./ab_1234567_2/mypattern
20      ./ab_1234567_2
16      ./ab_1234569_2/mypattern
20      ./ab_1234569_2
16      ./ab_1234567_1/mypattern
20      ./ab_1234567_1
16      ./ab_1234569_3/mypattern
20      ./ab_1234569_3
92      .

Is there any way to get a result like this?

32 1234567/mypattern
4 1234568/mypattern
36 1234569/mypattern

(one sum for each group of directories that share the same number)

really appreciate your help

I had assumed that the information was stored in a file named server1.
You can use this instead:

du -k | awk -F"[ _]+" '$0~p {a[$3]+=$1} END {for (i in a) print a[i]"k "i"/"p}' p="mypattern" > result.txt
cat result.txt
32k 1234567/mypattern
4k 1234568/mypattern
32k 1234569/mypattern

I also added a + to the field separator, since du seems to use several spaces as the separator.

Thank you so much.
After executing the script, I noticed that my output is different from yours.
Also, after adding an ab_1234567_9 directory, I expected its 16k to be added to the 36k shown for 2/mypattern/mypattern, giving 52k and so on, but it was not.

steps:
d_path=/root/server1
pattern=mypattern

  1. Search $d_path for all directories whose names match an expression like 'ab_[0-9]\{6,\}\_[0-9]\{0,\}'.
  2. Look for $pattern inside every directory that matches the expression, e.g. ab_1234567_9 or ab_12345673334_96.
  3. Run du -k on every $pattern found and sum the sizes of all $pattern directories that belong to the same expression.
  4. Trim the directory name so that ab_1234567_9, ab_1234567_19, ab_1234567_20 and so on all become 1234567. The final output would be:

total expression/pattern
32k 1234567/mypattern
4k 1234568/mypattern
32k 1234569/mypattern
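
The four steps above can be sketched as one function. This is only a sketch under stated assumptions: summarize is a made-up name, the expression is written as the extended regex ab_[0-9]{6,}_[0-9]* (my reading of the one in step 1), $pattern is treated as a plain name, and directory names contain no newlines.

```shell
# Hypothetical helper: summarize <base directory> <pattern directory name>
summarize() {
  d_path=$1
  pattern=$2
  # Steps 1-2: pattern directories whose parent looks like ab_<6+ digits>_<digits>
  find "$d_path" -type d -name "$pattern" |
    grep -E "/ab_[0-9]{6,}_[0-9]*/$pattern\$" |
    while IFS= read -r dir
    do
      # Step 3: size of this pattern directory in kilobytes
      size=$(du -sk "$dir" | cut -f1)
      # Step 4: .../ab_1234567_9/mypattern -> 1234567
      id=${dir%/*}       # drop the trailing /mypattern
      id=${id##*/ab_}    # drop everything up to and including ab_
      id=${id%%_*}       # drop the _9 suffix
      printf '%s %s\n' "$size" "$id"
    done |
    # Add up the sizes per number and print "<sum>k <number>/<pattern>"
    awk -v p="$pattern" '
      { sum[$2] += $1 }
      END { for (id in sum) print sum[id] "k " id "/" p }
    ' | sort -k2
}

# Possible usage (not run here):
#   summarize /root/server1 mypattern > result.txt
```

The actual sizes depend on the filesystem, so the totals may differ from the ones listed above.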

It would be nice if you could elaborate or explain.

ls
result.txt  server1  server2  test.sh
pwd
/root
du -k | awk -F"[ _]+" '$0~p {a[$3]+=$1} END {for (i in a) print a[i]"k "i"/"p}' p="mypattern" > result.txt
cat  result.txt
36k 2/mypattern/mypattern
16k 3/mypattern/mypattern
16k 1/mypattern/mypattern
cp -r ab_1234567_2 ab_1234567_9
ls
ab_1234567_1  ab_1234567_2  ab_1234567_9  ab_1234568_2  ab_1234569_2   ab_1234569_3  result.txt
du -k | awk -F"[ _]+" '$0~p {a[$3]+=$1} END {for (i in a) print a[i]"k "i"/"p}' p="mypattern" > result.txt
cat result.txt
36k 2/mypattern/mypattern
16k 3/mypattern/mypattern
16k 1/mypattern/mypattern
16k 9/mypattern/mypattern
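
A likely explanation for the 2/mypattern/mypattern keys above: du separates the size from the path with a tab, not spaces. With -F"[ _]+" the first field is then the size glued to the start of the path, and $3 ends up being the suffix plus the directory name. That fits the numbers, since the three _2 directories add up to 4 + 16 + 16 = 36. Below is a minimal sketch of the same one-liner with a tab added to the separator set. sum_by_id is a hypothetical name, and the sketch assumes no other underscore appears in the path before the number.

```shell
# Hypothetical helper: reads du -k output on stdin, groups by the number after "ab_".
sum_by_id() {
  # Split on runs of spaces, tabs or underscores, so that
  # "16<TAB>./ab_1234567_2/mypattern" gives $1=16 and $3=1234567.
  awk -F'[ \t_]+' -v p="$1" '
    $0 ~ p { a[$3] += $1 }
    END { for (i in a) print a[i] "k " i "/" p }
  ' | sort -k2
}

# Possible usage (not run here):
#   du -k | sum_by_id mypattern > result.txt
```
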
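find * -type d -name mypattern | sed '
  s/_[0-9]*\/mypattern$//
 ' | sort -u | while read p
 do
  # -d lists the matching directories themselves, not their contents
  ls -d ${p}_[0-9]*/mypattern | xargs -r du -sk | {
   sum=0
   while read k n
    do
     (( sum += k ))
    done
   # print inside the braces: bash runs this side of the pipe in a subshell
   echo $p $sum
  }
 done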

Find the mypattern dirs, remove any parent suffix and the child name, sort unique, and for each, start the sum at zero, find the size of all files in the family of directories and add to sum, and print out a summary line on this pattern.

Is there any way to show only the directories that do not contain the pattern directory?

find server/ \( ! -name mypattern  -prune \)
result:
server/1234567_1/mypattern
server/1234567_2/mypattern
server/1234567_3/mypattern
server/1234567_4/
server/1234567_5/mypattern

The result should be only:
server/1234567_4/

thanks

awk '!/mypattern/'
awk '$0!~p' p="mypattern"
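
The awk filters above drop every line that mentions mypattern. If the goal is instead to test each directory under the base for a missing mypattern child, a plain loop is one option. This is a sketch only: list_missing is a made-up name, and it checks only the directories directly under the base.

```shell
# Hypothetical helper: list_missing <base directory> <pattern directory name>
list_missing() {
  base=${1%/}      # tolerate a trailing slash such as "server/"
  pattern=$2
  for d in "$base"/*/
  do
    [ -d "$d" ] || continue          # the glob matched nothing
    # Print the directory only when it has no <pattern> subdirectory.
    [ -d "$d$pattern" ] || printf '%s\n' "$d"
  done
}

# Possible usage (not run here):
#   list_missing server/ mypattern
```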

Your script works now, but I noticed a problem when there are directories like this:

10k ab_12444_1/mypattern
10k ab_12444_2/mypattern
20k ab_12444/mypattern
cat result.txt
20k 12444/mypattern

instead of the expected
40k 12444/mypattern

is there any fix for this?
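
One way to fold ab_12444 into the same total as ab_12444_1 and ab_12444_2 is to treat the numeric suffix as optional when deriving the key. This is a sketch, assuming tab-separated du -sk output on standard input and parent directories that start with ab_. group_by_family is a made-up name.

```shell
# Hypothetical helper: reads "size<TAB>path" lines, groups by the number in the parent directory.
group_by_family() {
  awk -F'\t' -v p="$1" '
    {
      n = split($2, part, "/")
      # Only keep paths whose last component is the pattern directory.
      if (n < 2 || part[n] != p) next
      id = part[n-1]             # e.g. ab_12444_1 or ab_12444
      sub(/^ab_/, "", id)        # drop the prefix
      sub(/_[0-9]*$/, "", id)    # drop the suffix if there is one
      sum[id] += $1
    }
    END { for (id in sum) print sum[id] "k " id "/" p }
  ' | sort -k2
}

# Possible usage (not run here):
#   find * -type d -name mypattern | xargs -r du -sk | group_by_family mypattern
```

With the three sizes quoted above (10, 10 and 20) this prints 40k 12444/mypattern.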

thanks

I removed a couple of bugs in my script above.

Where is the script, sir?
Is it this one?

find * -type d -name mypattern | sed '
  s/.*_[0-9]*\/mypattern$//
 ' | sort -u | while read p
 do
  sum=0
  ls ${p}_[0-9]/mypattern | xargs -r du -sk | while read k n
   do
    (( sum += k ))
   done
  echo $p $sum
 done

Thanks

Yes, above -- I fixed it some more and added a narrative.
