Help with awk percentage calculation from a file

I have a file, say test, with the following contents:

Folder Name  Total space  Space used
/test/test1   500.1GB       112.0 GB
/test/test2   3.2 TB          5TB
/test/test3   3TB              100GB

I need to calculate the usage percentage for each row from total space and space used, and copy the rows that have crossed 90% into a new file (test1). Note that some values are in GB and some in TB. Kindly help.

Hello venkitesh,

Welcome to the forums. Could you please try the following and let me know if it helps?

awk '{TOT=$2+0;gsub(/[[:digit:]]|\./,X,$2);USED=$3+0;gsub(/[[:digit:]]|\./,X,$3);if(($2 ~ /GB/ || $2 ~ /TB/)&& TOT>=USED){VAL=USED/TOT * 100};if($2 ~ /TB/ && $3 ~ /GB/ && TOT>=USED){VAL=USED/(TOT * 1024) * 100};if(VAL>20){print $1" filesystem has more disk space used than threshold."};VAL=""}'  Input_file

Output will be as follows.

/test/test1 filesystem has more disk space used than threshold.

In the above code I have used 20 as the threshold; you can set it as per your requirement. Note that it only handles filesystem capacities given in TB or GB. I hope this helps.
EDIT: Adding a non-one-liner form of the solution too.

awk '{
        TOT = $2 + 0                        # numeric part of total space
        gsub(/[[:digit:]]|\./, X, $2)       # strip digits and dots, leaving the unit (X is unset, so empty)
        USED = $3 + 0                       # numeric part of space used
        gsub(/[[:digit:]]|\./, X, $3)
        if (($2 ~ /GB/ || $2 ~ /TB/) && TOT >= USED) {
                VAL = USED / TOT * 100
        }
        if ($2 ~ /TB/ && $3 ~ /GB/ && TOT >= USED) {
                VAL = USED / (TOT * 1024) * 100
        }
        if (VAL > 20) {
                print $1 " filesystem has more disk space used than threshold."
        }
        VAL = ""
}' Input_file
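The code above hard-codes the threshold and compares the bare numbers before looking at the units, so a row such as 3TB total with 100GB used is skipped. One possible variation, only a sketch and not the code posted above, converts both fields to GB first and takes the threshold as a parameter. The names over_threshold, to_gb and limit are made up for this example, and 1 TB is assumed to be 1024 GB as in the code above.

```shell
# Hypothetical helper: reads rows on stdin, $1 is the threshold in percent.
over_threshold() {
    awk -v limit="$1" '
    # Convert a value such as 3.2TB or 500.1GB to a number of GB.
    function to_gb(s,    n) {
        n = s + 0                     # leading numeric part of the string
        if (s ~ /TB$/) n *= 1024      # assumption: 1 TB = 1024 GB
        return n
    }
    {
        total = to_gb($2)
        used  = to_gb($3)
        # Rows without a numeric total (for example a header) are skipped.
        if (total > 0 && used / total * 100 > limit)
            print $1 " filesystem has more disk space used than threshold."
    }'
}

# Sample rows invented for illustration; only test2 is above 90%.
printf '%s\n' \
    '/test/test1 500.1GB 112.0GB' \
    '/test/test2 3.2TB 3.0TB' \
    '/test/test3 3TB 100GB' | over_threshold 90
```

To copy the matching rows into a new file as asked in the first post, the print statement could print $0 instead of the message, with the output redirected to test1.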
 

Thanks,
R. Singh


Hi Ravinder,

Thanks for the script.

Is it possible to print the output like this?

test1(removing the first slash)   500gb 100gb 98%

Hello venkitesh,

Could you please try the following and let me know if it helps?

awk '{TOT=$2+0;gsub(/[[:digit:]]|\./,X,$2);USED=$3+0;gsub(/[[:digit:]]|\./,X,$3);if(($2 ~ /GB/ || $2 ~ /TB/)&& TOT>=USED){VAL=USED/TOT * 100};if($2 ~ /TB/ && $3 ~ /GB/ && TOT>=USED){VAL=USED/(TOT * 1024) * 100};if(VAL>20){sub(/.*\//,X,$1);print $1,TOT $2,USED $3,VAL"%"};VAL=""}'  Input_file
 

Output will be as follows.

test1 500.100.1GB 11212.0GB 22.3955%
 

Adding a non-one-liner form of the solution too.

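awk '{
        TOT = $2 + 0                        # numeric part of total space
        gsub(/[[:digit:]]|\./, X, $2)       # strip digits and dots, leaving the unit (X is unset, so empty)
        USED = $3 + 0                       # numeric part of space used
        gsub(/[[:digit:]]|\./, X, $3)
        if (($2 ~ /GB/ || $2 ~ /TB/) && TOT >= USED) {
                VAL = USED / TOT * 100
        }
        if ($2 ~ /TB/ && $3 ~ /GB/ && TOT >= USED) {
                VAL = USED / (TOT * 1024) * 100
        }
        if (VAL > 20) {
                sub(/.*\//, X, $1)          # keep only the last path component
                print $1, TOT $2, USED $3, VAL "%"
        }
        VAL = ""
}' Input_file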
 

Thanks,
R. Singh

HI ravinder,

Is there any way we can keep TB and GB as they are instead of converting everything to GB?

Some values are coming out like this: 50.010.01GB.

Thanks

Hello venkitesh,

No, it is not touching the values from Input_file. It shows TB/GB exactly as given in Input_file. Let's say we have the following Input_file.

cat Input_file
/test/test1 500.1GB 112.0GB
/test/test2 3.2TB 1.3TB
/test/test3 3TB 100GB

Then the following is the code.

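awk '{
        TOT = $2 + 0                        # numeric part of total space
        gsub(/[[:digit:]]|\./, X, $2)       # strip digits and dots, leaving the unit (X is unset, so empty)
        USED = $3 + 0                       # numeric part of space used
        gsub(/[[:digit:]]|\./, X, $3)
        if (($2 ~ /GB/ || $2 ~ /TB/) && TOT >= USED) {
                VAL = USED / TOT * 100
        }
        if ($2 ~ /TB/ && $3 ~ /GB/ && TOT >= USED) {
                VAL = USED / (TOT * 1024) * 100
        }
        if (VAL > 20) {
                sub(/.*\//, X, $1)          # keep only the last path component
                print $1 FS TOT $2 FS USED $3 FS VAL "%"
        }
        VAL = ""
}' Input_file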

Output will be as follows.

test1 500.1GB 112GB 22.3955%
test2 3.2TB 1.3TB 40.625%
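
A variant that leaves the fields untouched may sidestep doubled values such as 500.100.1GB, because nothing has to be stitched back together for printing: the number is read with $2 + 0 and the unit is tested on the original string. This is only a sketch under a few assumptions: report_usage and limit are names invented here, 1 TB is taken as 1024 GB as in the code above, numbers and units are not separated by spaces, and the percentage is rounded to one decimal.

```shell
# Hypothetical helper: reads rows on stdin, $1 is the threshold in percent.
report_usage() {
    awk -v limit="$1" '
    {
        # Numeric prefix of each field, converted to GB where the unit is TB.
        total = $2 + 0; if ($2 ~ /TB$/) total *= 1024
        used  = $3 + 0; if ($3 ~ /TB$/) used  *= 1024
        if (total <= 0) next              # skip headers and malformed rows
        pct = used / total * 100
        if (pct > limit) {
            name = $1
            sub(/.*\//, "", name)         # keep only the last path component
            # $2 and $3 are printed exactly as they appear in the input.
            printf "%s %s %s %.1f%%\n", name, $2, $3, pct
        }
    }'
}

printf '%s\n' \
    '/test/test1 500.1GB 112.0GB' \
    '/test/test2 3.2TB 1.3TB' \
    '/test/test3 3TB 100GB' | report_usage 20
# prints:
# test1 500.1GB 112.0GB 22.4%
# test2 3.2TB 1.3TB 40.6%
```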

Thanks,
R. Singh

HI ravinder,

Please find below the output I am getting.

Original file content:

/test/test1/test1       200GB    192.7GB

After applying the awk command to this file, the output comes out like this:

test1        20000GB  192.792.7GB  96.35%

If I change the input file to the same content as above, it comes out like this:

test1  500.100.1GB  11212.0GB  22.3955%

Hello venkitesh,

I had assumed that there are no spaces between the numeric values and the GB/TB strings, because your very first Input_file was inconsistent on this point. If there is no space between a value and its TB/GB unit, this command should work. Let me know if you have any queries.
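
If the input cannot be fixed at the source, a small pre-processing step could remove the space between a number and its unit before the file reaches the command above. The following is a minimal sketch, assuming the only units are GB and TB in upper case; join_units is a name made up for this example.

```shell
# Hypothetical helper: rewrites "112.0 GB" as "112.0GB" on every line of stdin.
join_units() {
    awk '
    {
        line = $0
        # Find a digit, then blanks, then GB or TB; drop the blanks in between.
        while (match(line, /[0-9][ \t]+[GT]B/)) {
            # RSTART is the digit; RSTART + RLENGTH - 2 is the G or T.
            line = substr(line, 1, RSTART) substr(line, RSTART + RLENGTH - 2)
        }
        print line
    }'
}

printf '%s\n' '/test/test1 500.1GB 112.0 GB' | join_units
# prints: /test/test1 500.1GB 112.0GB
```

The cleaned output could then be saved to a new file and used as Input_file.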

Thanks,
R. Singh

If you look at the links at the bottom left of this page (under "More UNIX and Linux Forum Topics You Might Find Helpful"), you might find very useful hints. For instance, this post applied to your problem in post#1 immediately yields

Folder Name  Total space  Space used	percent
/test/test1   500.1GB       112.0GB	 22.40%
/test/test2   3.2TB          5TB	 156.25%
/test/test3   3TB              100GB	 3.26%

It requires that numbers and units NOT be separated by spaces, though. 5TB used on a 3.2TB disk is quite ambitious, isn't it?
Searching these forums may yield further proposals and hints...
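
The code from the linked post is not reproduced here. A table like the one above could be produced along the following lines; this is a sketch only, not the code from that post. It assumes the first line is a header, that numbers and units are not separated by spaces, and that 1 TB equals 1024 GB. The names add_percent and to_gb are invented for this example.

```shell
# Hypothetical helper: appends a tab-separated percent column to each row of stdin.
add_percent() {
    awk '
    # Convert a value such as 3.2TB or 500.1GB to a number of GB.
    function to_gb(s,    n) {
        n = s + 0
        if (s ~ /TB$/) n *= 1024          # assumption: 1 TB = 1024 GB
        return n
    }
    NR == 1 { print $0 "\tpercent"; next }    # assumes line 1 is the header
    {
        total = to_gb($2)
        used  = to_gb($3)
        if (total > 0)
            printf "%s\t%.2f%%\n", $0, used / total * 100
        else
            print                          # leave unparseable rows unchanged
    }'
}

printf '%s\n' \
    'Folder Name  Total space  Space used' \
    '/test/test1   500.1GB       112.0GB' \
    '/test/test2   3.2TB          5TB' \
    '/test/test3   3TB              100GB' | add_percent
```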

HI Ravinder

worked as expected

thanks a ton for your help