Hi All,
Been struggling with sorting values in descending order for an associative awk array.
When I iterate through the awk array (mem) without asort, it has pid -> kb format like this:
pid kb
123 2048
315 1024
136 4096
Background: Need to sort based on the second column (memory) in reverse order to get a list of PIDs and memory usage from highest to lowest usage. The PID will be used later in "ps -ef" to check which process corresponds to that PID for each iteration.
Expected output
136 4096
123 2048
315 1024
Code:
#!/bin/bash
proc_ids=$(grep -e AnonHugePages /proc/*/smaps | grep -v "0 kB" | awk -F'/' '{proc[$3]++} END {for (i in proc) print i}')
for proc_id in $proc_ids; do
grep -e AnonHugePages /proc/"$proc_id"/smaps | gawk -v pid=$proc_id 'NF>0 && $(NF-1) > 4 {mem[pid]+=$(NF-1)} {num= asort(mem, dest)} END {for (i=1;i<=num;i++) print dest[i]}'
done
INPUT
[root@xxx~]# grep -e AnonHugePages /proc/*/smaps | awk '$(NF-1) > 4{print}'
/proc/1080/smaps:AnonHugePages: 4096 kB
/proc/1080/smaps:AnonHugePages: 2048 kB
/proc/15259/smaps:AnonHugePages: 2048 kB
/proc/15259/smaps:AnonHugePages: 2048 kB
/proc/1935/smaps:AnonHugePages: 2048 kB
/proc/25459/smaps:AnonHugePages: 2048 kB
/proc/25459/smaps:AnonHugePages: 2048 kB
/proc/25459/smaps:AnonHugePages: 2048 kB
/proc/2613/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/2616/smaps:AnonHugePages: 2048 kB
/proc/27807/smaps:AnonHugePages: 2048 kB
/proc/27807/smaps:AnonHugePages: 2048 kB
/proc/27808/smaps:AnonHugePages: 2048 kB
/proc/27808/smaps:AnonHugePages: 2048 kB
/proc/28130/smaps:AnonHugePages: 2048 kB
/proc/28576/smaps:AnonHugePages: 28672 kB
/proc/28576/smaps:AnonHugePages: 2048 kB
/proc/28576/smaps:AnonHugePages: 2048 kB
/proc/28576/smaps:AnonHugePages: 4096 kB
/proc/28576/smaps:AnonHugePages: 2048 kB
/proc/28576/smaps:AnonHugePages: 25165824 kB
/proc/28576/smaps:AnonHugePages: 25165824 kB
/proc/29577/smaps:AnonHugePages: 2048 kB
/proc/29577/smaps:AnonHugePages: 2048 kB
/proc/29726/smaps:AnonHugePages: 4096 kB
/proc/29726/smaps:AnonHugePages: 2048 kB
/proc/3339/smaps:AnonHugePages: 2048 kB
/proc/3341/smaps:AnonHugePages: 2048 kB
/proc/3356/smaps:AnonHugePages: 2048 kB
/proc/3357/smaps:AnonHugePages: 2048 kB
/proc/3357/smaps:AnonHugePages: 2048 kB
/proc/3988/smaps:AnonHugePages: 6144 kB
/proc/3988/smaps:AnonHugePages: 6144 kB
/proc/3988/smaps:AnonHugePages: 2048 kB
/proc/3988/smaps:AnonHugePages: 4096 kB
/proc/4728/smaps:AnonHugePages: 6144 kB
/proc/4728/smaps:AnonHugePages: 6144 kB
/proc/4728/smaps:AnonHugePages: 6144 kB
/proc/6280/smaps:AnonHugePages: 2048 kB
/proc/6280/smaps:AnonHugePages: 2048 kB
/proc/6280/smaps:AnonHugePages: 2048 kB
/proc/6334/smaps:AnonHugePages: 2048 kB
/proc/6334/smaps:AnonHugePages: 2048 kB
/proc/6334/smaps:AnonHugePages: 2048 kB
/proc/6334/smaps:AnonHugePages: 2048 kB
/proc/652/smaps:AnonHugePages: 2048 kB
/proc/8042/smaps:AnonHugePages: 2048 kB
/proc/8042/smaps:AnonHugePages: 2048 kB
/proc/975/smaps:AnonHugePages: 10240 kB
CURRENT OUTPUT
grep -e AnonHugePages /proc/"$proc_id"/smaps | awk -v pid=$proc_id '$(NF-1) > 4 {mem[pid]+=$(NF-1)} END {for (pid in mem) print pid,mem[pid]}'
8042 4096
3341 2048
4728 18432
29726 6144
27807 4096
EXPECTED OUTPUT
4728 18432
29726 6144
8042 4096
27807 4096
3341 2048
Thanks.
@sand1234, asort is a GNU awk function. Please check whether you have GNU awk on your system. If you don't, you could pipe the output of the awk command to sort and sort on the 2nd field. Try it out and let us know how it goes.
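For illustration, the pipe-to-sort fallback could look roughly like the sketch below. The helper name sort_by_kb and the sample pairs are made up for the example; the input is assumed to be "pid kb" pairs like the ones shown in the question.

```shell
#!/bin/sh
# Sketch only: sum the kB column per PID with plain awk, then let sort(1)
# order the result. sort_by_kb is a hypothetical helper name.
sort_by_kb() {
    awk 'NF >= 2 { mem[$1] += $2 } END { for (p in mem) print p, mem[p] }' |
        sort -k2,2nr    # column 2, numeric, descending
}

# Example with the sample pairs from the question:
printf '%s\n' '123 2048' '315 1024' '136 4096' | sort_by_kb
```

This works with any POSIX awk, at the cost of the ordering living outside the awk program.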
Thanks,
R. Singh
Hi Ravinder,
I tried it but got the following error. Could you please check the code?
gawk: cmd. line:1: (FILENAME=- FNR=8) fatal: sort comparison function `descending' is not defined
Thanks.
Also, I don't want to pipe to sort because I want to reuse the sorted data. I'd prefer to do it within awk.
@sand1234, thank you for editing your question, but it is still not fully clear. Could you please add a sample of the expected output, with the logic for how to get it, to your question? Then it will be clear and I can try to help you more.
Thanks,
R. Singh
sand1234:
$(NF-1)
If there are blank lines, this will result in an error: on a blank line NF is 0, so $(NF-1) refers to field -1. It is necessary to exclude them.
NF > 0 && $(NF-1) > 4
or
NF && $(NF-1) > 4
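A quick way to see the guard at work, as a sketch with made-up input that contains a blank line (the helper name sum_kb is hypothetical):

```shell
#!/bin/sh
# Sketch only: sum the next-to-last field while skipping blank lines.
# NF is tested first, so on a blank line (NF == 0) the && short-circuits
# before $(NF-1) is ever evaluated.
sum_kb() {
    awk 'NF && $(NF-1) > 4 { total += $(NF-1) } END { print total + 0 }'
}

printf 'AnonHugePages:      2048 kB\n\nAnonHugePages:      4096 kB\n' | sum_kb
```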
Hi,
I updated the code, added your suggestion and removed the quotes from descending.
The code doesn't throw any error now, but the first column (indices) is not displayed and the second column is still not sorted.
grep -e AnonHugePages /proc/"$proc_id"/smaps | gawk -v pid=$proc_id 'NF>0 && $(NF-1) > 4 {mem[pid]+=$(NF-1)} {num= asort(mem, copy, descending)} END {for (i=1;i<=num;i++) print copy[i],mem[copy[i]]}'
4096
2048
18432
6144
4096
18432
4096
18432
2048
2048
4096
6144
2048
6144
6144
4096
2048
50370560
8192
2048
4096
2048
@sand1234, very good that you have shown us your efforts in the form of code. Could you please also add a sample of the input and the expected output to your question, as requested before? Once the question is clear, we can help you more.
Thanks,
R. Singh
Hi,
It looks like asort is not actually sorting the values of the array "copy". When I print copy[i], I get the values shown above, which are not in descending order. I tried removing descending from asort, but the values still don't get sorted.
Thanks.
Looks like you're using gawk, and you're trying to sort the array based on the numeric value of the cells (not the indices). Try using the gawk-specific extension that forces a certain way of 'navigating' arrays.
grep -e AnonHugePages /proc/"$proc_id"/smaps | \
gawk -v pid=$proc_id '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
$(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (pid in mem) print pid,mem[pid]
}'
You probably can get rid of grep and do it all with awk - I'll leave it up to you.
Hi vgersh99,
Thanks for looking at this. I've confirmed that gawk with PROCINFO works on my system.
# gawk '
> BEGIN {
> PROCINFO["sorted_in"]="@val_num_desc"
> a[4] = 2048
> a[3] = 102353626
> for (i in a)
> print i, a[i]
> }'
3 102353626
4 2048
However, the script doesn't work. The second column is not sorted, in neither ascending nor descending order. Do you have any tips for debugging this?
# ./testhuge.sh
8042 4096
3341 2048
4728 18432
29726 6144
27807 4096
3988 18432
27808 4096
2616 18432
28130 2048
1935 2048
15259 4096
6280 6144
3356 2048
25459 6144
1080 6144
3357 4096
3339 2048
28576 50366464
6334 8192
Thanks.
Relevant code:
for proc_id in $proc_ids; do
grep -e AnonHugePages /proc/"$proc_id"/smaps | \
gawk -v pid=$proc_id '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
$(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (pid in mem) print pid,mem[pid]
}'
done
Hmmm.... strange.
I've mimicked your table of PIDs in a file:
8042 4096
3341 2048
4728 18432
29726 6144
27807 4096
3988 18432
27808 4096
2616 18432
28130 2048
1935 2048
15259 4096
6280 6144
3356 2048
25459 6144
1080 6144
3357 4096
3339 2048
28576 50366464
6334 8192
and ran:
gawk '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
{mem[$1]+=$2}
END {
for (pid in mem) print pid,mem[pid]
}' tablePIDs.txt
which yielded the following expected output:
28576 50366464
3988 18432
2616 18432
4728 18432
6334 8192
29726 6144
6280 6144
25459 6144
1080 6144
27807 4096
27808 4096
3357 4096
15259 4096
8042 4096
3341 2048
3339 2048
1935 2048
3356 2048
28130 2048
So the implementation is correct.
Could you attach the smaps files for a couple of PIDs, please?
Also, could you change:
for (pid in mem) print pid,mem[pid]
to
for (i in mem) print i,mem[i]
Hi vgersh99,
Adding bash -x debug output, and info about smaps.
+ proc_ids='8042
3341
4728
29726
27807
3988
27808
2616
28130
1935
15259
6280
3356
25459
1080
3357
3339
28576
6334
652
29577
2613'
+ for proc_id in '$proc_ids'
+ gawk -v pid=8042 '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
$(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (i in mem) print i,mem[i]
}'
+ grep -e AnonHugePages /proc/8042/smaps
8042 4096
+ for proc_id in '$proc_ids'
+ gawk -v pid=3341 '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
$(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (i in mem) print i,mem[i]
}'
+ grep -e AnonHugePages /proc/3341/smaps
3341 2048
+ for proc_id in '$proc_ids'
+ grep -e AnonHugePages /proc/4728/smaps
+ gawk -v pid=4728 '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
$(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (i in mem) print i,mem[i]
}'
4728 18432
# grep -e AnonHugePages /proc/8042/smaps
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 2048 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 2048 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
# grep -e AnonHugePages /proc/3341/smaps
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 2048 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
# grep -e AnonHugePages /proc/4728/smaps
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 6144 kB
AnonHugePages: 0 kB
AnonHugePages: 6144 kB
AnonHugePages: 0 kB
AnonHugePages: 6144 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
AnonHugePages: 0 kB
Complete script:
#!/bin/bash
proc_ids=$(grep -e AnonHugePages /proc/*/smaps | grep -v "0 kB" | awk -F'/' '{proc[$3]++} END {for (i in proc) print i}')
for proc_id in $proc_ids; do
grep -e AnonHugePages /proc/"$proc_id"/smaps | \
gawk -v pid=$proc_id '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
$(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (i in mem) print i,mem[i]
}'
done
You're iterating over a list of PIDs and invoking awk once per PID, so each awk run is trying to sort an array that holds only ONE entry, the one for the selected PID.
The order of the output is simply the order of the PIDs in $proc_ids: you're not sorting ALL the PIDs together, but each PID individually.
You'll need to rethink your approach: pass ALL the files for ALL the PIDs to a single awk invocation, add up the memory for each PID, and output ALL the PID/memory pairs from one END block.
Thanks vgersh99. I will think about it.
Here's something to start with.
assuming
$ find
.
./proc
./proc/3341
./proc/3341/smaps
./proc/4728
./proc/4728/smaps
./proc/8042
./proc/8042/smaps
./sand.sh
with smaps files containing your sample data.
And sand.sh containing the following (dropping the shell's grep):
#!/bin/bash
#set -x
# PROCINFO["sorted_in"] is a gawk extension, so call gawk explicitly
gawk '
BEGIN {
PROCINFO["sorted_in"]="@val_num_desc"
}
FNR==1 {
n=split(FILENAME, f, "/")
pid=f[n-1]
}
/^AnonHugePages/ && $(NF-1) > 4 {mem[pid]+=$(NF-1)}
END {
for (i in mem) print i,mem[i]
}' $(find . -type f -name 'smaps')
running sand.sh yields:
4728 18432
8042 4096
3341 2048