Translate bash mathematical calculation to awk

The code below is very useful for calculating the median and quartiles. However, I would really like to translate it to awk without having to write to any external file:

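#!/bin/sh

filename="tmp.txt"

# Sort numerically into a temporary file (expects one number per line)
sort -n "$1" > "$filename"

# Line count; reading from stdin keeps the file name out of wc's output
rows=$(wc -l < "$filename")
q2=$(echo "($rows+1)/2" | bc)

q1=$(echo "$q2 / 2" | bc)
q3=$(echo "3 * $q1" | bc)
echo "Q1= $(head -n "$q1" "$filename" | tail -n 1)"
echo "Q2= $(head -n "$q2" "$filename" | tail -n 1)"
echo "Q3= $(head -n "$q3" "$filename" | tail -n 1)"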

Is this possible?

Content of the file that will be passed as $1:

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 
2 2 2 2 2 2 2 2 2 3 4 4 5 30 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 
36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 36 
36 37 37 37 37 37 37 37 37 38 38 38 38 38 38 38 38 38 38 38 39 39 39 39 39 39 39 39 40 44 668 767 792
 795 803 805 805 805 805 805 805 805 806 806 806 806 806 806 806 806 807 807 807 808 863 868 875 883
 884 903 910 912 923 934 952 954 968 971 973 983 999 1007 1008 1017 1022 1044 1060 1125 6383 7275 9614 9629

What output are you expecting from this input?

What OS are you using?

If you want a bash script, why are you invoking /bin/sh instead of /bin/bash to interpret your script?

Why are you sorting a 7-line file that is already sorted?

I only want an awk solution. I found the bash script posted in this thread online while I was looking for a way to identify outliers in a range of numbers, such as the ones I posted above.

I figured the logic is already there in that bash script; I just need to translate it to awk.

It could certainly be reworked into better shell scripting that does not need an intermediate file.
Reworking it in just plain awk is difficult since standard awk doesn't have its own sort function (GNU awk's asort() is an extension).
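
For reference, the sort can be written by hand inside awk. The following is only a minimal sketch, not a tested replacement: quartiles_awk is a made-up function name, the insertion sort is my own addition, and it counts individual numbers rather than lines (the original script counts lines). The index arithmetic is copied from the original script, including bc's truncating division.

```shell
#!/bin/sh
# Sketch: quartiles in a single awk process, no temporary file.
# quartiles_awk is a hypothetical helper name, not from the thread.
quartiles_awk() {
  awk '
    # Collect every whitespace-separated field, so the numbers may be
    # spread over several lines as in the sample file.
    { for (i = 1; i <= NF; i++) v[++n] = $i + 0 }
    END {
      if (n == 0) exit
      # Insertion sort, since POSIX awk has no built-in sort.
      for (i = 2; i <= n; i++) {
        x = v[i]
        for (j = i - 1; j >= 1 && v[j] > x; j--) v[j + 1] = v[j]
        v[j + 1] = x
      }
      # Same index arithmetic as the original script (bc truncates).
      q2 = int((n + 1) / 2)
      q1 = int(q2 / 2)
      q3 = 3 * q1
      print "Q1= " v[q1]
      print "Q2= " v[q2]
      print "Q3= " v[q3]
    }
  ' "$@"
}
```

It can be called as quartiles_awk file, or fed from a pipe. The hand-written sort is quadratic, so for large inputs a pipeline through sort -n is probably the more practical route.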
How about this?

mute@thedoctor:~$ cat script
#!/bin/sh
tr -s ' \t\n' '\n' < "$1" | sort -n | awk '
  { a[NR]=$1 }
  END {
    print "Q1=" a[int(NR/4)]
    print "Q2=" a[int(NR/2)]
    print "Q3=" a[int(3*NR/4)]
}'

mute@thedoctor:~$ ./script file
Q1=1
Q2=36
Q3=38

From what I have read about quartiles, this should be extended to take the average of the two neighboring values when a quartile position falls between two elements. For this data set, the answers match.
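
A rough sketch of that averaging, assuming the "median of halves" convention (Q2 is the median, Q1 and Q3 are the medians of the lower and upper halves, with the middle value left out of both halves when the count is odd). This is only one of several quartile definitions in use, and quartiles_avg and median are names I made up for the example.

```shell
#!/bin/sh
# Sketch: quartiles as medians of halves, averaging the two middle
# values when a position falls between two elements.
# quartiles_avg is a hypothetical helper name, not from the thread.
quartiles_avg() {
  # cat lets the function read either the named files or stdin.
  cat "$@" | tr -s ' \t\n' '\n' | sort -n | awk '
    # Median of the sorted slice a[lo..hi].
    function median(lo, hi,    len, mid) {
      len = hi - lo + 1
      mid = lo + int(len / 2)
      return (len % 2) ? a[mid] : (a[mid - 1] + a[mid]) / 2
    }
    NF { a[++n] = $1 }           # skip any empty line left by tr
    END {
      if (n < 2) exit
      half = int(n / 2)          # size of each half
      print "Q1=" median(1, half)
      print "Q2=" median(1, n)
      print "Q3=" median(n - half + 1, n)
    }'
}
```

With eight values 1 through 8 this prints Q1=2.5, Q2=4.5 and Q3=6.5, since every quartile position falls between two elements. Other conventions (for example, including the median in both halves) can give slightly different numbers.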
