Thanks for your comments, and sorry for my late reply and for not explaining clearly enough.
What I was trying to do is calculate the Pearson correlation between median gene expression and GC content for my research.
I did large parts of the project with bash scripts in Anaconda, which is why I wanted to keep using bash. That turned out to be challenging for me, so I have already used Python and an online tool instead. I am grateful for your comments, they were really helpful :slight_smile: .
Also, I followed the correlation formula and did it step by step with awk as shown here. It is not very nice and it is very long, but it worked for me:
awk -F, '{print $1 "," $2 "," $3 "," $2 * $3 "," $2 * $2 "," $3 * $3}' input > output
awk -F',' '{ for (i=2;i<=NF;i++) sum[i]+=$i; n=NF } END { for (i=2;i<=n;i++) printf("%f ", sum[i]) }' output > output1
sed -e 's/\s\+/,/g' output1 > output2
awk -F',' -v factor=157364 '{print $1 "," $2 "," $3 "," $4 "," $5 "," $3 * factor "," $4 * factor "," $5 * factor "," $1 * $2 "," $1 * $1 "," $2 * $2}' output2 > correlation_output
awk -F"," '{printf ("%s,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.5f,%2.3f\n", $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$6-$9,$7-$10,$8-$11)}' correlation_output > correlation_parameter.txt
awk -F"," '{print $0 "," $13 * $14}' correlation_parameter.txt > correlation_parameter_modified.txt
awk -F"," '{r=$15; print $0 "," sqrt(r)}' correlation_parameter_modified.txt > last_parameter.txt
awk -F',' '{print "R=" $12/$16 }' last_parameter.txt > correlation_value.txt
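For anyone who finds this later: the same steps can probably be collapsed into a single awk pass, so no intermediate files are needed and the row count does not have to be passed in by hand. This is only a sketch, not something I ran on the real data. It assumes the input is comma-separated with no header line, an ID in column 1 and the two variables in columns 2 and 3; the function name pearson is made up for this example.

```shell
# Hypothetical helper: Pearson r between columns 2 and 3 of a CSV file.
# Assumes no header line and a comma as the field separator.
# Formula: r = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
pearson() {
    awk -F',' '
        {
            n++                     # number of rows read so far
            sx  += $2               # sum of x
            sy  += $3               # sum of y
            sxy += $2 * $3          # sum of x*y
            sxx += $2 * $2          # sum of x^2
            syy += $3 * $3          # sum of y^2
        }
        END {
            num = n * sxy - sx * sy
            d   = (n * sxx - sx * sx) * (n * syy - sy * sy)
            # d is zero for an empty file or a constant column: r is undefined
            if (d <= 0) { print "R=NA"; exit }
            printf "R=%.5f\n", num / sqrt(d)
        }
    ' "$1"
}
```

It would be called as pearson input > correlation_value.txt. Counting the rows inside awk replaces the hard-coded factor=157364, so the same function works for a file of any length.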
Thanks a lot