Hi all,
I am new to shell scripting and want to calculate the mean and standard deviation using shell programming.
I have a file with letters that repeat, each with its corresponding duration:
a 0.32
a 0.89
aa 0.34
aa 0.23
au 0.012
au 0.26
SIL 0.34
ai 0.9
b 0.29
bh 0.19
ssil 0.87
I want to calculate the mean and standard deviation for each letter. I am able to calculate them for a single letter, but can't do it for the whole data set at once.
You obviously want to match "a", but not "aa" or "ai", etc., right? Since spaces delimit the first field in your file, use them to restrict the match to the wanted lines:
grep '^a ' file | ...
Notice the space character after the "a": this matches only the lines starting with "a" followed by a space, not those starting with "a<something>".
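The pipeline above is left open after the grep. As a rough sketch of one way it could continue, the following feeds the matched lines to awk. Everything here beyond the grep pattern is an assumption for illustration: the function names sample_data, stats_for_label and stats_all are hypothetical, the standard deviation is the population one (dividing by n, not n-1), and the label is assumed to contain no regexp metacharacters. Since the question asks for every letter, stats_all groups by the first field in a single pass, so no grep per letter is needed.

```shell
#!/bin/sh
# Sample data copied from the question.
sample_data() {
cat <<'EOF'
a 0.32
a 0.89
aa 0.34
aa 0.23
au 0.012
au 0.26
SIL 0.34
ai 0.9
b 0.29
bh 0.19
ssil 0.87
EOF
}

# Mean and population standard deviation for one label ($1), data on stdin.
# Assumption: the label contains no regexp metacharacters.
stats_for_label() {
    grep "^$1 " | awk -v label="$1" '
        { n++; sum += $2; sumsq += $2 * $2 }
        END {
            if (n == 0) exit 1
            mean = sum / n
            var = sumsq / n - mean * mean   # population variance
            if (var < 0) var = 0            # guard against rounding noise
            printf "%s %.4f %.4f\n", label, mean, sqrt(var)
        }'
}

# Same statistics for every label in one pass, keyed on the first field.
stats_all() {
    awk '
        NF >= 2 { n[$1]++; sum[$1] += $2; sumsq[$1] += $2 * $2 }
        END {
            for (k in n) {
                mean = sum[k] / n[k]
                var = sumsq[k] / n[k] - mean * mean
                if (var < 0) var = 0
                printf "%s %.4f %.4f\n", k, mean, sqrt(var)
            }
        }' | sort
}

sample_data | stats_for_label a    # prints: a 0.6050 0.2850
sample_data | stats_all            # one "label mean stddev" line per label
```

With a real file, the calls would read `stats_all < file` and `stats_for_label a < file`. awk splits fields on any run of spaces or tabs by default, so stats_all does not care which of the two delimits the columns.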
Then replace the space with a tab character in my solution. You can search for any character; you just have to make sure the shell doesn't swallow the characters that have a special meaning to it. That is what the single quotes around the regexp are for.
In the following examples, replace "<tab>" with a literal tab character; I only write it that way to keep it readable:
This would not work as expected, because the shell treats the unquoted tab as a word separator, so grep receives "c" as the pattern and "d" as a file name:
echo "abc<tab>def" | grep c<tab>d
The following does work:
echo "abc<tab>def" | grep 'c<tab>d'
Notice the difference: the quotation marks around the regexp.
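To try the two cases without typing a literal tab, here is a small sketch that builds the tab with printf. The variable name tab and the helper count_args are made up for this illustration. Because the tab sits in a variable, double quotes are used instead of single quotes so that the variable still expands; either kind of quote keeps the tab inside one word.

```shell
#!/bin/sh
# Build a tab character portably; printf interprets the \t escape.
tab=$(printf '\t')

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell splits at the tab, giving two words ("c" and "d").
count_args c${tab}d      # prints 2

# Quoted: the tab stays inside a single word.
count_args "c${tab}d"    # prints 1

# So the quoted pattern reaches grep intact and matches the input line.
printf 'abc\tdef\n' | grep "c${tab}d"
```

The count of 2 in the unquoted case is why grep would see "c" as its pattern and go looking for a file called "d".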