I just wanted to know whether it is possible to do interpolation in a script.
If so, have a look at my data file.
I need temperature and salinity values with a bin size of 0.5 m.
The output should look something like this:
dep temp sal
0.5 25 0.077
1 25 0.077
1.5
2
2.5
-
-
-
8.5 25 0.16
Note: one more thing, here the same dep value (0.789) is repeated for around 22 rows, so the average of temperature and salinity should be taken for those rows; that is, whenever the backward difference equals the forward difference (the depth is not changing), the temperature and salinity should be averaged, something like this.
I am not exactly asking you people to write the whole interpolation script for me; even if you can just do the averaging of temperature and salinity by checking the forward and backward differences, it would be very helpful.
Consider the 1st column: you can find the duplicate value 0.789 repeated about 22 times, but the values in the 2nd and 3rd columns vary. Because of the duplicates in the 1st column, I have to take the average of the 2nd and 3rd column values (22 values each). So what the script has to do is: whenever it finds duplicate values, it has to do the averaging.
Example: let me take one sample input here.
column 1 column 2
1 20.2
1 21.2 -------> 3 values in the 1st column are the same,
1 22.3          so I need the average of column 2;
2 22.5          the average will be 21.23
3 25.9
4 26.8
4 26.9
4 26.7
The output file should look like this:
column 1 column 2
1 21.23
2 22.5
3 25.9
4 26.8
Wherever the script finds duplicates in the 1st column, it has to do the averaging.
I'm not sure I understand either. You want to generate data points every 0.5 m. With depth ranging from 0.789 to 8.305 you will get something like 16 data points, say 1, 1.5, ..., 8.5. So why don't you collect the temperature and salinity into averages around those abscissa values?
Why crack a nut with a sledgehammer, averaging 23 values at 0.789 m, when everything will disappear into one large lump at 1.0 m anyway?
BTW, nit-picking: you can have difference quotients on those columns, but not derivatives. And I think it would be helpful to use one single sample file to discuss and attach, so col 1 would be col 1 everywhere.
And, finally answering your introductory question: yes, I'm pretty sure you can do (some degree of) interpolation in scripts.
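For instance, linear interpolation onto a 0.5 m grid can be done in a few lines of awk. This is a minimal sketch assuming a two-column dep/value file; the sample depths piped in here are made up for illustration:

```shell
# Linearly interpolate column 2 onto a 0.5 m depth grid.
printf '0.789 25\n2.1 25.4\n' | awk '
  NR == 1 { pd = $1; pv = $2; next }   # remember the first data point
  {
    d = $1; v = $2
    # emit every 0.5 m grid point that falls in (pd, d]
    for (g = (int(pd / 0.5) + 1) * 0.5; g <= d + 1e-9; g += 0.5)
      printf "%.1f %.4f\n", g, pv + (v - pv) * (g - pd) / (d - pd)
    pd = d; pv = v                     # slide the interval forward
  }'
```

The same loop extended with a third field would interpolate temperature and salinity together; grid points outside the sampled depth range are simply not emitted.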
awk 'NR > 1 && $1 != s {print s, a/n; a = n = 0}  # $1 changed its value (in our case from 1 to 2): print the previous key s and the average a/n, then reset the sum and the count for the next group
     {a += $2; s = $1; n++}                       # for every line: add $2 to the running sum a, remember the key in s, and count the occurrence
     END {print s, a/n}'                          # flush the last group: print the last key and its average
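A quick check against the sample input from the earlier post (note the exact average of 20.2, 21.2 and 22.3 is 21.2333):

```shell
printf '%s\n' '1 20.2' '1 21.2' '1 22.3' '2 22.5' '3 25.9' '4 26.8' '4 26.9' '4 26.7' |
awk 'NR > 1 && $1 != s { print s, a/n; a = n = 0 }
     { a += $2; s = $1; n++ }
     END { print s, a/n }'
# 1 21.2333
# 2 22.5
# 3 25.9
# 4 26.8
```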
Your script really helps me a lot. My file has about 20 columns, of which I need 7 in total. What do I need to change if the main value to be checked is in the 5th column, and the others to be averaged are in columns 6, 7, 8, ...?
Here I am attaching one sample file in which the 5th column needs to be checked, and the 6th and 7th columns need to be averaged whenever there are duplicates in the 5th column.
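If I understand the layout you describe (key in column 5, columns 6 and 7 to be averaged; the column numbers and the dummy data below are assumptions based on your description, adjust them for the real file), the script generalises like this:

```shell
# Averaging keyed on $5, with one running sum per averaged column
# ($6 and $7); the 'x' fields are placeholders for the unused columns.
printf '%s\n' \
  'x x x x 1 20.2 0.1' \
  'x x x x 1 21.2 0.3' \
  'x x x x 2 22.5 0.5' |
awk 'NR > 1 && $5 != s { print s, a/n, b/n; a = b = n = 0 }
     { a += $6; b += $7; s = $5; n++ }
     END { print s, a/n, b/n }'
```

For more averaged columns, add one running-sum variable per column, or keep the sums in an array indexed by column number.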