Is it possible to do interpolation in a script?

Hi experts,

I just wanted to know whether it is possible to do interpolation in a script.

If so, please have a look at my data file.

I need temperature and salinity values with a bin size of 0.5 m.

The output should look something like this:

dep   temp  sal
0.5     25     0.077
1        25     0.077
1.5
2
2.5
-
-
-
8.5     25     0.16

    

Note: one more thing. The same depth value (0.789) is repeated for around 22 rows, so the average of temperature and salinity should be taken over those rows. In other words, whenever the backward derivative equals the forward derivative, the temperature and salinity should be averaged, something like this:

                mask  depth
1:   .... 1:  1.000  0.789  ---backward derivative
2     /   2:  1.000  0.789  ---forward derivative
3     /   3:  1.000  0.789
4     /   4:  1.000  0.789
5     /   5:  1.000  0.789
23    /  23:   ....    0.789
24    /  24:   ....    0.790
25    /  25: 1.000 0.790
26    /  26: 1.000 0.790


I am not asking you to write the complete interpolation script. If you could at least do the averaging of temperature and salinity by checking the forward and backward derivatives, it would be very helpful to me.

I don't understand what you're trying to achieve. Could you please explain a bit more?

Sometimes understanding a question is much harder than providing a solution.

Cheers,
Korean :)

Let me try to explain.

Consider the 1st column: the value 0.789 is repeated about 22 times, while the values in the 2nd and 3rd columns vary. Because of the duplicates in the 1st column, I have to take the average of the corresponding 2nd and 3rd column values (22 values). So whenever the script finds a duplicate value in the 1st column, it has to do the averaging.

Example: let me take a small sample input here.

column 1 column 2
1                 20.2
1                 21.2        -------> 3 values in 1st column are the same,
1                 22.3                   so I need the average of column 2;
2                 22.5                   the average will be 21.23
3                 25.9
4                 26.8
4                 26.9
4                 26.7

The output file should look like this:

column 1 column 2
1               21.23
2               22.5
3               25.9
4               26.8

Wherever the script finds duplicates in the 1st column, it has to do the averaging.

I hope it is clear now.

$ cat file
1 20.2
1 21.2
1 22.3
2 22.5
3 25.9
4 26.8
4 26.9
4 26.7

$ awk 's != $1 && NR > 1{print s,a/X[s];a=0}{a+=$2;s=$1;X[$1]++}END{print s,a/X[s]}' file
1 21.2333
2 22.5
3 25.9
4 26.8

I'm not sure I understand either. You want to generate data points every 0.5 m. With depth ranging from 0.789 to 8.305, you will get something like 16 data points, say 1, 1.5, ..., 8.5. So why don't you collect the temperature and salinity into averages around those abscissa values?
Why crack a nut with a sledgehammer, averaging 23 values for 0.789 m, when everything will disappear in a large lump sum at 1.0 m?
BTW, nit-picking: you can have difference quotients on those columns, but no derivatives. And I think it would be helpful to use one single sample file to discuss and attach, so col 1 would be col 1 everywhere.
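
A minimal sketch of that binning idea, assuming a whitespace-separated file with depth, temperature and salinity in columns 1 to 3, no header or blank lines, and non-negative depths. The function name bin_average and the choice of rounding each depth to the nearest 0.5 m bin centre are assumptions for illustration, not something taken from the attached data file:

```shell
# bin_average [file...]: average columns 2 and 3 per 0.5 m depth bin.
bin_average() {
    awk -v step=0.5 '
    {
        k = int($1 / step + 0.5)            # index of the nearest bin centre
        n[k]++; t[k] += $2; s[k] += $3      # per-bin count and running sums
        if (NR == 1 || k < min) min = k
        if (NR == 1 || k > max) max = k
    }
    END {
        # walk the bin indices in increasing order and skip empty bins
        for (k = min; k <= max; k++)
            if (k in n)
                printf "%.1f %.4f %.4f\n", k * step, t[k] / n[k], s[k] / n[k]
    }' "$@"
}
```

Called as `bin_average datafile` (a hypothetical file name), it prints one line per non-empty bin. Bins without any samples are simply left out, so this is averaging only, not interpolation.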

And, finally answering your introductory question, yes, I'm pretty sure you can do (some degree of) interpolation in scripts.
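
As a rough illustration of what "some degree of interpolation" could look like, here is a sketch of linear interpolation onto a 0.5 m grid in awk. It assumes columns 1 to 3 hold depth, temperature and salinity, that the rows are sorted by increasing depth, and that duplicate depths have already been averaged (rows whose depth does not increase are skipped). The function name interp_grid is hypothetical. It does not extrapolate, so grid points above the first or below the last measured depth are not produced:

```shell
# interp_grid [file...]: linear interpolation of columns 2 and 3 onto a 0.5 m depth grid.
interp_grid() {
    awk -v step=0.5 '
    NR == 1 {
        d0 = $1; t0 = $2; s0 = $3
        g = int(d0 / step)                  # g is an integer grid index
        if (g * step < d0) g++              # first grid point at or after the first depth
        next
    }
    $1 > d0 {
        d1 = $1; t1 = $2; s1 = $3
        # emit every grid point that falls between the previous and current sample
        while (g * step <= d1) {
            w = (g * step - d0) / (d1 - d0) # weight of the current sample, 0..1
            printf "%.1f %.3f %.3f\n", g * step, t0 + w * (t1 - t0), s0 + w * (s1 - s0)
            g++
        }
        d0 = d1; t0 = t1; s0 = s1
    }' "$@"
}
```

Using an integer grid index g and computing g * step each time avoids accumulating rounding error from repeatedly adding 0.5.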

Dear Pamu, if you have time, could you please explain the 2nd line (the awk command) of the script?

Please check:

awk 's != $1 && NR > 1{print s,a/X[s];a=0}   # If this is not the first line and $1 differs from s (i.e. $1 has changed, in our case from 1 to 2): print s (the previous $1) and a (the sum of $2 for that $1) divided by X[s] (the number of rows with that $1), then set a=0 for the next group.

{a+=$2;s=$1;X[$1]++}   # For every line: add $2 to a, set s=$1, and increment the counter X[$1].

END{print s,a/X[s]}'   # After the last line: print s (the last $1) and a (the sum of $2 for the last $1) divided by X[s] (the number of rows with that $1).

I hope this helps. :)

pamu
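
For readers who find the one-liner above hard to follow, here is the same run-averaging idea spread over several lines. This is only a sketch: it swaps the per-value array X for a plain counter n that is reset with every group, and the function name avg_runs and the variable names are made up for illustration. Like the one-liner, it assumes that equal values in column 1 sit on adjacent lines:

```shell
# avg_runs [file...]: average column 2 over each run of equal values in column 1.
avg_runs() {
    awk '
    NR > 1 && $1 != key {              # column 1 changed: the previous run is complete
        print key, sum / n
        sum = 0; n = 0
    }
    { key = $1; sum += $2; n++ }       # accumulate the current run
    END { if (n) print key, sum / n }  # flush the last run
    ' "$@"
}
```

Running `avg_runs file` on the 8-line sample file from earlier in the thread should give the same four lines as the one-liner.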

Your script really helps me a lot. My file has about 20 columns, of which I need 7 in total. What do I need to change if, say, the value to be checked is in the 5th column and the values to be averaged are in the 6th, 7th, 8th, ... columns?

Could you please provide sample input? It would help in solving this. :)

Hi, Pamu

Here I am attaching a sample file in which the 5th column needs to be checked, and the 6th and 7th columns need to be averaged whenever there are duplicates in the 5th column.

Try this. It prints only $5 and the averages of $6 and $7.

awk 's != $5 && NR > 1{print s,a/X[s],b/X[s];a=0;b=0}{a+=$6;b+=$7;s=$5;X[$5]++}END{print s,a/X[s],b/X[s]}' file
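
If more columns need averaging (6, 7, 8, ...), adding one variable per column gets tedious. A possible generalisation, sketched under the assumption that equal key values are on adjacent lines, takes the key column and a comma-separated list of columns to average as arguments. The function name avg_cols and its argument order are hypothetical:

```shell
# avg_cols KEYCOL "COL,COL,..." [file...]
# Average the listed columns over each run of equal values in column KEYCOL.
avg_cols() {
    keycol=$1; cols=$2; shift 2
    awk -v k="$keycol" -v cols="$cols" '
    function flush(   i) {                   # print the finished run and reset the sums
        printf "%s", key
        for (i = 1; i <= nc; i++) {
            printf " %.6g", sum[c[i]] / n
            sum[c[i]] = 0
        }
        printf "\n"
        n = 0
    }
    BEGIN { nc = split(cols, c, ",") }       # c[1..nc] holds the column numbers
    NR > 1 && $k != key { flush() }          # key column changed: emit the previous run
    {
        key = $k; n++
        for (i = 1; i <= nc; i++) sum[c[i]] += $(c[i])
    }
    END { if (n) flush() }                   # emit the last run
    ' "$@"
}
```

For the case discussed here the call would be `avg_cols 5 "6,7" file`, and extending it to column 8 only means changing the list to "6,7,8".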

Yes, Pamu, awesome, it's working. Now I understand it completely. Thank you so much; I will modify it for my files now.