Hello,
I am having trouble calculating some numbers and I was hoping someone could help me solve this.
I have a file with one column, and what I'm trying to do is add up the lines until a certain value is reached, then pick up from where it last finished counting and continue.
So, for example: if I want to count up until the value of 10 is reached in the file below, then I would stop at line 7 and then continue counting from line 8 until the end of the file.
Does anyone have any ideas?
Without seeing the output you want, I'm left guessing what you really want, but so far it sounds like a simple modulus.
awk '{ N += $1; $2=N%10 } 1' filename
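To see what the modulus one-liner does, here is a small sketch. The input values are made up for illustration (the thread's real sample did not survive), and the function name running_mod is just a label for this example. The second column is the running total modulo 10, so it wraps around each time the total passes a multiple of 10.

```shell
# running_mod: hypothetical wrapper around the one-liner above.
running_mod() {
  awk '{ N += $1; $2 = N % 10 } 1' "$@"
}

# Made-up input; the running totals are 3, 7, 12, 18, 20.
printf '%s\n' 3 4 5 6 2 | running_mod
# prints:
# 3 3
# 4 7
# 5 2
# 6 8
# 2 0
```

Note that this only tracks the remainder; it does not restart the count at a segment boundary.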
What I am trying to do is sum a column in segments, where a segment ends once the lines in it add up to a certain value -- 10 in this case.
So if the input is:
then the output should be
Does this help?
awk 'N>=10 { $2=10; N=0 } 1; { N+=$1 }' filename
Thanks! that seems to be in the direction I am looking for!
When I tried
however, it did not produce the same result as >=. Why is that?
Because the running total can skip straight past 10. It might end up at 11 or 12 if the numbers don't sum to exactly 10, and a test for exactly 10 would then never fire.
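A quick way to see the difference, assuming the variant that was tried used == in place of >= (the snippet itself is missing from the thread). The input is made up so that the running total goes 4, 8, 13 and never equals 10 exactly; mark_ge and mark_eq are hypothetical names for this example.

```shell
# The >= version from the thread, and an == variant for comparison.
mark_ge() { awk 'N >= 10 { $2 = 10; N = 0 } 1; { N += $1 }' "$@"; }
mark_eq() { awk 'N == 10 { $2 = 10; N = 0 } 1; { N += $1 }' "$@"; }

# Made-up input: totals seen before each line are 0, 4, 8, 13.
printf '%s\n' 4 4 5 3 | mark_ge
# prints:
# 4
# 4
# 5
# 3 10

printf '%s\n' 4 4 5 3 | mark_eq
# prints the input unchanged: the total jumps from 8 to 13, so N == 10 is never true.
```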
Hi, Corona688:
Can you please explain a little bit what the "1" means in your script?
awk 'N>=10 { $2=10; N=0 } 1; { N+=$1 }' filename
I think I can catch the rest except the "1". Thanks
The 1 is an implied 'print'. In awk, a pattern with no action block after it prints the line whenever the pattern is true, and any expression can serve as the pattern. A 1 is always true, so the line is always printed.
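A short sketch of that rule, using made-up two-line input. A bare pattern with no braces gets the default action, which is to print the line.

```shell
# A true pattern with no action prints every line.
printf '%s\n' a b | awk '1'

# The same thing written out explicitly.
printf '%s\n' a b | awk '{ print }'

# A false pattern prints nothing.
printf '%s\n' a b | awk '0'

# Any expression works as the pattern; this one prints only odd-numbered lines.
printf '%s\n' a b c | awk 'NR % 2'
```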
Thank you,
Is there a way to get around the fact that it must end up on an even 10?
What if one wanted to expand on this code with the file below, for example:
Input file:
Col1 Col2 Col3 Col4
8 14.8425 0
2 6.59308 8.33343
1 10 1.02122 1.58249
1 1.88546 2.19513
1 1.6666 5.5956
1 92.9737 88.2462
1 3.13779 3.28445
1 4.30055 4.86017
1 32.9386 33.9011
1 9.70837 11.4292
2 17.9509 26.1322
23 10 0.20935 0.64897
In this case, how can I add up the values in col3 and print the total in col5, and add up the values in col4 and print the total in col6, whenever a 10 is found? In the example below, col5 prints 165.58319 because the values in col3 sum to that amount (starting at the line where the first 10 appears in col2 and ending at the line before the second 10), and so on.
output file:
Col1 Col2 Col3 Col4 col5 col6
8 14.8425 0
2 6.59308 8.33343
1 10 1.02122 1.58249 165.58319 177.22654
1 1.88546 2.19513
1 1.6666 5.5956
1 92.9737 88.2462
1 3.13779 3.28445
1 4.30055 4.86017
1 32.9386 33.9011
1 9.70837 11.4292
2 17.9509 26.1322
23 10 0.20935 0.64897 131.101001 104.8667889
Post in code tags instead of quote tags so we can tell what your file really looks like.
You could 'cheat' by putting two columns in one. $1=$1" " N or something like that.
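One possible reading of that hint, as a sketch on made-up input: assigning a string containing a space to $1 makes awk rebuild the line, so the extra value shows up as if it were a new second column. What N should hold here is an assumption; this example just uses a running total of column 1.

```shell
# Append the running total N to the first field; the rest of the line shifts right.
printf '%s\n' '5 x' '7 y' | awk '{ N += $1; $1 = $1 " " N } 1'
# prints:
# 5 5 x
# 7 12 y
```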
the code tags worked
---------- Post updated 05-11-12 at 10:39 AM ---------- Previous update was 05-10-12 at 12:20 PM ----------
I do not understand what you mean by the code you wrote above. I tried adding it, but it does not sum up the values in col3 between two "10" markers in col2.
Please post your original input data.
I am using a modified version of the code you provided
awk 'N>=100000 { $2="Loc"; N=0 } 1; { N+=$1 }' filename
the original input data looks like this:
263 1.35611 1.45967
1220 1.11776 1.04671
427 1.05608 2.29163
29647 14.1881 18.9483
29647 17.4202 4.50283
29647 7.26278 7.64333
1558 7.04701 12.8815
2853 21.2036 30.6511
165 3.19675 2.47742
351 0.635679 0.24632
700 3.6203 1.47666
952 0.0198475 0.0215731
242 14.8425 0
28339 6.59308 8.33343
28339 Loc 1.02122 1.58249
28339 1.88546 2.19513
28339 1.6666 5.5956
697 92.9737 88.2462
1162 3.13779 3.28445
2672 4.30055 4.86017
4758 32.9386 33.9011
4758 9.70837 11.4292
5054 17.9509 26.1322
429 Loc 0.20935 0.64897
and the output data should look like this:
263 1.35611 1.45967
1220 1.11776 1.04671
427 1.05608 2.29163
29647 14.1881 18.9483
29647 17.4202 4.50283
29647 7.26278 7.64333
1558 7.04701 12.8815
2853 21.2036 30.6511
165 3.19675 2.47742
351 0.635679 0.24632
700 3.6203 1.47666
952 0.0198475 0.0215731
242 14.8425 0
28339 6.59308 8.33343 99.5597965 91.9804731
28339 Loc 1.02122 1.58249
28339 1.88546 2.19513
28339 1.6666 5.5956
697 92.9737 88.2462
1162 3.13779 3.28445
2672 4.30055 4.86017
4758 32.9386 33.9011
4758 9.70837 11.4292
5054 17.9509 26.1322 191.2945965 187.3045231
429 Loc 0.20935 0.64897
Notice how the sums of the col3 and col4 values between the Loc markers are printed in col5 and col6.
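For anyone following along, here is a minimal sketch of one way to get per-segment sums, under some assumptions that the thread does not confirm: the file is whitespace-separated, a line with Loc in column 2 starts a new segment, the two value columns are always the last two fields on a line, and the totals belong on the line just before each Loc line. The function name segment_sums and the sample data are made up for the example, and the totals it produces are plain per-segment sums, which may not match every figure quoted above.

```shell
# segment_sums: hypothetical helper. It stays one line behind the input so the
# totals can be appended to the line that precedes each "Loc" marker.
segment_sums() {
  awk '
    BEGIN { CONVFMT = "%.10g" }      # keep more digits when sums become strings
    $2 == "Loc" && NR > 1 {
      prev = prev " " s1 " " s2      # close the segment on the held line
      s1 = s2 = 0                    # the Loc line starts the next segment
    }
    NR > 1 { print prev }            # emit the held line
    {
      s1 += $(NF - 1); s2 += $NF     # last two fields hold the values
      prev = $0
    }
    END { if (NR) print prev }       # flush the final held line
  ' "$@"
}

# Made-up sample with the same shape as the data above.
printf '%s\n' '1 1.5 2' '2 2.5 3' '3 Loc 1 1' '4 2 2' '5 Loc 4 4' | segment_sums
# prints:
# 1 1.5 2
# 2 2.5 3 4 5
# 3 Loc 1 1
# 4 2 2 3 3
# 5 Loc 4 4
```

Using $(NF - 1) and $NF avoids having to care that the Loc lines carry one more field than the others. The last segment is left without totals because no further Loc line closes it.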
Is the data tab-separated?
Sorry about the format; I wish I could just upload the file. It is not currently tab-delimited, but I can convert it.
You can upload the file actually. Look for 'manage attachments' when making a post.
You could describe what the file is, too, instead of what it isn't, reducing the number of questions needed.
sorry about the formatting inconvenience. I have attached what the input looks like and what the output looks like.