I would like to count runs of consecutive numbers in column 2, grouped by column 1, and also obtain the maximum run length in each group.
The data are part of a result that describes stacking interactions in DNA. Column 1 is the time, and column 2 is the DNA residue participating in a stacking interaction. I just want to know how many stacks are formed consecutively.
Since the raw data are too complex to post here, I have omitted them and given an example instead in order to explain more efficiently.
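For illustration only, the counting described above could be sketched in awk roughly like this. It is a minimal sketch, not a tested solution for the real data: it assumes whitespace-separated input, that "consecutive" means column 2 is exactly one more than on the previous line of the same group, and the function name max_runs is made up for this example.

```shell
# Hypothetical helper: print "group max_run_length" for each value of column 1.
# Reads the files given as arguments, or standard input if there are none.
max_runs() {
  awk '
    # A run is broken when the group changes or column 2 is not previous + 1.
    NR == 1 || $1 != grp || $2 != prev + 1 { len = 0 }
    {
      len++
      if (!($1 in max)) order[++n] = $1   # remember groups in first-seen order
      if (len > max[$1]) max[$1] = len    # keep the longest run per group
      grp = $1; prev = $2
    }
    END { for (i = 1; i <= n; i++) print order[i], max[order[i]] }
  ' "$@"
}
```

With the lines "0 1", "0 2", "0 4", "0 5", "0 6" as input, this should print "0 3", because the gap between 2 and 4 splits group 0 into runs of length 2 and 3.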
Hi,
I am working with very similar data and Scrutinizer's answers have been very helpful. However, I was wondering how one would alter the output a bit. For example, if I had Ryan Kim's data in his post 'One more question':
0 1
0 2
0 3
0 4
0 5
1 1
1 2
1 3
What if I wanted to print every line that belongs to a series of at least n consecutive values in column 2? For example, with n=4 (or 5) and the above data, I would want to extract and print the following lines:
0 1
0 2
0 3
0 4
0 5
If I had n=3, I would extract all of the lines from the original dataset.
I altered Scrutinizer's awk solution slightly to allow filtering the series based on the number of lines with consecutive values in column two:
However, I can't figure out how to print the actual lines that make up each series of consecutive values. Any advice or explanations would be greatly appreciated!
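One possible approach, offered as a sketch and not as Scrutinizer's actual solution: buffer the lines of the current series, and when the series ends (the group in column 1 changes, or column 2 is not exactly one more than on the previous line), print the buffer only if it holds at least n lines. The function name extract_runs and its argument order are made up for this example, and whitespace-separated input is assumed.

```shell
# Hypothetical helper: extract_runs N [file...]
# Prints every input line that belongs to a series of at least N consecutive
# values in column 2 within the same column 1 group.
extract_runs() {
  n=$1; shift
  awk -v n="$n" '
    # Print the buffered series if it is long enough, then start a new one.
    function emit(   i) {
      if (len >= n) for (i = 1; i <= len; i++) print buf[i]
      len = 0
    }
    # The series ends when the group changes or column 2 is not previous + 1.
    NR > 1 && ($1 != grp || $2 != prev + 1) { emit() }
    { buf[++len] = $0; grp = $1; prev = $2 }
    END { emit() }   # do not forget the last series
  ' "$@"
}
```

Called as "extract_runs 4 data.txt" on the eight example lines (data.txt being a placeholder file name), this should print only the five lines of group 0; with n=3 it should print all eight lines.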