Finding when a file switches direction using awk

Hi guys, I have a file that is filled with x,y values:

0.000000 0.00129578
0.000191 0.00272187
0.000381 0.0125676
0.000572 0.0120014
0.000763 0.00203461
0.000954 0.00682248
0.001144 0.00202773
0.001335 0.000840523
0.001526 0.00451419

....5MB of that

I wanted to know if there is a way to modify this awk statement to find out when column two switches direction. What I mean is that the number in column two will get bigger and then start to get smaller. When this happens, I want to print out the value of column one.

I used this to find the maximum value:

#awk ' { if ( $2 >= .9998 && $2 <= 1.0001 ) print $1 }' fileName.xy

I want to do something like this:

#awk ' { if ( [1 Row above $2]  >= $2 && [1 Row below $2] <= $2 ) print $1 }' fileName.xy

any ideas how to actually code/specify [1 Row above $2] and [1 Row below $2]?

This will print the two rows on either side of the first change point, then exit:

awk '
    NR == 1 { last = $2; last_row = $0; next; }
    {
        if( $2 < last )
        {
            printf( "%s\n", last_row );
            printf( "%s\n", $0 );
            exit( 0 );
        }

        last = $2;
        last_row = $0;
    }
' inputfile

If the input file is:

1 2
2 4
3 8
4 10
5 8
4 6

The output from the programme will be:

4 10
5 8

I think this is what you had in mind. It will handle floating point numbers as well; my test was deliberately simple.
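
The program above stops at the first drop. A minimal sketch of a variant that keeps reading and prints column one at every rise-to-fall switch could track the current direction in a flag. The names turning_points, rising and last_x are made up for this illustration and do not come from the thread.

```shell
# turning_points is a hypothetical wrapper name used only in this sketch.
turning_points() {
    awk '
        NR == 1 { last = $2; last_x = $1; next }   # seed with the first row
        {
            # Report the previous row when a falling step follows a rising one.
            if ($2 < last && rising)
                print last_x
            # Change direction only on a strict rise or fall, so flat runs keep the old direction.
            if ($2 > last)      rising = 1
            else if ($2 < last) rising = 0
            last = $2
            last_x = $1
        }
    ' "$@"
}

# The six-row test input from the post above.
printf '1 2\n2 4\n3 8\n4 10\n5 8\n4 6\n' | turning_points
```

With that six-row input this prints 4, the column-one value of the row where column two peaked at 10.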

awk 'NR==1{last=$2;last_row=$1;getline;cur=$2;cur_row=$1;next} 
     {if (last>=cur&&$2<=cur) print cur_row; 
      last=cur;last_row=cur_row;cur=$2;cur_row=$1}' infile 

Neither of them worked; the problem was that they seemed to pick random points, not the peaks (I plotted them using GMT).

Instead of accepting defeat, can I trouble you for a little explanation? If I understood the logic a little better, I think I could tweak it. I am still new to awk, but my professor loves it, and the more I use it the more I realize how powerful it actually is.

awk 'NR==1{last=$2;last_row=$1;getline;cur=$2;cur_row=$1;next}

This is the part I don't quite understand. What exactly does NR == 1 accomplish? You're setting the number of rows equal to one? Does a ';' separate commands? So you set last equal to column 2, last_row equal to column one.
Why do you run a getline? I looked it up and it is defined as:
getline - returns 1 if it finds a record, and 0 if the end of the file is encountered

Anyways, I get the feeling that $1 and $2 are now the rows above and below, not columns?

  {if (last>=cur&&$2<=cur) print cur_row; 
      last=cur;last_row=cur_row;cur=$2;cur_row=$1}' infile

This part logically makes sense, and this is what I am trying to accomplish.
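
The effect of getline inside the NR==1 block can be seen with a small sketch that prints NR and the fields before and after the call. The three-row input and the name getline_demo are made up for illustration.

```shell
# getline_demo is a hypothetical wrapper name used only in this sketch.
getline_demo() {
    awk '
        NR == 1 {
            print "before getline: NR=" NR, "$1=" $1, "$2=" $2
            getline      # reads the next record: $0, $1, $2, NF and NR are all updated
            print "after getline: NR=" NR, "$1=" $1, "$2=" $2
            next         # skip the general block for this record
        }
        { print "general block: NR=" NR, "$1=" $1, "$2=" $2 }
    ' "$@"
}

printf 'a 1\nb 2\nc 3\n' | getline_demo
```

The first line of output shows the fields of row 1, the second shows the fields of row 2, and the general block only starts at row 3. So $1 and $2 are still columns; after getline they simply belong to the newly read row.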

Some special variables that AWK sets for each record (usually a line) that it reads:

NR = record number (by default the record separator, RS, is a newline, so NR is often the current line number). NR==1 {...} means to execute the commands in braces if this is the first record that AWK is processing.

NF = after field splitting, the number of fields (columns) in the current record.

$1 = the value of the first field

$2 = the value of the second field

$NF = the value of the last field.

Regards,
Alister
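
As a quick illustration of those variables, a one-line sketch can print them for each record. The two-row input and the name show_fields are invented for this example.

```shell
# show_fields is a hypothetical wrapper name used only in this sketch.
show_fields() {
    # NR counts records read so far; NF is the field count of the current record.
    awk '{ print "NR=" NR, "NF=" NF, "first=" $1, "last=" $NF }' "$@"
}

printf '1 2 3\n12 4\n' | show_fields
```

This prints "NR=1 NF=3 first=1 last=3" for the first row and "NR=2 NF=2 first=12 last=4" for the second.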

So that makes sense. I played around with it and have a data set that looks like:

 1	2	3	4	5	6	7
12	4	5	0	8	5	4
12	98	7	675	98	89	98

If I run:

awk 'NR==3 {print $2}' file1.txt

I get 98. If I make NR == 2, I get 4.

So is there a way to run an if statement against row NR-1?

I want to compare against the values directly above and below the current row.

awk ' { if ( [NR-1, $2]  <= $2 && [NR+1, $2] <= $2 ) print $1 }' fileName.xy

I need to specify that location, if that makes sense. There is a numeric value one row above (NR-1) in field 2 (which I abbreviated as [NR-1,$2]) that I want to compare to field 2 of the current row NR (abbreviated $2).

Is there a way to specify that in awk?

Once again so grateful for the responses!
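
awk reads one record at a time, so there is no built-in way to address field 2 of row NR-1 or NR+1 directly. One possible approach, sketched here on the assumption that the whole file fits in memory, is to store every row in arrays indexed by NR and do the comparison in an END block. The names local_maxima, x and y are made up for this sketch.

```shell
# local_maxima is a hypothetical wrapper name used only in this sketch.
local_maxima() {
    awk '
        { x[NR] = $1; y[NR] = $2 }      # remember every row by its record number
        END {
            # Interior rows only: the first and last rows have just one neighbour.
            for (i = 2; i < NR; i++)
                if (y[i-1] <= y[i] && y[i+1] <= y[i])
                    print x[i]
        }
    ' "$@"
}

# The nine sample rows from the first post.
printf '%s\n' \
    '0.000000 0.00129578' '0.000191 0.00272187' '0.000381 0.0125676' \
    '0.000572 0.0120014'  '0.000763 0.00203461' '0.000954 0.00682248' \
    '0.001144 0.00202773' '0.001335 0.000840523' '0.001526 0.00451419' |
    local_maxima
```

Here y[i-1] and y[i+1] play the role of [NR-1, $2] and [NR+1, $2]. On the nine sample rows this prints 0.000381 and 0.000954. Because both comparisons use <=, every row of a flat run would be reported.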

Your request compares three lines at a time, so you need to keep a record of the previous two lines at each step.

NR==1{last=$2;last_row=$1;getline;cur=$2;cur_row=$1;next} reads the first two lines. getline pulls in the next line immediately, so both lines can be handled in a single line of code.

Written this way, it may be easier to follow:

NR==1{last=$2;last_row=$1;next}
NR==2{cur=$2;cur_row=$1;next}

The rest of the code starts from line 3, which you already understand.

Tested against your sample data, my code works. If there are any problems, please paste some real data to test with.
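
One point about the comparison itself: last>=cur && $2<=cur is true for a row in the middle of a falling run, which may be why the plotted points did not line up with the peaks. For a local maximum, both neighbours need to be at or below the middle row, as in the [NR-1, $2] <= $2 && [NR+1, $2] <= $2 pseudocode. A minimal sketch of the same two-row window with that condition follows; the name window_peaks is made up for this example.

```shell
# window_peaks is a hypothetical wrapper name used only in this sketch.
window_peaks() {
    awk '
        NR == 1 { last = $2; next }                 # first row only seeds the window
        NR == 2 { cur = $2; cur_row = $1; next }    # second row completes the window
        {
            # cur is a local maximum when both neighbours are at or below it.
            if (last <= cur && $2 <= cur)
                print cur_row
            # Slide the window down by one row.
            last = cur; cur = $2; cur_row = $1
        }
    ' "$@"
}

printf '1 2\n2 4\n3 8\n4 10\n5 8\n4 6\n' | window_peaks
```

On this six-row input it prints 4. Unlike an array-based approach, it keeps only two rows in memory at any time.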

OK, I tinkered with it and it makes sense now! Thank you so much. I think I will use that trick quite a bit. :)