Selection of records y time and offset.

Gents,

I have an csv file which contends value GPS time for each record and x, y coordinates.. The purpose is to identify which vps does not match with the law parameters in the relation time and offset.
In the graphic attached you can see example of 3 vps which are showed in red color and falls in the area not allowed. All records marked as blue are ok.

Then the thing is to sort the records by GPS time .. ( I did this part in my script ). calculate the time from GPS time as HHMMSS. ( done ).. now we need to duplicate say 10 time each record and write in front the next ones ( example analysis_25.csv ).
Here is the time to calculate the offset (column 18) and time (column 19) in the file analysis_25.csv..
Using the Law parameters we need to compute the Allowed_SlipTime ( column 19 ) and with this value we can define witch records are bad ( bad_records_25.csv) ...
In the graphic analysis_25.png we can see how the law works by time and offset
Values for law are:

Slip time(s))  18        18         6           0           0            0
Offset(m)       0      2000      6000       12000       12001        24000

Max time (s) 25
Max offset (m) 15000

Here my script till now.

awk 'NR>1{print $0}' $25.csv | awk -F, '{
                       line=substr($10,1,5)
                point=substr($11,1,5)
                ffid=$2
                sw=$5
                tape=$6;\
                x=$49;\
                y=$50;\
                tb=$46
                printf("%4d %5d %5d %4d %4d %10.1f %10.1f %s\n",tape,line,point,ffid,sw,x,y,tb)}' | sort -k8n |
                awk '{v=substr($8,16,1); gsub(/2/,"0",v);print substr($0,1,64) v}' > toto            
awk '\
        function gpstojdtime(gpstime) {\
            gpss = (gpstime/1000000)+10800;\
            jtmp = (gpss-((2017-1980)*365+10-6)*86400-18)/86400;\
            jday = int(jtmp);\
            htmp = 24*(jtmp - jday);\
            hour = int(htmp);\
            mtmp = 60*((htmp - hour));\
            minu = int(mtmp);\
            stmp = 60*(mtmp - minu);\
            secd = (stmp);\
            return hour*10000+minu*100+secd;\
        }\
         {\
            {\
                tape=$1;\
                splin=$2;\
                spnum=$3;\
                sprec=$4;\
                sw=$5;\
                x=$6;\
                y=$7;\
                spgps=$8;\
                gpt=$8;\
                printf"%4s %5d %5d %4s %4s %10.1f %10.1f %10.3f %s\n",\
                tape,splin,spnum,sprec,sw,x,y,gpstojdtime(spgps),gpt\
            };\
        }' toto | head -20

The support data is attached :

25.csv = Input data

analysis_25.png = Graphic results
analysis_25.csv = Data results - Desired output file 
bad_records_25.csv = Bad records results - Desired output file 

Appreciate your help

Gents,

Kindly, can help me with this.

Appreciate your help.

Can you give more details on how column #19 (Allowed_SlipTime) is calculated in the analysis_25.csv output file?

Im getting output like this:

3180 42763 52673 1096 3099   513013.4 2476037.60 5.38 3180 43199 51845 1097 3099   502662.0  2481487.4 7.58 2.20
3180 42763 52673 1096 3099   513013.4 2476037.60 5.38 3180 43195 52361 1098 3099   509113.7  2481438.4 13.16 7.78
3180 42763 52673 1096 3099   513013.4 2476037.60 5.38 3180 42757 52961 1099 3099   516613.1  2475961.9 18.62 13.24

But not sure how I calculate field #19 for these lines (0.30163, 5.3384 and 13.1985)

1 Like

Dear Xl

Thanks a lot to help me with this.

The input need to be sort by GPS time ( done )

For each record in the loop, need to compare it against the next record located in less than 18 sec in the future (colum 18).

Let say current record A ( 43169 52649 )has 4 records within the 18sec after A, it means we need to check offset and slip time for

A against B
 A against C
 A against D
 A against E
   1      2         3       4      5         6      7      8       9       10         11       12    13       14     15     16         17       18         19
Tape1LineName1PointNumber1FFID   X1        Y1     JD1  HHMMSS1  Tape2 LineName2 PointNumber2 FFID   X2       Y2    JD2   HHMMSS2    Offset  SlipTime   Allowed_SlipTime
3182   43169     52649    705 512712.9  2481109.4 25   90615.2   3182   43165       52073    706 505512.5 2481061.6 25   90620.4    7200.6        5.2      4.8        OK
3182   43169     52649    705 512712.9  2481109.4 25   90615.2   3182   43167       52937    707 516312.4 2481086.2 25   90628.6    3599.6       13.5      13.2       OK
3182   43169     52649    705 512712.9  2481109.4 25   90615.2   3182   42743       53237    708 519204.6 2475787.0 25   90634.2    8394.6       19.0      3.6        OK
3182   43169     52649    705 512712.9  2481109.4 25   90615.2   3182   42737       52613    709 512264.1 2475714.8 25   90638.5    5413.2       23.4      7.8        OK

3182   43165     52073    706 505512.5  2481061.6 25   90620.4   3182   43167       52937    707 516312.4 2481086.2 25   90628.6   10799.9        8.3      1.2        OK
3182   43165     52073    706 505512.5  2481061.6 25   90620.4   3182   42743       53237    708 519204.6 2475787.0 25   90634.2   14672.9       13.8      0.0        OK
3182   43165     52073    706 505512.5  2481061.6 25   90620.4   3182   42737       52613    709 512264.1 2475714.8 25   90638.5    8612.3       18.2      3.4        OK

3182   43167     52937    707 516312.4  2481086.2 25   90628.6   3182   42743       53237    708 519204.6 2475787.0 25   90634.2    6037.1        5.5      6.0       BAD
3182   43167     52937    707 516312.4  2481086.2 25   90628.6   3182   42737       52613    709 512264.1 2475714.8 25   90638.5    6726.1        9.9      5.3        OK
3182   43167     52937    707 516312.4  2481086.2 25   90628.6   3182   43173       53237    710 520062.3 2481163.6 25   90645.7    3750.7       17.0      12.7       OK
3182   43167     52937    707 516312.4  2481086.2 25   90628.6   3182   43167       52649    711 512712.4 2481086.4 25   90650.6    3600.0       22.0      13.2       OK

3182   42743     53237    708 519204.6  2475787.0 25   90634.2   3182   42737       52613    709 512264.1 2475714.8 25   90638.5    6940.9        4.4      5.1       BAD
3182   42743     53237    708 519204.6  2475787.0 25   90634.2   3182   43173       53237    710 520062.3 2481163.6 25   90645.7    5444.6       11.5      7.7        OK
3182   42743     53237    708 519204.6  2475787.0 25   90634.2   3182   43167       52649    711 512712.4 2481086.4 25   90650.6    8380.5       16.5      3.6        OK
3182   42743     53237    708 519204.6  2475787.0 25   90634.2   3182   43167       52073    712 505512.0 2481089.4 25   90655.6   14683.4       21.4      0.0        OK

3182   42737     52613    709 512264.1  2475714.8 25   90638.5   3182   43173       53237    710 520062.3 2481163.6 25   90645.7    9513.2        7.1      2.5        OK
3182   42737     52613    709 512264.1  2475714.8 25   90638.5   3182   43167       52649    711 512712.4 2481086.4 25   90650.6    5390.3       12.1      7.8        OK
3182   42737     52613    709 512264.1  2475714.8 25   90638.5   3182   43167       52073    712 505512.0 2481089.4 25   90655.6    8630.0       17.0      3.4        OK
3182   42737     52613    709 512264.1  2475714.8 25   90638.5   3182   42743       52349    713 508962.5  2475786  25   90702.2    3302.4       23.6      14.1       OK

For the calculation of the allowed slip time we need consider the distance, then the rule ds4 will tell us that for this distance the slip time must be than a certain offset value.
if this is not true it will mean that two records are in conflict. (example BAD)

Then, to compute the allowed slip time (column 19) we should use two arrays one for distance and other for sliptime.

example

for distance:
Offset(m)       0      2000      6000       12000       12001        24000

for time:
Slip time(s))  18        18         6           0           0            0

array= [0,18;2000,18;6000,6;12000,0;12001,0;24000,0]

Perhaps with the graphic analysis_25.png = Graphic results you can get more idea about this.

Thank you so much for your help..

OK I think this function should calculate the allowed split time:

     function allowedsliptime(distance) {
            if(distance < 2000) return 18
            if(distance > 12000) return 0
            if(distance < 6000) return 24 - distance / 333.333
            return 12 - distance / 1000
        }

And here is the code note it produces two output files analysis.csv and bad_records.csv these should pretty much match the format of your sample files.

awk -F, 'FNR>1 {
    gsub(/"/, "")
    line=substr($10,1,5)
    point=substr($11,1,5)
    ffid=$2
    sw=$5
    tape=$6;
    x=$49;
    y=$50;
    tb=$46
    printf("%4d %5d %5d %4d %4d %10.1f %10.1f %s\n",
      tape,line,point,ffid,sw,x,y,tb)}' 25.csv |
sort -k8n | awk '{v=substr($8,16,1); gsub(/2/,"0",v);print substr($0,1,64) v}' > toto
awk '
   BEGIN {
       split("analysis.csv bad_records.csv", OUT)
       for(j=1;j<3;j++) {
          printf "Tape1,LineName1,PointNumber1,FFID1,X1,Y1,JD1,HHMMSS1," > OUT[j]
          printf "Tape2,LineName2,PointNumber2,FFID2,X2,Y2,JD2,HHMMSS2," > OUT[j]
          printf "Offset,SlipTime,Allowed_SlipTime\n" > OUT[j]
       }
    }
    function gpstojdtime(gpstime) {
        gpss = (gpstime/1000000)+10800;
        jtmp = (gpss-((2017-1980)*365+10-6)*86400-18)/86400;
        jday = int(jtmp);
        htmp = 24*(jtmp - jday);
        hour = int(htmp);
        mtmp = 60*((htmp - hour));
        minu = int(mtmp);
        stmp = 60*(mtmp - minu);
        secd = (stmp);
        return hour*10000+minu*100+secd;
    }

    function allowedsliptime(distance) {
        if(distance < 2000) return 18
        if(distance > 12000) return 0
        if(distance < 6000) return 24 - distance / 333.333
        return 12 - distance / 1000
    }

    function calcoffset(x,y) {
        return sqrt((x[1]-x[2])^2+(y[1]-y[2])^2)
    }

    FNR==NR { L[FNR]=$0 ; next}
    {
        tape[1]=$1
        splin[1]=$2
        spnum[1]=$3
        sprec[1]=$4
        sw[1]=$5
        x[1]=$6
        y[1]=$7
        spgps[1]=$8
        dtime[1]=gpstojdtime(spgps[1])
        day[1]=jday

        # Look up to 10 records ahead
        for(i=1; i< 10 && (i+FNR in L); i++) {
            split(L[i+FNR],V)
            tape[2]=V[1]
            splin[2]=V[2]
            spnum[2]=V[3]
            sprec[2]=V[4]
            sw[2]=V[5]
            x[2]=V[6]
            y[2]=V[7]
            spgps[2]=V[8]
            dtime[2]=gpstojdtime(spgps[2])
            day[2]=jday

            # Skip lookahead when diff is greater than 25sec
            if(dtime[2] - dtime[1] > 25) next

            offset = calcoffset(x, y)
            stime = dtime[2] - dtime[1]
            allow = allowedsliptime(offset)

            # decide on output file based on splittime vrs allowedsplit
            if(stime < allow) of=OUT[2]
            else of=OUT[1]

            for(j=1;j<3;j++)
                printf"%d,%d,%d,%d,%.1f,%.1f,%d,%.2f,",
                    tape[j],splin[j],spnum[j],sprec[j],
                    x[j],y[j],day[j], dtime[j]  > of

            printf "%.4f,%.3f,%5.4f\n", offset, stime, allow > of
        }
    }' toto toto
1 Like

Dear Xl

Many thanks for your time and help..

Your help is highly appreciated, and your work excellent.

I will try the code..

Can you please , explain a little the way how works the function allowedsliptime(distance) in this part :

function allowedsliptime(distance) {
          if(distance < 6000) return 24 - distance / 333.333
        return 12 - distance / 1000

Thanks again

If the distance is less than 6000, it returns 24-(distance/333.333);

Otherwise, it returns 12-(distance/1000)

1 Like

allowedsplittime() matches with the slope from the graphic and with the values in your array. Notice how the slope of the allowed time changes from 3 sec per km to 1 sec per km at the 6 km point.

Below I run some of the distance values from analysis_25.csv thru this function and you can see the result matches with the contents of the Allowed_SlipTime column:

   Distance          Result
 11698.3675          0.3016
  9531.0848          2.4689
  6661.5539          5.3384
  5277.8263          8.1665
  3600.4959         13.1985
1 Like

Dear Xl

I have test the code and works perfect.

There is only one think to change.

the

 stime

should be in sec , then i was trying to change it like this.

     \# Skip lookahead when diff is greater than 25sec
	    stime1 = (dtime[2] - dtime[1])
	    	        
	    if((((dtime[2] - dtime[1])-100)+60) > 25) next

         	       	    
	    if(stime1>=40)stime = (((dtime[2] - dtime[1])-100)+60)
	    else stime = (dtime[2] - dtime[1]) 

kindly, can you check it. because i try it but does not work.

Everything else is working perfect.

Thanks for your help

We need to add a new function to convert jdtime to seconds; changes highlighted in red below:

awk -F, 'FNR>1 {
    gsub(/"/, "")
    line=substr($10,1,5)
    point=substr($11,1,5)
    ffid=$2
    sw=$5
    tape=$6;
    x=$49;
    y=$50;
    tb=$46
    printf("%4d %5d %5d %4d %4d %10.1f %10.1f %s\n",
      tape,line,point,ffid,sw,x,y,tb)}' 25.csv |
sort -k8n | awk '{v=substr($8,16,1); gsub(/2/,"0",v);print substr($0,1,64) v}' > toto
awk '
   BEGIN {
       split("analysis.csv bad_records.csv", OUT)
       for(j=1;j<3;j++) {
          printf "Tape1,LineName1,PointNumber1,FFID1,X1,Y1,JD1,HHMMSS1," > OUT[j]
          printf "Tape2,LineName2,PointNumber2,FFID2,X2,Y2,JD2,HHMMSS2," > OUT[j]
          printf "Offset,SlipTime,Allowed_SlipTime\n" > OUT[j]
       }
    }
    function gpstojdtime(gpstime) {
        gpss = (gpstime/1000000)+10800;
        jtmp = (gpss-((2017-1980)*365+10-6)*86400-18)/86400;
        jday = int(jtmp);
        htmp = 24*(jtmp - jday);
        hour = int(htmp);
        mtmp = 60*((htmp - hour));
        minu = int(mtmp);
        stmp = 60*(mtmp - minu);
        secd = (stmp);
        return hour*10000+minu*100+secd;
    }

    function jdtimetosec(value) {
       hour = int(value/10000)
       minu = int((value%10000)/100)
       secd = value%100

       return 3600*hour + 60*minu + secd
    }

    function allowedsliptime(distance) {
        if(distance < 2000) return 18
        if(distance > 12000) return 0
        if(distance < 6000) return 24 - distance / 333.333
        return 12 - distance / 1000
    }

    function calcoffset(x,y) {
        return sqrt((x[1]-x[2])^2+(y[1]-y[2])^2)
    }

    FNR==NR { L[FNR]=$0 ; next}
    {
        tape[1]=$1
        splin[1]=$2
        spnum[1]=$3
        sprec[1]=$4
        sw[1]=$5
        x[1]=$6
        y[1]=$7
        spgps[1]=$8
        dtime[1]=gpstojdtime(spgps[1])
        day[1]=jday

        # Look up to 10 records ahead
        for(i=1; i< 10 && (i+FNR in L); i++) {
            split(L[i+FNR],V)
            tape[2]=V[1]
            splin[2]=V[2]
            spnum[2]=V[3]
            sprec[2]=V[4]
            sw[2]=V[5]
            x[2]=V[6]
            y[2]=V[7]
            spgps[2]=V[8]
            dtime[2]=gpstojdtime(spgps[2])
            day[2]=jday

            offset = calcoffset(x,y)
            stime = jdtimetosec(dtime[2]) - jdtimetosec(dtime[1])
            allow = allowedsliptime(offset)

            # Skip lookahead when diff is greater than 25sec
            if(stime > 25) next

            # decide on output file based on splittime vrs allowedsplit
            if(stime < allow) of=OUT[2]
            else of=OUT[1]

            for(j=1;j<3;j++)
                printf"%d,%d,%d,%d,%.1f,%.1f,%d,%.2f,",
                    tape[j],splin[j],spnum[j],sprec[j],
                    x[j],y[j],day[j], dtime[j]  > of

            printf "%.4f,%.3f,%5.4f\n", offset, stime, allow > of
        }
    }' toto toto
1 Like

Dear Xl

Great, really wonderful job. Appreciate your help.. It works perfect.

Please the last thing here..

it is possible to add other output here:

split("analysis.csv bad_records.csv all.csv", OUT)
of=OUT[3]

in order to have complete data..
I can concatenate both outputs and get complete one, but will be great to add this in the code.

Thanks Again...

Try this:

awk -F, 'FNR>1 {
    gsub(/"/, "")
    line=substr($10,1,5)
    point=substr($11,1,5)
    ffid=$2
    sw=$5
    tape=$6;
    x=$49;
    y=$50;
    tb=$46
    printf("%4d %5d %5d %4d %4d %10.1f %10.1f %s\n",
      tape,line,point,ffid,sw,x,y,tb)}' 25.csv |
sort -k8n | awk '{v=substr($8,16,1); gsub(/2/,"0",v);print substr($0,1,64) v}' > toto
awk '
   BEGIN {
       split("analysis.csv bad_records.csv all.csv", OUT)
       for(j=1;j<=3;j++) {
          printf "Tape1,LineName1,PointNumber1,FFID1,X1,Y1,JD1,HHMMSS1," > OUT[j]
          printf "Tape2,LineName2,PointNumber2,FFID2,X2,Y2,JD2,HHMMSS2," > OUT[j]
          printf "Offset,SlipTime,Allowed_SlipTime\n" > OUT[j]
       }
    }
    function gpstojdtime(gpstime) {
        gpss = (gpstime/1000000)+10800;
        jtmp = (gpss-((2017-1980)*365+10-6)*86400-18)/86400;
        jday = int(jtmp);
        htmp = 24*(jtmp - jday);
        hour = int(htmp);
        mtmp = 60*((htmp - hour));
        minu = int(mtmp);
        stmp = 60*(mtmp - minu);
        secd = (stmp);
        return hour*10000+minu*100+secd;
    }

    function jdtimetosec(value) {
       hour = int(value/10000)
       minu = int((value%10000)/100)
       secd = value%100

       return 3600*hour + 60*minu + secd
    }

    function allowedsliptime(distance) {
        if(distance < 2000) return 18
        if(distance > 12000) return 0
        if(distance < 6000) return 24 - distance / 333.333
        return 12 - distance / 1000
    }

    function calcoffset(x,y) {
        return sqrt((x[1]-x[2])^2+(y[1]-y[2])^2)
    }

    FNR==NR { L[FNR]=$0 ; next}
    {
        tape[1]=$1
        splin[1]=$2
        spnum[1]=$3
        sprec[1]=$4
        sw[1]=$5
        x[1]=$6
        y[1]=$7
        spgps[1]=$8
        dtime[1]=gpstojdtime(spgps[1])
        day[1]=jday

        # Look up to 10 records ahead
        for(i=1; i< 10 && (i+FNR in L); i++) {
            split(L[i+FNR],V)
            tape[2]=V[1]
            splin[2]=V[2]
            spnum[2]=V[3]
            sprec[2]=V[4]
            sw[2]=V[5]
            x[2]=V[6]
            y[2]=V[7]
            spgps[2]=V[8]
            dtime[2]=gpstojdtime(spgps[2])
            day[2]=jday

            offset = calcoffset(x,y)
            stime = jdtimetosec(dtime[2]) - jdtimetosec(dtime[1])
            allow = allowedsliptime(offset)

            # Skip lookahead when diff is greater than 25sec
            if(stime > 25) next

            # decide on output file based on splittime vrs allowedsplit
            if(stime < allow) of=OUT[2]
            else of=OUT[1]

            for(j=1;j<3;j++) {
                printf"%d,%d,%d,%d,%.1f,%.1f,%d,%.2f,",
                    tape[j],splin[j],spnum[j],sprec[j],
                    x[j],y[j],day[j], dtime[j]  > of

                printf"%d,%d,%d,%d,%.1f,%.1f,%d,%.2f,",
                    tape[j],splin[j],spnum[j],sprec[j],
                    x[j],y[j],day[j], dtime[j]  > OUT[3]
            }
            printf "%.4f,%.3f,%5.4f\n", offset, stime, allow > of
            printf "%.4f,%.3f,%5.4f\n", offset, stime, allow > OUT[3]
        }
    }' toto toto
1 Like

Dear Xl

Many, thanks for the code.. As I say previously your job is wonderful.

Thanks a lot for your time, support and help.

Regards..