awk question: set 2 variables by matching fields

Hello,
I'm trying to print the TOP and BASE numbers for each name.
The file looks like this:

2300  CAR   # 2300 is the TOP
2310  CAR
2335  CAR
2455  CAR   # 2455 is the BASE
1000  MOTOR  # 2455 will become this TOP
2000  MOTOR  
3000  MOTOR
4000  MOTOR   # 4000 is the BASE
2345  BIKE     # 4000 becomes this TOP
2347  BIKE     # 2347 is the BASE

The output needs to READ like this:

CAR    2300 2455
MOTOR  2455 4000
BIKE   4000 2347

Please let me know if this needs more explanation.
My files are 500,000 rows of data in this format.

Thanks in advance for your help.

This might work:

awk '/TOP/{v=$2" "$4};/BASE/{b=$4;c=1}{if(c){print v" "b;c=0}}' filename

awk '
        ++A[$2] == 1 {
                if ( NR == 1 )
                        printf "%s %d ", $2, ( t ? t : $1 )
                else
                        printf "%d\n%s %d ", ( t ? t : $1 ), $2, ( t ? t : $1 )
        }
        {
                t = $1
        }
        END {
                printf "%d\n", t
        }
' file

This could be shortened somewhat to:

awk '/TOP/{v=$2" "$4};/BASE/{print v" "$4}'

There is no need for an if statement to test whether a variable is true:

{if(c){print v" "b;c=0}}

could be written like this:

c{print v" "b;c=0}
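For anyone following along, here is a small sketch that runs both forms side by side. It assumes the commented layout from the first post; the two-line sample and the function names with_if and with_pattern are made up for illustration only.

```shell
# Hypothetical two-line sample in the commented layout from the first post.
sample='2300  CAR   # 2300 is the TOP
2455  CAR   # 2455 is the BASE'

# Variant 1: the flag c is tested with an explicit if inside an action.
with_if() {
    printf '%s\n' "$sample" |
    awk '/TOP/{v=$2" "$4} /BASE/{b=$4;c=1} {if(c){print v" "b;c=0}}'
}

# Variant 2: the flag c is used directly as the pattern of the rule.
with_pattern() {
    printf '%s\n' "$sample" |
    awk '/TOP/{v=$2" "$4} /BASE/{b=$4;c=1} c{print v" "b;c=0}'
}

with_if
with_pattern
```

Both calls print CAR 2300 2455, so on this input the choice between the two forms is purely one of style.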

Completely agree with you. However, as you know, writing c or if(c) should take about the same number of CPU cycles to evaluate, so it is not going to cause any performance issues. As long as that holds, I prefer to write more presentable and readable code for the users, as most of them are quite new to Unix ;)

It all depends on individual working style.

I did not know this, and I agree with you that writing readable code is more important than writing short code.

The semicolon is also not needed. This:
awk '/TOP/{v=$2" "$4};/BASE/{print v" "$4}'
can be written as:
awk '/TOP/{v=$2" "$4} /BASE/{print v" "$4}'

Thanks everybody, I'm going to try Yoda's solution.
I must have confused some of you about the format of my file. I only put in the comments
(# this is TOP, # this will be the BASE) for clarity. I wish I could just match on those names,
but the only thing to go on is the NAME column: a group runs from one NAME until the NAME changes.

I appreciate your help.

"I only put the comments"

That's why it's important to post real data, and lots of it.

I still have some trouble understanding your logic.

CAR    2300 2455
MOTOR  2455 4000
BIKE   4000 2347

CAR I do understand.
MOTOR: why not 1000 4000?
BIKE: why not 2345 2347?

Edit:
Something like this:

awk 'NR==1 {top=$1} {t2=t1;s2=s1;t1=$1;s1=$2} NR!=1&&s2!=s1 {print s2,top,t2;top=$1} END {print s1,top,t1}'
CAR 2300 2455
MOTOR 1000 4000
BIKE 2345 2347

Hello Jotne,
Thanks for helping me out. I will post the real data as soon as I can.

The first row of CAR is 2300.
The last row of CAR is 2455.
2300 is the TOP of the lithology.
2455 is the BASE of the lithology.
The BASE of 2455 then needs to be used as the TOP for the next name, MOTOR:
the BASE of the previous group becomes the TOP of the next one.

The examples have helped me out, thanks again.
Charlie
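Going by the rule as Charlie finally describes it, here is a minimal sketch that relies only on the name in column 2 changing, not on any comment text. It assumes whitespace-separated rows with the number in column 1 and the name in column 2, and that each name's rows are contiguous. The function name lith_ranges is made up for illustration and has not been tried on the real 500,000-row files.

```shell
# lith_ranges: hypothetical helper; reads the files given as arguments, or stdin.
lith_ranges() {
    awk '
        # On the first row, or whenever the name in column 2 changes, a new group starts.
        NR == 1 || $2 != name {
            if (NR > 1) {
                print name, top, last   # close the previous group
                top = last              # previous BASE becomes the new TOP
            } else {
                top = $1                # first group: TOP is its own first value
            }
            name = $2
        }
        { last = $1 }                   # latest value seen in the current group
        END { if (NR > 0) print name, top, last }
    ' "$@"
}

# Sample rows from the first post, without the comments.
printf '%s\n' '2300 CAR' '2310 CAR' '2335 CAR' '2455 CAR' \
              '1000 MOTOR' '2000 MOTOR' '3000 MOTOR' '4000 MOTOR' \
              '2345 BIKE' '2347 BIKE' | lith_ranges
```

On the sample rows this prints CAR 2300 2455, MOTOR 2455 4000 and BIKE 4000 2347, one per line. Because it compares each name with the previous row rather than counting names in an array, a name that shows up again later in the file would be treated as a new group.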