Passing key column from parent to child records

Hi Forum.

I have a challenging issue that I'm hoping someone can help me with.

I have a file that contains 3 different types of segments (AM00, AM01, AM32) in a hierarchical structure, and I want to pass the key column from each parent record down to its child records.

AM00 - parent key: suffix# (the last column)
AM01 - parent key: card# (the last column)

Input data:
AM00,1,12345
AM01,ab,677
AM32,10

AM01,cd,789
AM32,20
--
AM00,2,23456
AM01,ef,654
AM32,30

AM01,gh,765
AM32,40
Output data:
AM00,1,12345
AM01,ab,677,12345
AM32,10,12345,677

AM01,cd,789,12345
AM32,20,12345,789
--
AM00,2,23456
AM01,ef,654,23456
AM32,30,23456,654

AM01,gh,765,23456
AM32,40,23456,765

Thank you for all your help.

Before we can help, please show us your attempts to solve the problem and where you are stuck.

I was thinking of reading each record, saving the key for its segment, and writing that key to the subsequent child records. But that might be too slow, since I will be dealing with a large file and, in the future, I may have more than 3 segments.

If you have any ideas to get me started, please share them.

Thank you.

awk -F, '$1=="AM00" {n=$NF;print;next} NF && !/--/ {$(NF+1)=n}1' OFS=, myFile

How about

awk '/AM00/ {SUFFIX = $NF} /AM01/ {CARD = $NF; $0 = $0 FS SUFFIX} /AM32/ {$0 = $0 FS SUFFIX FS CARD} 1' FS=, file3
AM00,1,12345
AM01,ab,677,12345
AM32,10,12345,677

AM01,cd,789,12345
AM32,20,12345,789
--
AM00,2,23456
AM01,ef,654,23456
AM32,30,23456,654

AM01,gh,765,23456
AM32,40,23456,765
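
On the worry about more than 3 segments: here is a minimal sketch of how the same idea might generalize, assuming every segment type sits at a fixed depth in the hierarchy and its key is always the last field. The depth table and the function name propagate_keys are made up for illustration; adjust them to the real layout.

```shell
# Hypothetical helper: append the keys of all ancestor records to each child record.
propagate_keys() {
  awk -F, -v OFS=, '
    BEGIN {
      # Assumed hierarchy table: segment type -> depth (1 = top level).
      depth["AM00"] = 1; depth["AM01"] = 2; depth["AM32"] = 3
    }
    ($1 in depth) {
      d = depth[$1]
      own = $NF                    # key of this record, read before anything is appended
      for (i = 1; i < d; i++)      # append ancestor keys, top level first
        $0 = $0 OFS key[i]
      key[d] = own                 # remember this key for deeper records
    }
    1                              # print every line, changed or not
  ' "$@"
}
```

Call it as "propagate_keys file3". It still reads the file once and keeps only one key per level in memory, so the size of the file should not matter much; a new segment type would only need one more entry in the depth table.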

Thank you guys - your solution works great.

---------- Post updated 06-03-16 at 10:18 AM ---------- Previous update was 06-02-16 at 04:31 PM ----------

I just found out that I will be working with fixed-width files instead of comma-delimited ones. Here's my code so far:

awk 'substr($0,13,4)=="AM00" {SUFFIX = substr($0,38,2)} substr($0,13,4)=="AM01" {$0 = $0 FS SUFFIX} 1'  file1

How do I reset the SUFFIX variable back to Null after processing each record?

I tested on a sample file, and it seems that SUFFIX = "99" is being written to all AM01 records, regardless of whether substr($0,13,4)=="AM00" matched on the parent record.

Thank you.

Input Data:
000000000001AM00 1500895700000000000199
000000000001AM01 035000000000013000399820810000000P

0000000000001AM00 1500895700000000000199
000000000001AM01 035000000000013000399820810000000P

Output Data:
000000000001AM00 1500895700000000000199
000000000001AM01 035000000000013000399820810000000P99

0000000000001AM00 1500895700000000000199
000000000001AM01 035000000000013000399820810000000P99

Second AM01 output record should not have SUFFIX of 99.

Why not? Its AM00 record HAS suffix 99, too! Except that the record is shifted right by one position; should the suffix be 19, then?

If AM00 does not start at position 13, then I don't want to add any suffix to the child records.

Oh, I see - that didn't jump out at me immediately.

If the "record separator" is a blank line, try adding e.g.

!NF {SUFFIX = ""};

to the beginning of the one-liner.

Thank you Rudy.

This is my code, but it didn't quite work - suffix = 99 is still showing up on the 2nd record:

awk '!NF {SUFFIX = ""}; substr($0,13,4)=="AM00" {SUFFIX = substr($0,38,2)} substr($0,13,4)=="AM01" {$0 = $0 FS SUFFIX} 1'  file

On a side note, is it possible to store substr($0,13,4) in a variable?

I tried this but it didn't work:

awk '!NF {SUFFIX = ""}; var1=substr($0,13,4) "$var1"=="AM00" {SUFFIX = substr($0,38,2)} substr($0,13,4)=="AM01" {$0 = $0 FS SUFFIX} 1'  file

Is NF really zero (== EMPTY line, or at most blanks, with no other characters in there)?
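
One way to check this is to list every line that looks blank together with its NF and its length. This is only a sketch; the function name show_blank_lines is made up, and the idea that the "blank" lines might hold invisible characters (for example a carriage return from a file edited on Windows) is a guess that the posts above do not confirm.

```shell
# Hypothetical helper: report every line that is empty or holds only
# spaces, tabs, or carriage returns, together with its NF and its length.
show_blank_lines() {
  awk '/^[ \t\r]*$/ { printf "line %d: NF=%d length=%d\n", NR, NF, length($0) }' "$@"
}
```

Run it as "show_blank_lines file". A truly empty line reports length=0. If a longer length shows up, using the same regular expression as the reset pattern in place of !NF may be more forgiving.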

Assignment to a variable IS possible but should be done in an action part (i.e. within { and } )

also...
"$var1"=="AM00" should be var1 == "AM00"
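
To illustrate both points, a small sketch: the assignment sits in an action of its own, and the variable is then compared by its plain name in later patterns. The function name print_record_types and the labels it prints are made up for the example.

```shell
# Hypothetical helper: label each record by the type code at positions 13-16.
print_record_types() {
  awk '
    { rec_type = substr($0, 13, 4) }             # assignment inside an action, runs for every record
    rec_type == "AM00" { print NR ": parent" }   # plain variable name: no $ and no quotes
    rec_type == "AM01" { print NR ": child" }
  ' "$@"
}
```

Because awk evaluates the pattern {action} pairs in order for every record, rec_type is already set by the time the two comparisons run.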

This is my final code in an attempt to use variables:

awk 'rec_type1=substr($0,13,4) {$rec_type1=="AM00"} {SUFFIX = substr($0,38,2)} substr($0,13,4)=="AM01" {$0 = $0 FS SUFFIX} 1'  file

It's not working as expected when using variables.

Input Data:
000000000001AM00 1500895700000000000100                                                                                     
000000000001AM01 035000000000013000399820810000000P 

Output Data:
000000000001AM00 1500895700000000000100                                                                                     
000000000001AM01 035000000000013000399820810000000P 98

It's grabbing SUFFIX = 98 from the current record, not from the previous one.

Please help.

Thank you.

Please get accustomed to reading man awk !

awk works on pattern {action} pairs for every input record: execute action if pattern is TRUE.

awk '
rec_type1=substr($0,13,4)                       # assignment; wrong: usually an action
{$rec_type1=="AM00"}                            # comparison; wrong: should be used as a pattern or within an "if" construct; wrong: the $ in front of var
{SUFFIX = substr($0,38,2)}                      # action alone; executed for every record as missing pattern is assumed TRUE
substr($0,13,4)=="AM01" {$0 = $0 FS SUFFIX}     # correct pattern {action} pair!
1                                               # TRUE pattern; executes default action: print $0
'  file
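
Putting those remarks together, one possible corrected version might look like the sketch below. It assumes the layout used in this thread (record type at positions 13-16, suffix at positions 38-39) and that groups are separated by blank lines; the function name add_suffix is made up, and the extra SUFFIX != "" test is my own addition so that nothing is appended when no valid parent was seen.

```shell
# Hypothetical helper: copy the suffix of an AM00 parent onto its AM01 children.
add_suffix() {
  awk '
    { rec_type = substr($0, 13, 4) }                      # record type, assigned in an action
    !NF                { SUFFIX = "" }                    # blank line: forget the previous parent
    rec_type == "AM00" { SUFFIX = substr($0, 38, 2) }     # parent: remember its suffix
    rec_type == "AM01" && SUFFIX != "" { $0 = $0 FS SUFFIX }   # child: append a blank and the suffix
    1                                                     # print every record
  ' "$@"
}
```

Call it as "add_suffix file". As in the one-liners above, FS is still the default single blank, so the suffix is appended after one space; use $0 = $0 SUFFIX if it has to follow the record directly.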