row separate

i have this sample data:

test01 --- abc-01 name1
abc-02 name2
abc-03 name3

test02 --- abc-20 name4
abc-21 name5

test03 --- abc-22 name6
abc-23 name7

i want to generate a file that looks like:

test01 abc-01 name1
test01 abc-02 name2
test01 abc-03 name3
test02 abc-20 name4
test02 abc-21 name5
test03 abc-22 name6
test03 abc-23 name7

i know awk can use RS (record separator) but i don't know much about it. pls help :slight_smile:

thanks!

Here's your sample data in a code block to remove ambiguity about spacing; I fished it out of "view page source" in my browser and realigned it to what I guess your data really looks like. To awk, whether there is whitespace between fields matters more than how much there is, but everyone who posts here should be aware that alignment gets ruined by the VBcode->HTML conversion unless it's in a code block.

test01 ---   abc-01     name1
             abc-02     name2
             abc-03     name3

test02 ---   abc-20     name4
             abc-21     name5

test03 ---   abc-22     name6
             abc-23     name7
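
To illustrate the point about spacing, here is a small sketch (the function name count_fields is made up for this example). With the default field separator, awk splits on runs of blanks and ignores leading whitespace, so only the number of fields differs between the two kinds of line:

```sh
# count_fields is a hypothetical helper: print the field count and first field.
count_fields() {
    awk '{ print NF ": " $1 }' "$@"
}

printf 'test01 ---   abc-01     name1\n             abc-02     name2\n' | count_fields
# prints:
# 4: test01
# 2: abc-02
```

The indented line still has abc-02 as $1, no matter how many spaces precede it.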

Fiddling with RS is not helpful here.
This should do what you want, though it is untested code, and you shouldn't run it until you understand it.

#!/usr/bin/awk -f
NF == 0 { next }                      # Skip empty lines
NF == 4 { tag = $1 }                  # Pick up the tag
        { print tag, $(NF - 1), $NF } # Print lines with data
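
One way to try the script without saving it to a file is to wrap it in a shell function and feed it part of the sample through a here-document. This is only a sketch: the name fill_tags is made up, and the awk program is the one above, unchanged.

```sh
# fill_tags is a hypothetical wrapper around the three rules from the post.
fill_tags() {
    awk '
        NF == 0 { next }                      # skip empty lines
        NF == 4 { tag = $1 }                  # header line: remember the tag
                { print tag, $(NF - 1), $NF } # print tag plus last two fields
    ' "$@"
}

fill_tags <<'EOF'
test01 ---   abc-01     name1
             abc-02     name2
             abc-03     name3

test02 ---   abc-20     name4
             abc-21     name5
EOF
# prints:
# test01 abc-01 name1
# test01 abc-02 name2
# test01 abc-03 name3
# test02 abc-20 name4
# test02 abc-21 name5
```

Because of "$@", the same function also accepts file names as arguments instead of reading standard input.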

The moderators will probably want to move this to the shell scripting forum.:smiley:

You know more than I do about awk - I would've come up with some convoluted mess to do that. :wink:

I can tell what your code is doing, but I'm wondering, won't the two lines

NF == 4 { tag = $1 }                  # Pick up the tag
        { print tag, $(NF - 1), $NF } # Print lines with data

only pick up on, say,

test01 ---   abc-01     name1

but not

             abc-02     name2
             abc-03     name3

?

hey, you are right about the spacing and the forum. i think this topic should be moved to the shell scripting forum. will the moderators do this for me? :stuck_out_tongue:

thanks a lot!

Nope. The confusion may come from reading the pattern/action pairs as mutually exclusive cases. Think of them like a C switch statement instead: you fall through to the next "case" (awk "pattern") unless you hit a "break" (awk "next"). So NF == 0 { next } skips empty lines. Then, when there are four fields, NF == 4 { tag = $1 } saves the first field and falls through to the next pattern. That pattern is empty, so it matches every line that hasn't been "next"-ed so far, and its action prints the saved tag and the last two fields of the current input line. If a line has exactly one field, this last rule misbehaves ($(NF - 1) is then $0, the whole line), but that wasn't part of the OP's stated problem.
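
The fall-through can be made visible with a throwaway variant of the script in which each rule announces itself. This is just an illustration; the name show_rules and the "rule N" labels are invented here.

```sh
# show_rules is a hypothetical helper: same patterns, but each action reports itself.
show_rules() {
    awk '
        NF == 0 { print "rule 1: blank, skipped"; next }  # next stops rule processing
        NF == 4 { print "rule 2: tag = " $1 }             # no next, so rule 3 runs too
                { print "rule 3: " $(NF - 1), $NF }       # empty pattern matches any line
    ' "$@"
}

printf 'test01 --- abc-01 name1\n\n abc-02 name2\n' | show_rules
# prints:
# rule 2: tag = test01
# rule 3: abc-01 name1
# rule 1: blank, skipped
# rule 3: abc-02 name2
```

The four-field line triggers both rule 2 and rule 3, while the blank line never reaches rule 3 because of the next.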

how do i code block? i would like to make some corrections in my post :stuck_out_tongue:

When you're making a post to the forums, you should see some buttons above the textbox you type your message in. Look for a button with the # symbol on it to insert code. Read the help page for more on VB Code.

i would like to correct the sample data that i have:


test-01
               abc-01     name1
               abc-02     name2
               abc-03     name3
test-02
               abc-20     name4
               abc-21     name5
test-03
               abc-22     name6
               abc-23     name7

will have an output of:

test-01 abc-01 name1
test-01 abc-02 name2
test-01 abc-03 name3
test-02 abc-20 name4
test-02 abc-21 name5
test-03 abc-22 name6
test-03 abc-23 name7

thanks oombera for the tip :stuck_out_tongue:

:p
#!/usr/bin/awk -f
NF == 0 { next }                       # Skip empty lines
NF == 1 { tag = $1; next }             # Pick up the tag
NF == 3 { print tag, $(NF-1), $NF }    # Print lines with data

hey, it worked!

can you explain to me how it worked? any tips on learning awk scripting? :cool: maybe you can suggest something besides google of course :smiley:

How'd it work? NF was never 3.

i changed NF == 3 to NF == 2
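
For reference, a sketch of the script with that change applied, run against part of the corrected sample data. The wrapper name fill_tags2 is made up; since data lines have exactly two fields, $(NF-1) and $NF are written as $1 and $2.

```sh
# fill_tags2 is a hypothetical wrapper for the corrected data layout.
fill_tags2() {
    awk '
        NF == 1 { tag = $1; next }     # a lone field is a tag line: remember it
        NF == 2 { print tag, $1, $2 }  # two fields: data line, prefix with the tag
    ' "$@"
}

fill_tags2 <<'EOF'
test-01
               abc-01     name1
               abc-02     name2
test-02
               abc-20     name4
EOF
# prints:
# test-01 abc-01 name1
# test-01 abc-02 name2
# test-02 abc-20 name4
```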

but what if my tag lines and data lines have the same NF?

for example, instead of "test01" i have "test01 testname"?

thanks,

Then you have to find some other way to characterize your data. F'rinstance, your "tag" lines always seem to start at the beginning of a line while the "appended" lines do not, so replace "NF==1" with "/^[^ \t]/", i.e., the line begins with a non-blank character.

In ssow's version, you can delete the "NF==0 { next}" line, because you're only printing on lines with 2 fields (per your change).
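
A sketch of that idea, assuming tag lines start in column one and data lines are indented. The name fill_tags_re and the sample input are invented for illustration. The tag is still taken from $1; whether "testname" should also be carried along is not stated in the thread, so that part is left out.

```sh
# fill_tags_re is a hypothetical wrapper: tag lines are recognized by position, not by NF.
fill_tags_re() {
    awk '
        /^[^ \t]/ { tag = $1; next }     # starts with a non-blank character: tag line
        NF == 2   { print tag, $1, $2 }  # indented line with two fields: data line
    ' "$@"
}

fill_tags_re <<'EOF'
test01 testname
               abc-01     name1
               abc-02     name2
test02 othername
               abc-20     name4
EOF
# prints:
# test01 abc-01 name1
# test01 abc-02 name2
# test02 abc-20 name4
```

Blank lines match neither rule, so no separate NF == 0 rule is needed here.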

The man page. Seriously. If you want a little more hand holding, O'Reilly's book sed & awk.