Find header in a text file and prepend it to all lines until another header is found

verdepollo · July 10, 2019, 9:50pm

I've been struggling with this one for quite a while and cannot seem to find a solution for this find/replace scenario. Perhaps I'm getting rusty.

I have a file that contains a number of metrics (exactly 3 fields per line) from a few appliances that are collected in parallel. To identify the appliance the metrics belong to, there's a header (the name of the asset basically) that is printed just once before all the rest of the text.

What I want to do is find the appliance name and print it before each one of the metrics. See below:

Sample input:

applianceName1
1.0 123 some
5.3 456 random
3.5 78900 string
applianceName2
1.2 1234 another
0.1 023 random
3.0 10 string
applianceName3
0.2 6544676 more
0.9 3543 random
4.1 123 stuff

Expected output:

applianceName1 1.0 123 some
applianceName1 5.3 456 random
applianceName1 3.5 78900 string
applianceName2 1.2 1234 another
applianceName2 0.1 023 random
applianceName2 3.0 10 string
applianceName3 0.2 6544676 more
applianceName3 0.9 3543 random
applianceName3 4.1 123 stuff

The resulting file should therefore contain lines with 4 fields each. Any clues or pointers would be appreciated.

Thanks!

jim_mcnamara · July 10, 2019, 10:50pm

Using awk -

$ awk '/^app/ {pre=$0; next}
    length($0) {print pre " " $0 } ' filename > newfile

$ cat newfile
applianceName1 1.0 123 some
applianceName1 5.3 456 random
applianceName1 3.5 78900 string
applianceName2 1.2 1234 another
applianceName2 0.1 023 random
applianceName2 3.0 10 string
applianceName3 0.2 6544676 more
applianceName3 0.9 3543 random
applianceName3 4.1 123 stuff

It's late so I cannot stay on long to help you figure out how to do it. Otherwise I would not have just blurted out a ho-hum answer.
Next time, please tell us your OS and shell so we can give you good help. The idea is to get you to the point where you can do all this by yourself.

Chubler_XL · July 10, 2019, 11:04pm

I'm guessing that not all of the appliance names start with the text "app".

A more generic approach might be to treat lines with only one field as an appliance name (assuming there are no spaces within the appliance name). The awk built-in variable NF contains the number of fields in the current line, so instead of /^app/ (line begins with "app") to identify headers, you could use NF == 1 (the number of fields on this line equals one) or even NF != 3 (the number of fields is not 3). Note that NF != 3 also matches empty lines, so skip those first if the file has any.
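
A minimal sketch of the NF-based variant, assuming headers are always a single field and metrics lines always have exactly 3 fields. The file names metrics.txt and newfile are placeholders, and the sample data is copied from the first post:

```shell
# Placeholder input file, reproducing the sample from the first post.
cat > metrics.txt <<'EOF'
applianceName1
1.0 123 some
5.3 456 random
3.5 78900 string
applianceName2
1.2 1234 another
0.1 023 random
3.0 10 string
applianceName3
0.2 6544676 more
0.9 3543 random
4.1 123 stuff
EOF

# NF == 1: a single-field line is a header; remember it and move to the next line.
# NF == 3: a metrics line; print it prefixed with the most recent header.
# Anything else (for example an empty line, where NF is 0) is silently skipped.
awk 'NF == 1 { pre = $1; next }
     NF == 3 { print pre, $0 }' metrics.txt > newfile

cat newfile
```

Matching on NF == 3 for the data lines, rather than printing everything that is not a header, keeps stray or empty lines out of the output; whether that is desirable depends on how clean the real file is.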

verdepollo · July 10, 2019, 11:41pm

That worked out beautifully. And yes, the asset name could be any FQDN so no spaces at all. Thanks a lot!