How to parse this file using awk and output in CSV format?

My source file looks like this:

Cust-Number = "101"
Cust-Name="Joe"
Cust-Town="London"
Cust-hobby="tennis"
Cust-purchase="200"

Cust-Number = "102"
Cust-Name="Mary"
Cust-Town="Newyork"
Cust-hobby="reading"
Cust-purchase="125"

Now I want to parse this file (leaving out hobby) and output in csv format

Cust-Number   Cust-Name  Cust-Town  Cust-Purchase
101,Joe,London,200
102,Mary,Newyork,125

Anybody can help me out please?

Thanks
Balav

Please use code tags as required by forum rules!

Any attempts/ideas/thoughts from your side?

Hi Rudic

No, I am very new to shell programming and AWK principles. As part of a project I have to process the data in the files and load them into database tables.

Can someone help me?

Cheers

Howsoever, try

awk -F= '
BEGIN                   {HD = "Cust-Number,Cust-Name,Cust-Town,Cust-purchase"
                         print HD
                         HDCnt  = split (HD, HDArr, ",")
                         NXTREC = HDArr[1]
                         HDCM=","HD","
                        }

                        {gsub (/[ "]*/, "")
                        }

 $1 == NXTREC           {if (PR)        {for (i=1; i<=HDCnt; i++) printf "%s%s", RES[HDArr[i]], (i == HDCnt)?"\n":","
                                         delete RES
                                        }
                         PR = 1
                        }

 HDCM ~ OFS $1 OFS      {RES[$1] = $0
                         sub ($1 FS, "", RES[$1])
                        }

END                     {for (i=1; i<=HDCnt; i++) printf "%s%s", RES[HDArr[i]], (i == HDCnt)?"\n":","
                        }
' OFS="," file
Cust-Number,Cust-Name,Cust-Town,Cust-purchase
101,Joe,London,200
102,Mary,Newyork,125

Your sample's header doesn't use commas as the rest of the output does; I have adapted this. Should you insist on spaces as separators in the header, adapt the script accordingly.


Hi Rudi

Thanks for your help. I didn't expect you to write a complete solution, just to give me some recommendations or tips, so this is really kind of you. I'll try to understand the code myself.

I just ran it against my file but it gave me syntax errors. I am wondering whether awk works differently (syntax-wise) across Unix platforms. I am on SPARC Solaris.

This is the error:

awk: syntax error near line 8
awk: illegal statement near line 8
awk: newline in string near line 8
awk: syntax error near line 10
awk: illegal statement near line 10
awk: syntax error near line 15
awk: bailing out near line 15

Let me quote Don Cragun for you:
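The usual advice for Solaris/SunOS, which I assume is what applies here, is to run the script with /usr/xpg4/bin/awk or nawk instead of the default awk, because the default one is the legacy awk and, as far as I know, lacks functions such as gsub that the script relies on. A minimal sketch of picking a suitable awk; the variable names AWK, candidate and check are made up for illustration:

```shell
# Prefer a POSIX-style awk where one exists (typical Solaris locations),
# otherwise fall back to whatever "awk" is on the PATH.
AWK=awk
for candidate in /usr/xpg4/bin/awk nawk; do
    if command -v "$candidate" >/dev/null 2>&1; then
        AWK=$candidate
        break
    fi
done

# Quick sanity check: gsub must be available for the script above to work.
check=$("$AWK" 'BEGIN { s = "a b"; gsub(/ /, "", s); print s }')
printf '%s\n' "$check"
```

If the check prints ab, the selected awk should also accept the script; replace awk with "$AWK" when running it.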


Another approach:

awk -F\" 'BEGIN{print "Cust-Number,Cust-Name,Cust-Town,Cust-Purchase"}
  /hobby/ || !NF{next} 
  /purchase/{print $2; next}
  {printf "%s,", $2}
' file

Hi RudiC

Fantastic, thanks for your help. Very much appreciated. It works great.

Just one quick question for my knowledge

I am still trying to read and understand your code so that I can make changes as required. May I ask which part of the script actually skips the fourth record? As I may need to skip the 8th record as well in the real source file, I want to make sure I understand it.

Thanks a lot for your help again.

(Also wondering how I would mark this thread as Answered/Solved)

Actually, the header in the BEGIN section determines the selection AND order of the elements to be output. It is split into the array HDArr, which in turn serves as the index into the associative array RES when printing.
If field 1 (enclosed in delimiters) is found in the header, RES[$1] is filled with $0 with $1 and the field separator removed (effectively field 2). Lines whose field 1 is not in the header, such as Cust-hobby, are simply never stored.
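So to drop more fields (your 8th one, say), only the header string needs to change. Here is a stripped-down sketch of the same idea, not the script above: the names to_csv, want, val, row and sample are made up for illustration, and it assumes a POSIX awk (nawk or /usr/xpg4/bin/awk on Solaris) and values that contain no "=" or blanks.

```shell
# to_csv HEADER FILE
# Prints HEADER, then one CSV row per record holding only the fields
# named in HEADER, in HEADER's order. A record starts at the first header field.
to_csv() {
    awk -F= -v HD="$1" '
    function row(   i) {                       # print and clear the wanted fields
        for (i = 1; i <= n; i++) {
            printf "%s%s", val[want[i]], (i < n ? "," : "\n")
            delete val[want[i]]
        }
    }
    BEGIN               { n = split(HD, want, ","); print HD }
                        { gsub(/[ "]/, "") }   # strip blanks and double quotes
    $1 == want[1] && seen { row() }            # next record begins: flush the previous one
    $1 != ""            { val[$1] = $2; seen = 1 }
    END                 { if (seen) row() }
    ' "$2"
}

# Try it on the sample data, leaving out both hobby and town.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Cust-Number = "101"
Cust-Name="Joe"
Cust-Town="London"
Cust-hobby="tennis"
Cust-purchase="200"

Cust-Number = "102"
Cust-Name="Mary"
Cust-Town="Newyork"
Cust-hobby="reading"
Cust-purchase="125"
EOF
out=$(to_csv "Cust-Number,Cust-Name,Cust-purchase" "$sample")
printf '%s\n' "$out"
rm -f "$sample"
```

With that header the output is the header line followed by 101,Joe,200 and 102,Mary,125; putting Cust-Town back into the header string brings the town column back.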


Add the tag "solved" for your second question.

Hi Franklin,
Could you please explain why, after reading the line containing the string "purchase", the script continues printing on the next line? If that happens because of the next command, why doesn't the output also move to a new line after /hobby/ || !NF{next}, which contains next as well?
Thanks,

The print command appends a newline after its output and printf doesn't. The next command prints nothing by itself; it only skips to the next input line.
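A small sketch of that difference, using made-up input (the three lines a, b, c) rather than the customer file:

```shell
# printf writes only what its format says; print adds a newline at the end.
out=$(printf 'a\nb\nc\n' | awk '
    NR < 3  { printf "%s,", $0; next }   # no newline, so the next output continues this line
            { print $0 }                 # print terminates the line
')
printf '%s\n' "$out"    # a,b,c
```

The first two lines are joined by printf, and only the print on the last line ends the row, just like the /purchase/ rule in the script.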
