Consolidate several lines of a CSV file with firewall rules, in order to parse them easier?

Hi guys.
I have a CSV file, which I created using an HTML export from a Check Point firewall policy.
In some cases a rule is spread over several lines. That happens when the rule has several source addresses, destinations or services.
I need the output to have each rule described in only one line.
It's easy to distinguish when each rule begins. In the first column, there's the rule ID, which is a number.

Let me show you an example. The strings that should be moved are in bold:

NO.;NAME;SOURCE;DESTINATION;VPN;SERVICE;ACTION;TRACK;INSTALL ON;TIME;COMMENT
1;;fwxcluster;mcast_vrrp;;vrrp;accept;Log;fwxcluster;Any;"VRRP;;*Comment suppressed*
;;;;;**igmp**;;;;;
2;;fwxcluster;fwxcluster;;FireWall;accept;Log;fwxcluster;Any;"Management FWg;*Comment suppressed*
;;**fwmgmpe**;**fwmgmpe**;;**ssh**;;;;;
;;**fwmgm**;**fwmgm**;;;;;;;
3;NTP;G_NTP_Clients;cmm_ntpserver_pe01;;ntp;accept;None;fwxcluster;Any;*Comment suppressed*
;;;**cmm_ntpserver_pe02**;;;;;;;

What I need, explained in pseudo code, is this:

For each line, read the first column.
If there's a number, the line starts a new rule: keep it as the current rule.
If there's no number, concatenate (separating with a comma) the strings in its columns \
onto the matching columns of the current rule, and eliminate the text in this line.
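
The pseudo code above could be sketched in Python roughly like this. It is only a sketch: the names consolidate and joiner are made up for illustration, the sample is a shortened version of the export, and it assumes no field contains a quoted semicolon, so a plain split on ";" is enough.

```python
def consolidate(lines, delimiter=";", joiner=","):
    """Merge rows whose first column is empty into the preceding rule row."""
    merged = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip completely blank lines
        row = line.split(delimiter)
        if row[0].strip() or not merged:
            merged.append(row)  # a new rule (or the header) starts here
            continue
        current = merged[-1]
        # pad the current rule if the continuation row has more columns
        current.extend([""] * (len(row) - len(current)))
        for i, value in enumerate(row):
            if value:
                current[i] = current[i] + joiner + value if current[i] else value
    return [delimiter.join(row) for row in merged]


sample = [
    "NO.;NAME;SOURCE;DESTINATION;VPN;SERVICE",
    "1;;fwxcluster;mcast_vrrp;;vrrp",
    ";;;;;igmp",
    "2;;fwxcluster;fwxcluster;;FireWall",
    ";;fwmgmpe;fwmgmpe;;ssh",
    ";;fwmgm;fwmgm;;",
]
for line in consolidate(sample):
    print(line)
# NO.;NAME;SOURCE;DESTINATION;VPN;SERVICE
# 1;;fwxcluster;mcast_vrrp;;vrrp,igmp
# 2;;fwxcluster,fwmgmpe,fwmgm;fwxcluster,fwmgmpe,fwmgm;;FireWall,ssh
```

This version drops the emptied lines instead of writing them out, since they are only there for readability.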

The output should be something like this. The strings in bold are the ones that were moved:

NO.;NAME;SOURCE;DESTINATION;VPN;SERVICE;ACTION;TRACK;INSTALL ON;TIME;COMMENT
1;;fwxcluster;mcast_vrrp;;vrrp,**igmp**;accept;Log;fwxcluster;Any;*Comment suppressed*
;;;;;;;;;;
2;;fwxcluster,**fwmgmpe**,**fwmgm**;fwxcluster,**fwmgmpe**,**fwmgm**;;FireWall,**ssh**;accept;Log;fwxcluster;Any;*Comment suppressed*
;;;;;;;;;;
;;;;;;;;;;
3;NTP;G_NTP_Clients;cmm_ntpserver_pe01,**cmm_ntpserver_pe02**;;ntp;accept;None;fwxcluster;Any;*Comment suppressed*
;;;;;;;;;;

The empty lines are only there for clarity; I don't actually need them.

Thanks!

Try this awk solution:

awk -F\; '
$1{Save=$0}
/^;/{
  for(i=split(Save,Prev,";");i;i--)
    if (!length($i)) $i=Prev[i]
}1' OFS=\; infile
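
For anyone who doesn't read awk, here is a rough Python rendering of what the script above appears to do, assuming the assignment is meant to be $i=Prev[i]. The function name fill_down is made up for illustration. Note that it does not merge the lines into one: it fills the empty columns of each continuation line with the values from the rule's first line, so every output line is complete on its own. The awk pattern $1 also treats a numeric 0 as false; this sketch only checks for an empty first column.

```python
def fill_down(lines, delimiter=";"):
    """Fill empty columns of continuation lines from the last numbered line."""
    saved = []   # fields of the most recent line with a non-empty first column
    result = []
    for line in lines:
        line = line.rstrip("\r\n")
        fields = line.split(delimiter)
        if fields[0]:
            saved = fields  # like: $1{Save=$0}
        elif line.startswith(delimiter):  # like: /^;/{...}
            # assigning to a field past the end extends the record in awk
            fields.extend([""] * (len(saved) - len(fields)))
            for i, previous_value in enumerate(saved):
                if not fields[i]:
                    fields[i] = previous_value
        result.append(delimiter.join(fields))  # like the trailing 1 (print)
    return result


for line in fill_down(["1;;fwxcluster;mcast_vrrp;;vrrp", ";;;;;igmp"]):
    print(line)
# 1;;fwxcluster;mcast_vrrp;;vrrp
# 1;;fwxcluster;mcast_vrrp;;igmp
```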

Hey Cluber, thanks for this code.
My knowledge of awk is basically nil.
I ran it and got to this:

pgawk: script.awk:1: awk -F\; '
pgawk: script.awk:1:       ^ backslash not last character on line

I'm running this with Gawk for Windows (not at my laptop here).
Is that the reason why I'm getting this error?
If it is, I'll get a Linux VM running to test it out.

Thanks.

Hi guys!

If anyone needs it, I ended up solving this with this code:

import csv

# adjust these 4 lines
WRITE_EMPTIES = False  # True: also write one empty row per merged line
INFILE = "input.csv"
OUTFILE = "output.csv"
SEPARATOR = "|"  # placed between the values merged into one column


def write_rule(out_writer, rule, empties_to_write):
  """Write a finished rule, optionally followed by its empty rows."""
  out_writer.writerow(rule)
  if WRITE_EMPTIES and empties_to_write:
    out_writer.writerows([[""] * len(rule)] * empties_to_write)


# Python 3: the csv module expects text-mode files opened with newline=""
with open(INFILE, "r", newline="") as in_file, \
     open(OUTFILE, "w", newline="") as out_file:
  reader = csv.reader(in_file, delimiter=";")
  out_writer = csv.writer(out_file, delimiter=";")
  previous = None
  empties_to_write = 0
  for row in reader:
    if not row:  # skip completely blank lines
      continue
    if row[0].strip() or previous is None:  # a new rule (or the header) starts
      if previous is not None:
        write_rule(out_writer, previous, empties_to_write)
        empties_to_write = 0
      previous = row
    else:  # continuation line: append its values to the current rule
      previous = [
        SEPARATOR.join(item for item in (existing, new) if item)
        for existing, new in zip(previous, row)
      ]
      empties_to_write += 1
  if previous is not None:  # take care of the last rule
    write_rule(out_writer, previous, empties_to_write)

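You can still use gawk on Windows. The error appears because the whole shell command line ended up inside script.awk; the script file should contain only the awk program itself.

Place the code in a file such as expand_csv.awk: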

BEGIN{OFS=FS}
$1{Save=$0}
/^;/{
  for(i=split(Save,Prev,";");i;i--)
    if (!length($i)) $i=Prev[i]
}1

And run it from the Windows command line like this:

C:> awk -F; -f expand_csv.awk infile