awk and sum

This is my file

vol0 285GB
vol0.snapshot 15GB
vol11_root 0GB
vol12_root 47GB
vol12_root.snapshot 2GB

I need the output

vol0 285GB,vol0.snapshot 15GB,sum-300GB
vol11_root 0GB,nosnap,sum-0Gb
vol12_root 47GB,vol12_root.snapshot 2GB,49GB

I was trying to use paste -d, - -, but I am having an issue: a line should be joined only with the following line if that line has .snapshot, and if no snapshot is found I need to add nosnap instead.

I have no idea how to add the two values and print the result as the sum.

What operating system are you using?

What shell are you using?

What did you try with paste ?

How are we supposed to guess when the sum is to be printed with "GB" and when it is to be printed with "Gb"?

How are we supposed to guess when "sum-" is to be included in the output and when it is to be omitted?

Are all numbers to be added given as "GB" values? Or could "KB", "MB", "TB" and/or other multipliers be present?

What operating system are you using?
I am using Cygwin

What shell are you using?

shell

What did you try with paste ?

cat file|paste -d, - - (I used this command to join each pair of lines, but the issue is that some lines don't have snapshot details.)
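
To illustrate the problem described above, here is a minimal sketch that feeds the sample data from the first post to the same paste command. The function name demo_paste is only a label for this illustration. Because paste joins lines strictly in pairs, the volume without a snapshot throws the pairing off from that point on:

```shell
# Join every two consecutive input lines with a comma.
# demo_paste is a hypothetical helper name used only for this illustration.
demo_paste() {
    printf '%s\n' \
        'vol0 285GB' \
        'vol0.snapshot 15GB' \
        'vol11_root 0GB' \
        'vol12_root 47GB' \
        'vol12_root.snapshot 2GB' |
    paste -d, - -
}

demo_paste
# First two lines of output:
#   vol0 285GB,vol0.snapshot 15GB
#   vol11_root 0GB,vol12_root 47GB
```

The second output line pairs vol11_root with vol12_root instead of marking it nosnap, and the vol12_root snapshot line is left without its volume.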

How are we supposed to guess when the sum is to be printed with "GB" and when it is to be printed with "Gb"?
All will be in GB

How are we supposed to guess when "sum-" is to be included in the output and when it is to be omitted?

Once I get the two values joined, e.g.
"vol0 285GB,vol0.snapshot 15GB", I want to use awk to sum the integers

Are all numbers to be added given as "GB" values? Or could "KB", "MB", "TB" and/or other multipliers be present?

Here is one approach:

awk '
        {
                match ( $0, /vol[0-9]*/ )
                vol = sprintf ( "%s", substr( $0, RSTART, RLENGTH ) )
        }
        !/snapshot/ {
                A[vol FS "orig"] = $0
                T[vol] += ( $NF + 0 )
        }
        /snapshot/ {
                A[vol FS "snap"] = $0
                T[vol] += ( $NF + 0 )
        }
        END {
                # note: for ( k in T ) visits the volumes in an unspecified order
                for ( k in T )
                        print A[k FS "orig"], ( A[k FS "snap"] ? A[k FS "snap"] : "nosnap" ), "sum-" T[k] "GB"
        }
' OFS=, file

Another:

awk '
{
   w=$1; sub("[.].*", "", w);
   if(! a[w]) {b[c++]=w; g=0;}
   a[w]=a[w] $0 ",";
   if ($NF ~ /[0-9]*GB$/) {l=$NF; gsub("[^0-9]", "", l); s[w]=(g+=l);}
}
END {
   for (i=0; i<c; i++) {
      print a[b[i]] ((a[b[i]] ~ /snapshot/) ? "" : "nosnap,") "sum-" s[b[i]] "GB";
   }
}
' datafile

Or

awk '
BG      {BG = 0
         if (/snap/)    {SUM += $2
                         printf "%s,sum-%dGB\n", $0, SUM
                         SUM = 0
                         next
                        }
         else            printf "%s,sum-%dGB\n", "nosnap", SUM
         SUM = 0
        }
!BG     {printf "%s,", $0
         SUM += $2
         BG = 1
        }
END     {if (BG) printf "%s,sum-%dGB\n", "nosnap", SUM   # last volume had no snapshot line
        }
' file

I can see that this works great, but I still don't know how it works.

awk '
{
   w=$1; sub("[.].*", "", w);                                                      # copy first word; strip everything from the first dot (e.g. ".snapshot") to get the volume name
   if(! a[w]) {b[c++]=w; g=0;}                                                     # if volume not read before, store it in b (keeps the order read); reset gigabyte sum
   a[w]=a[w] $0 ",";                                                               # append the line and a comma to the string stored for this volume
   if ($NF ~ /[0-9]*GB$/) {l=$NF; gsub("[^0-9]", "", l); s[w]=(g+=l);}             # strip non-digits from last word; add value to sum for volume
}
END {
   for (i=0; i<c; i++) {                                                           # loop over volumes in the order read
      print a[b[i]] ((a[b[i]] ~ /snapshot/) ? "" : "nosnap,") "sum-" s[b[i]] "GB"; # print stored line(s), add "nosnap," if no snapshot line was found, then the sum
   }
}
' datafile
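
One detail the comments do not spell out is why sizes such as 285GB can be added at all. When awk uses a string in arithmetic, it takes the leading numeric part and ignores the rest, which is what expressions like $NF + 0 and SUM += $2 in the other solutions rely on. A small sketch follows; the function name add_sizes and the three-field input line are made up for this illustration:

```shell
# Show that awk uses the leading digits of "285GB" and "15GB" in arithmetic.
# add_sizes is a hypothetical helper; it expects: name size1 size2
add_sizes() {
    awk '{ print $2 + 0, $2 + $3 }'
}

echo 'vol0 285GB 15GB' | add_sizes
# prints: 285 300
```

The gsub("[^0-9]", "", l) call in the commented script reaches the same number by deleting every non-digit character first.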

Another approach:

awk '
{
  split($1,F,".")
  i=F[1]
  A[i]=A[i] $0 ","
  T[i]+=$2
}
END {
  for(i in A)
    printf "%ssum-%sGB\n",A[i],T[i]
}
' file

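But you did not answer one of Don Cragun's questions in post #2: whether all numbers are given as "GB" values, or whether "KB", "MB", "TB" and/or other multipliers could be present.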

This is quite essential, because the approaches in this thread will fail if the file can also contain KB, MB or TB values.
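
If other multipliers could appear, the sizes would have to be converted to a common unit before adding. The following is only a sketch under assumptions that are not stated anywhere in this thread: the suffixes are exactly KB, MB, GB or TB, and each step is a factor of 1024. The names sum_in_gb and to_gb are made up for this illustration.

```shell
# Sketch: sum volume and snapshot sizes after converting them to GB.
# Assumptions (not from the thread): suffixes KB/MB/GB/TB, 1024-based steps.
sum_in_gb() {
    awk '
    # Convert a size such as "512MB" to gigabytes.
    function to_gb(s,    num, unit) {
        num = s + 0                            # leading numeric part
        unit = s; gsub(/[0-9.]/, "", unit)     # what remains is the suffix
        if (unit == "KB") return num / 1024 / 1024
        if (unit == "MB") return num / 1024
        if (unit == "TB") return num * 1024
        return num                             # "GB" or no suffix: treat as GB
    }
    {
        key = $1; sub(/[.].*/, "", key)        # volume name without ".snapshot"
        if (!(key in line)) order[cnt++] = key # remember first-seen order
        line[key] = line[key] $0 ","
        if ($1 ~ /[.]snapshot$/) snap[key] = 1
        total[key] += to_gb($NF)
    }
    END {
        for (i = 0; i < cnt; i++) {
            k = order[i]
            print line[k] ((k in snap) ? "" : "nosnap,") "sum-" total[k] "GB"
        }
    }' "$@"
}
```

It can be called as sum_in_gb file, or with the data on standard input. With GB-only input it produces the same lines as the order-preserving solution earlier in the thread.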


The values will be in GB only.