Help with manipulating the output on a script

rrb2009 · May 2, 2012, 3:02pm

Hi All,

I have a question about eliminating spaces from a command's output.

A command returns output like this:

Attribute                            Value 
---------------                   ---------------

Total Capacity                     500 GB 
Utilization                            10 % 
Used Capacity                      50 GB

I need to convert that output to a CSV or XLS file, laid out like this:

Total Capacity   Utilization     Used Capacity 

   500 GB            10%              50 GB

Can anyone show me how to do this using sed or awk? I tried a few things but nothing worked.

Corona688 · May 2, 2012, 3:12pm

$ cat totalcap.awk

# Field separator is two or more spaces
# ORS="\r\n" makes the text Windows-readable,
# omit it to make an ordinary flatfile.
BEGIN { FS="   *"; ORS="\r\n"     }
# Ignore first three lines, then store in array, keep order in O
NR>3 { sub(/ *$/, ""); A[$1]=$2; O[++L]=$1 }

END {
        S="";   for(N=1; N<=L; N++) S=S","O[N];
        print substr(S, 2);
        S="";   for(N=1; N<=L; N++) S=S","A[O[N]];
        print substr(S, 2);
}

$ awk -f totalcap.awk data

Total Capacity,Utilization,Used Capacity
500 GB,10 %,50 GB

$

Depending on your system you may need to use nawk or gawk instead of awk.

rrb2009 · May 2, 2012, 3:31pm

Thanks for your quick help. How should I run this along with the command?

command | awk -f <awkfilename> or awk -f <awkfilename> ' command ' ?

Corona688 · May 2, 2012, 3:33pm

command | awk -f totalcap.awk > outputfile

rrb2009 · May 2, 2012, 3:39pm

Hi,

I am sorry, the first command I posted works, but my command output has a couple of other statistics apart from the 3 I gave in the example. How do I pick only those 3? Like rows 3, 4 and 5 out of 10 rows.

Corona688 · May 2, 2012, 3:46pm

You know, I could tell the input you posted was edited and reduced, so I was very, very careful to make sure my code could include things you didn't ask for.

I guess I guessed in the wrong direction.

# Field separator is two or more spaces
BEGIN { FS="   *";      OFS=",";        ORS="\r\n"      }
# Ignore first three lines, then store in array
NR>3 { sub(/ *$/, ""); A[$1]=$2 }

END {
        print "Total Capacity", "Utilization", "Used Capacity";
        print A["Total Capacity"], A["Utilization"], A["Used Capacity"];
}

rrb2009 · May 2, 2012, 4:28pm

Thanks once again, but it is not printing the values of these 3. Sorry to keep bugging you.

Corona688 · May 2, 2012, 4:38pm

What exactly is it doing, then?

...how about you post your actual input so I can write a script that deals with that?

rrb2009 · May 2, 2012, 4:52pm

Hi,

This is the actual output of the command.

Attribute                        Value
------------------------- ----------------------------
State                            Full Access
Active sessions                  0
Total capacity                   7.8 TB
Capacity used                    3.2 TB
Server utilization               40.8%
Bytes protected                  0 bytes
Bytes protected quota            Not configured
License expiration               Never
Time since Server initialization 203 days 21h:42m
Last checkpoint                  2012-05-02 12:45:10 PDT
Last validated checkpoint        2012-05-02 11:04:48 PDT

I need output like this:

Total Capacity  Utilization  Used Capacity

7.8 TB               40.8 %      3.2 TB

But right now it only prints this:

Total Capacity ,Utilization ,Used Capacity,,,

Corona688 · May 2, 2012, 5:07pm

Yes, that's completely different than the output you originally posted. I'll adjust my script for it.

Corona688 · May 2, 2012, 5:14pm

The output you want, by the way, can't be used for import into Excel... how would it tell the difference between cell-separating spaces and word-separating spaces? So I've given you what you originally asked for, CSV, as in "comma-separated values". Excel can import that easily.

# Field separator is two or more spaces
BEGIN { FS="   *";      OFS=",";        ORS="\r\n"      }
# Ignore first two lines, then store in array
NR>2 { sub(/ *$/, ""); A[$1]=$2; }

END {
        print "Total Capacity", "Utilization", "Used Capacity";
        print A["Total capacity"], A["Server utilization"], A["Capacity used"];
}

rrb2009 · May 2, 2012, 5:24pm

Thanks a ton. It works like a charm. Can you explain the script to me when you have time, or give me a link where I can read more about writing awk scripts?

Corona688 · May 2, 2012, 6:39pm

The basic principle is that awk is one big loop of its own. It runs each top-level code block in the script, in order, for every line it processes. You can put expressions in front of the blocks to control when they run, store things in variables, and so on.
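Here's a small sketch of that loop. The three-line input is made up and fed in with printf, so nothing here depends on your real command:

```shell
out=$(printf 'one\ntwo\nthree\n' | awk '
# No pattern: this block runs for every line.
{ count++ }
# Pattern NR==2: this block runs only for the second line.
NR==2 { second=$0 }
# END: this block runs once, after the last line.
END { print count " lines, second was " second }')
printf '%s\n' "$out"
```

That should print "3 lines, second was two".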

It also understands columns. $1, $2, $3 and so forth refer to columns. Usually it figures that any whitespace splits a column, but here I've told it specially that the separator has to be two or more spaces ( FS="   *", which is two spaces followed by " *" ). You can also change what goes between columns on the way out (OFS=",") and what ends each line on the way out (ORS="\r\n").
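A rough sketch of what those three variables do, using one made-up line in the same shape as your data. The variable names default, custom and crlf_bytes are just for illustration:

```shell
line='Total capacity     7.8 TB'

# Default splitting: any run of whitespace separates columns,
# so the first column is only the first word.
default=$(printf '%s\n' "$line" | awk '{ print $1 }')

# FS of two or more spaces keeps each cell whole;
# OFS is what print puts between its comma-separated arguments.
custom=$(printf '%s\n' "$line" | awk 'BEGIN { FS="   *"; OFS="," } { print $1, $2 }')

# ORS="\r\n" ends each printed line with CR LF, so "x" becomes 3 bytes.
crlf_bytes=$(printf 'x\n' | awk 'BEGIN { ORS="\r\n" } { print }' | wc -c | tr -d ' ')

printf '%s\n%s\n%s\n' "$default" "$custom" "$crlf_bytes"
```

That should print "Total", then "Total capacity,7.8 TB", then 3.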

And a lot more. Even the meaning of 'line' can be altered by setting RS, if you needed records to end at "*" instead of \n.
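For instance, a minimal sketch with RS set to "*", on a made-up string that contains no newlines at all:

```shell
# Each "*" ends a record, so awk sees three "lines" here.
out=$(printf 'alpha*beta*gamma' | awk 'BEGIN { RS="*" } { print NR ": " $0 }')
printf '%s\n' "$out"
```

That should print "1: alpha", "2: beta" and "3: gamma" on separate lines.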

It has associative arrays, so you can do things like A["string"]=32; print A["string"]; and get 32 out of it.
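One thing to watch, sketched below with values copied from your output: array keys are plain strings and case-sensitive, so a lookup has to match the stored attribute name exactly. Asking for a key that was never stored just gives an empty string, which is what a row of empty CSV cells looks like.

```shell
out=$(awk 'BEGIN {
        A["Total capacity"] = "7.8 TB"
        # Exact key: prints the stored value.
        print "[" A["Total capacity"] "]"
        # Different capitalization: a different key, so this is empty.
        print "[" A["Total Capacity"] "]"
}')
printf '%s\n' "$out"
```

That should print "[7.8 TB]" and then "[]".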

You can even do shell-like redirection in it, in case you want to write to anything but stdout: print > "filename". Once you start writing to a file, it stays open until you close() it or awk exits.
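A quick sketch of that, assuming a throwaway directory from mktemp and a made-up file name letters.txt passed in as the awk variable out:

```shell
dir=$(mktemp -d)

printf 'a 1\nb 2\nc 3\n' | awk -v out="$dir/letters.txt" '
# The first print truncates the file; later prints append,
# because awk keeps the file open between lines.
{ print $1 > out }
END { close(out) }'

result=$(cat "$dir/letters.txt")
printf '%s\n' "$result"
rm -r "$dir"
```

All three letters end up in the file, one per line, even though the redirection is a single ">" rather than ">>".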

# The BEGIN section is run before awk reads any input.
# Use it to set up these variables.
BEGIN { FS="   *";      OFS=",";        ORS="\r\n"      }

# NR is the number of records (lines) read so far.
# The code block here will only get run when NR is greater than two.
# Ergo, it ignores the first two lines.
NR>2 {
        # Get rid of some spaces at the end of the line
        sub(/ *$/, "");
        # Store like A["Total capacity"]="7.8 TB"
        A[$1]=$2; }

# This code block only gets run once awk has read all of its input.
END {
        # Print our titles
        print "Total Capacity", "Utilization", "Used Capacity";
        # Print the various things we stored in the array before
        print A["Total capacity"], A["Server utilization"], A["Capacity used"];
}
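If you want to try the script without the real command, here is a sketch that replays part of the output you posted. fake_command is a hypothetical stand-in for your command, and ORS="\r\n" is left out on purpose so the result is easy to read in a terminal:

```shell
# Hypothetical stand-in for the real command; it replays part of the posted output.
fake_command() {
cat <<'EOF'
Attribute                        Value
------------------------- ----------------------------
State                            Full Access
Active sessions                  0
Total capacity                   7.8 TB
Capacity used                    3.2 TB
Server utilization               40.8%
EOF
}

out=$(fake_command | awk '
# Field separator is two or more spaces
BEGIN { FS="   *"; OFS="," }
# Ignore first two lines, then store in array
NR>2  { sub(/ *$/, ""); A[$1]=$2 }
END {
        print "Total Capacity", "Utilization", "Used Capacity"
        print A["Total capacity"], A["Server utilization"], A["Capacity used"]
}')
printf '%s\n' "$out"
```

With that input it should print the header row followed by "7.8 TB,40.8%,3.2 TB".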