Extract portion of data

Hi Gurus,

I need some help extracting some of this information and massaging it into the desired output shown below.

I need to extract the last data row, together with the header, from the sample below. The last row is usually the most recent date, for example:
2012-06-01 142356 mb 519 -219406 mb 1 -77049 mb

Notice that in the sample output below there is a new column called "Label".

FYI, this is just sample data. The actual data may contain more rows, so the listing will be longer.

Sample data:

Date          New Data #AB       Removed #CD    Net Change
----------  ---------- -----  ---------- -----  ----------
2012-05-27   100046 mb 580    -236329 mb 1      -136282 mb
2012-05-28   117905 mb 628    -223216 mb 1      -105310 mb
2012-05-29   153561 mb 706    -214508 mb 1       -60946 mb
2012-05-30   216141 mb 629    -222977 mb 1        -6835 mb
2012-05-31   180524 mb 468    -221210 mb 1       -40685 mb
2012-06-01   142356 mb 519    -219406 mb 1       -77049 mb
----------  ---------- -----  ---------- -----  ----------
Average      157957 mb        -158746 mb           -789 mb

Top 10 Servers Detected:
------------------------
Total for all servers                  4106890 mb      100.0%
  server1.abc.com                      1298172 mb      31.6%
  server2.abc.com                      708845 mb       17.3%

Sample output:

Label        Date        New Data   #AB    Removed    #CD    Net Change
-------      ----------  ---------- -----  ---------- -----  ----------
Statistic    2012-06-01  142356 mb  519    -219406 mb 1       -77049 mb

Is it possible to generate this kind of output?

I would appreciate any help and advice.

Thank you.

  • Jack

Try:

awk 'NR==1{print "Label\t\t" $0}NR==2{print "-------\t\t" $0} NR>2 && /^--/{print "Statistic\t"p; exit}{p=$0}' infile
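For anyone following along, the one-liner can be spelled out with comments. This is only a sketch run against a trimmed copy of the sample data; it assumes the header is on line 1 of the file, and the temporary file and the `result` variable are placeholders added for illustration.

```shell
# Trimmed copy of the sample report; the header is assumed to be on line 1.
infile=$(mktemp)
cat > "$infile" <<'EOF'
Date          New Data #AB       Removed #CD    Net Change
----------  ---------- -----  ---------- -----  ----------
2012-05-31   180524 mb 468    -221210 mb 1       -40685 mb
2012-06-01   142356 mb 519    -219406 mb 1       -77049 mb
----------  ---------- -----  ---------- -----  ----------
Average      157957 mb        -158746 mb           -789 mb
EOF

result=$(awk '
  NR == 1 { print "Label\t\t" $0 }      # header line gets a Label column
  NR == 2 { print "-------\t\t" $0 }    # first dash line gets matching dashes
  NR > 2 && /^--/ {                     # second dash line closes the table
    print "Statistic\t" p               # p holds the line just before it
    exit                                # nothing after the table is read
  }
  { p = $0 }                            # remember the current line
' "$infile")

printf '%s\n' "$result"
rm -f "$infile"
```

The key idea is that `p` always holds the previous line, so when the closing dash line arrives, `p` is the last data row.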

Hi Scrutinizer,

Thanks for your response and help.

I've tested it and it's working as expected. Great!

However, I've just realised that my script needs to compile several sets of data and append similar data from other sources.

Is it possible to extract ONLY the last row, without the header and dash line, from the sample below? The last row usually indicates the most recent date.

For example:

2012-06-01 142356 mb 519 -219406 mb 1 -77049 mb

I'd greatly appreciate your advice.

Thank you.

  • Jack

Given your sample input, this should work:

awk ' /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ { k = $0 } END { print k }' t49.data

It will have problems if other lines (after the section you indicated) have a date in the first field.

Or, stripping down the script in #2 would give:

awk 'NR>2 && /^--/{print p; exit}{p=$0}' infile
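To make the caveat about later dated lines concrete, here is a small sketch comparing the two commands. The trailing "report generated" line is invented purely for illustration and is not part of the original sample; the variable names are placeholders too.

```shell
# Trimmed sample plus one hypothetical dated line after the table.
infile=$(mktemp)
cat > "$infile" <<'EOF'
Date          New Data #AB       Removed #CD    Net Change
----------  ---------- -----  ---------- -----  ----------
2012-05-31   180524 mb 468    -221210 mb 1       -40685 mb
2012-06-01   142356 mb 519    -219406 mb 1       -77049 mb
----------  ---------- -----  ---------- -----  ----------
Average      157957 mb        -158746 mb           -789 mb
2012-06-02 report generated
EOF

# Date-pattern version: keeps the last line starting with a date, anywhere in the file.
by_date=$(awk '/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ { k = $0 } END { print k }' "$infile")

# Dash-line version: exits at the second dash line, so later lines are never read.
by_dash=$(awk 'NR>2 && /^--/{print p; exit}{p=$0}' "$infile")

printf 'by date pattern: %s\n' "$by_date"
printf 'by dash line:    %s\n' "$by_dash"
rm -f "$infile"
```

With this input the date-pattern version picks up the invented trailing line, while the dash-line version still returns the 2012-06-01 row.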

Hi Scrutinizer,

Thanks for your response.

I've tried the second awk and it's working OK.

However, I forgot to mention that there is a need to insert the word: "Statistic".

So, the correct output should be:

Statistic    2012-06-01  142356 mb  519    -219406 mb 1       -77049 mb

The spacing and alignment of this line should match the first output (the one that includes the header and dash line), so that it's easier to append it as the next line.

Could you advise on how to include this word in the second awk?

Thanks a lot.

  • Jack

Hi Jack,

I think that if you compare #5 to #2, you should be able to make this change yourself, no?

S.


Hi Scrutinizer,

I've noticed that when this output is sent out via email, the columns no longer line up.

Is there any better way to manage this output alignment?

Thanks a lot.

  • Jack

How are you sending it out? You could send it as a text attachment. In any case, the text needs to be displayed in a fixed-width font. It could be that either the mail program you send with or the one you read the email with uses a proportional font for display, which would mess up the way the text is displayed...

Hi Scrutinizer,

I'm using this method (/usr/sbin/sendEmail) to send out the output.

Any advice?

Thanks.

  • Jack

---------- Post updated at 08:00 AM ---------- Previous update was at 07:58 AM ----------

Hi Scrutinizer,

Got it. Managed to get the output after comparing #5 to #2.

Thanks.

  • Jack

---------- Post updated at 10:04 PM ---------- Previous update was at 08:00 AM ----------

Hi All

I'd like to ask whether it's possible to compile the following output with this alignment. Hopefully it will be formatted better when it's sent out via email.

Label      Date        New Data   #AB  Removed    #CD  Net Change
Statistic  2012-06-03  21807 mb   206  -46503 mb  1    -24695 mb
Statistic  2012-06-04  28 mb      1    0 mb            28 mb
Statistic  2012-06-03  13926 mb   222  0 mb            13926 mb
Statistic  2012-06-03  166624 mb  704  0 mb            166624 mb

I'd appreciate any help.

Thank you.

  • Jack
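One possible approach, offered as a sketch rather than a tested answer for the real data: pad every column to a fixed width with awk's printf instead of relying on tabs. It assumes each input line is an already extracted data row, that fields are separated by whitespace, and that a row has 9 fields when #CD is present and 8 when it is blank. The column widths and the `rows` and `table` names are arbitrary choices for illustration.

```shell
# Hypothetical input: extracted rows from several reports (values from the table above).
rows=$(mktemp)
cat > "$rows" <<'EOF'
2012-06-03 21807 mb 206 -46503 mb 1 -24695 mb
2012-06-04 28 mb 1 0 mb 28 mb
2012-06-03 13926 mb 222 0 mb 13926 mb
2012-06-03 166624 mb 704 0 mb 166624 mb
EOF

table=$(awk '
  BEGIN {
    # One format string for header and data keeps every column the same width.
    fmt = "%-10s %-10s %10s %5s %10s %5s %10s\n"
    printf fmt, "Label", "Date", "New Data", "#AB", "Removed", "#CD", "Net Change"
  }
  {
    # Assumption: 9 fields means #CD is present, 8 fields means it is blank.
    cd  = (NF == 9) ? $7 : ""
    net = $(NF - 1) " " $NF
    printf fmt, "Statistic", $1, $2 " " $3, $4, $5 " " $6, cd, net
  }
' "$rows")

printf '%s\n' "$table"
rm -f "$rows"
```

Because the padding is done with spaces, every line comes out the same length. As noted earlier in the thread, it will still only look aligned when the mail client displays it in a fixed-width font.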