Help in printing records where there is a 'header' in the first record ???

newbie_01 · December 28, 2017, 1:01am

Hi,

I have a backup report that unfortunately has some kind of hanging indent thing where the first line contains one column more than the others

I managed to get the output that I wanted using awk, but just wanting to know if there is short way of doing it using the same awk

Below is what I have so far:

This is the content of the report so far, hopefully it doesn't change. I change the information to not show the real information but the format is more or less correct.

$: cat /tmp/x.txt
Some text here
Some header information here
Some header information here
------- ------------- ------- ------------- -------------------- --------- ----
xxx2ns666-pr-non-xxxTEST xx_tmp_abcq xxx2ns111-dr-01h-xxxTEST vol_abcp_zz xxx1ns999_vol_abcp_zz_yy.1 online RO
                         xx_tmp_abct xxx2ns111-dr-01h-xxxTEST vol_abcp_zz xxx1ns999_vol_abcp_zz_yy.16 online RO
                         xx_tmp_defd xxx2ns111-dr-01h-xxxTEST vol_defp_zz xxx1ns999_vol_defp_zz_yy.16 online RO
                         xx_tmp_ghiq xxx2ns111-dr-01h-xxxTEST vol_ghip_zz xxx1ns999_vol_ghip_zz_yy.16 online RO
                         xx_tmp_ghit xxx2ns111-dr-01h-xxxTEST vol_ghip_zz xxx1ns999_vol_ghip_zz_yy.17 online RO
5 entries were displayed.

Below is my long very newbie awk command

cat /tmp/x.txt | grep xx_tmp | grep -v "^$" | awk 'NR==1 { print $2 " " $3 " " $4 " " $5 " " $6 " " $7 } NR>1 { print $1 " " $2 " " $3 " " $4 " " $5 " " $6 }'

xx_tmp_abcq xxx2ns111-dr-01h-xxxTEST vol_abcp_zz xxx1ns999_vol_abcp_zz_yy.1 online RO
xx_tmp_abct xxx2ns111-dr-01h-xxxTEST vol_abcp_zz xxx1ns999_vol_abcp_zz_yy.16 online RO
xx_tmp_defd xxx2ns111-dr-01h-xxxTEST vol_defp_zz xxx1ns999_vol_defp_zz_yy.16 online RO
xx_tmp_ghiq xxx2ns111-dr-01h-xxxTEST vol_ghip_zz xxx1ns999_vol_ghip_zz_yy.16 online RO
xx_tmp_ghit xxx2ns111-dr-01h-xxxTEST vol_ghip_zz xxx1ns999_vol_ghip_zz_yy.17 online RO

The grep -v "^$" is just a sanity thing. Is there a shorter way that does the same?

Don_Cragun · December 28, 2017, 1:41am

The grep -v "^$" is NOT just a sanity thing. It is a waste of system resources that only serves to make your script run slower. Any line that is matched by the first grep in your pipeline can't possibly be an empty line that will be discarded by the second grep in your pipeline. The cat in your pipeline is an unnecessary use of cat . It, again, only serves to make your script run slower.

The following script produces the same output as your script. But, of course, it depends on your sample input being a realistic sample of the actual contents of /tmp/x.txt . If it isn't, all bets are off.

awk 'sub(/^.*xx_tmp/, "xx_tmp")' /tmp/x.txt

RudiC · December 28, 2017, 2:36am

That looks like a query result from a DB created with a special formatting. Any chance to modify the query's format in the first place?