Format columns

I have a file with columns in this format:

Quote

2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661

Surface Polyline
Color: (0.0400229 1 0.845803)
2 Points:
  np           x               y             Depth
   0        767216.5         2120580     917.2959
   1        766846.8         2120435     2779.101

Surface Polyline
Color: (0.0400229 1 0.845803)
3 Points:
  np           x               y             Depth
   0        767160.6         2120424     909.7954
   1        766825.5         2120292      2767.58
   2        766825.5         2120292     2793.502

............................................................
Unquote

I would like to convert the values into this layout:

   x                   y        group  z
  767203.9         2120710  0   917.2959
  767071.6         2120658  0   2793.661
  767216.5         2120580  1   917.2959
  766846.8         2120435  1   2779.101
  767160.6         2120424  2   909.7954
  766825.5         2120292  2   2767.58
  766825.5         2120292  2   2793.502

That is, x and y, followed by a group number for the block the point belongs to, then z (e.g. the 2 points in the first block become group 0, the 2 points in the 2nd block become group 1, the 3 points in the 3rd block become group 2, etc.).

I could successfully output the x, y and z values, but I am stuck on assigning the group while looping through the np values.

Any ideas?

Don't forget the code tag.

 awk 'BEGIN{i=0;print "x\t\ty\tgroup\tz"}/^$/{i++} $1~/^[0-9]/&&NF>3 {print $2,$3,i,$4 }' OFS="\t" infile

Hi rd

Almost done. The problem is still in the grouping. As you may have noticed, if np occurs 3 times, then all three rows should get the same group; e.g. if np occurs 4 times, say in rows 8, 9, 10 and 11, then the group id for those four rows will be the same, say 8 (entered four times).

I ran the awk you suggested and this was the output:

x               y       group   z
752670.2\t2150082\t2\t2777.672
752729.8\t2149930\t2\t904.061
752786.5\t2150128\t3\t913.8535
752726.9\t2150280\t3\t2784.2
752783.6\t2150477\t4\t959.5513
752902.9\t2150173\t4\t2807.049
752899.9\t2150523\t5\t904.061
753019.2\t2150219\t5\t2774.407
752897\t2150872\t6\t923.6458

(Each group id occurs only twice in the above output.)

Any suggestions, please?

I don't understand your problem.

It seems your awk doesn't handle OFS when it is given that way, so try this code:

awk '
BEGIN{i=0;print "x\t\ty\tgroup\tz"}
/^$/{i++}
$1~/^[0-9]/&&NF>3 {printf "%s\t%s\t%s\t%s\n", $2, $3, i, $4 }
' infile
x               y       group   z
767203.9        2120710 0       917.2959
767071.6        2120658 0       2793.661
767216.5        2120580 1       917.2959
766846.8        2120435 1       2779.101
767160.6        2120424 2       909.7954
766825.5        2120292 2       2767.58
766825.5        2120292 2       2793.502

So with your sample I see three np lines, so the group ids run from 0 to 2.

Is there any problem?
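
If the literal \t in your output does come from the OFS="\t" assignment placed after the program text, one way to sidestep it is to set OFS inside BEGIN. Here is a minimal sketch of that, assuming the input arrives on stdin; the function name reformat_points is only made up for illustration, and the grouping still relies on a single blank line between blocks, as above:

```shell
# Hypothetical helper: reads the point listing on stdin and
# writes tab-separated x, y, group, z lines to stdout.
reformat_points() {
  awk '
    BEGIN { OFS = "\t"; group = 0; print "x", "y", "group", "z" }
    /^$/  { group++ }                       # a blank line ends the current block
    $1 ~ /^[0-9]/ && NF > 3 { print $2, $3, group, $4 }   # data rows only
  '
}
```

Call it as reformat_points < infile. Setting OFS in BEGIN means the "\t" is interpreted as part of the awk program itself, so it does not depend on how a given awk treats command-line assignments.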

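use strict;
use warnings;

local $/ = "\n\n";    # read one blank-line-separated block per record
while (<DATA>) {
    foreach my $line (split /\n/) {
        # skip header lines (they contain letters) and empty lines
        next if $line =~ /[a-zA-Z]/ or $line !~ /\S/;
        my @t = split ' ', $line;
        # $. counts blocks from 1, so $. - 1 is the 0-based group id
        print "$t[1] $t[2] ", $. - 1, " $t[3]\n";
    }
}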
__DATA__
2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661

Surface Polyline
Color: (0.0400229 1 0.845803)
2 Points:
  np           x               y             Depth
   0        767216.5         2120580     917.2959
   1        766846.8         2120435     2779.101

Surface Polyline
Color: (0.0400229 1 0.845803)
3 Points:
  np           x               y             Depth
   0        767160.6         2120424     909.7954
   1        766825.5         2120292      2767.58
   2        766825.5         2120292     2793.502

Use this:

awk 'BEGIN{OFS="\t";i=0;print "x\t\ty\tgroup\tz"} /^$/ {i++} $1~/^[0-9]/ && $2~/[0-9]/ {print $2,$3,i,$4}' file

Thanks to Rahul, Summer and rdc.

The problem is that np does not only occur 3 times; it may occur 4 times, with 0 1 2 3, in the pattern of rows.
The input file quoted originally was cut only because of space constraints. Rahul's suggestion also outputs the same result as rdc's, where the grouping is likewise limited to two rows (here the occurrences of 0 1 2 ...).

Could you please throw some light on this? I am using nawk; could that be the reason?

---------- Post updated at 03:35 PM ---------- Previous update was at 03:07 PM ----------

Hi rdcwayx,

Great, thanks for the mod to the awk. Yes, I re-ran the script and I think it works, though there is still a problem in the grouping. I would like the grouping to always begin from 0: if np occurs three times in the first 3 rows, the group for those 3 rows should be 0; if the next np block has two rows, those 2 rows should have group id 1, and so on.

Great idea, and thanks a lot.

Can you post a sample in which np occurs 4 times or more?
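
In the meantime, one guess: if the real file has blank lines before the first block, or more than one blank line between blocks, counting blank lines will make the group ids start above 0 or skip numbers. Below is a rough sketch that counts the "np" header lines instead, so every row under the same header gets the same id no matter how many points the block has. It assumes every block has a header line whose first field is exactly np and that data rows have exactly four fields; the name group_by_np is made up for illustration:

```shell
# Hypothetical helper: the group id is driven by "np" header lines,
# not by blank lines, so stray blank lines do not shift the numbering.
group_by_np() {
  awk '
    BEGIN { OFS = "\t"; group = -1; print "x", "y", "group", "z" }
    $1 == "np" { group++; next }            # each "np x y ..." header opens a new block
    group >= 0 && $1 ~ /^[0-9]+$/ && NF == 4 { print $2, $3, group, $4 }
  '
}
```

Run it as group_by_np < infile. Because group starts at -1 and is bumped by the first header, the first block is always group 0, and a block with 4 points (np 0 1 2 3) yields four rows carrying the same id.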