Format columns

cml · September 13, 2010, 2:08am

I have data in the following column format:

Quote

2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661

Surface Polyline
Color: (0.0400229 1 0.845803)
2 Points:
  np           x               y             Depth
   0        767216.5         2120580     917.2959
   1        766846.8         2120435     2779.101

Surface Polyline
Color: (0.0400229 1 0.845803)
3 Points:
  np           x               y             Depth
   0        767160.6         2120424     909.7954
   1        766825.5         2120292      2767.58
   2        766825.5         2120292     2793.502

............................................................
Unquote

I would like to convert the values into

   x                   y        group  z
  767203.9         2120710  0   917.2959
  767071.6         2120658  0   2793.661
  767216.5         2120580  1   917.2959
  766846.8         2120435  1   2779.101
  767160.6         2120424  2   909.7954
  766825.5         2120292  2   2767.58
  766825.5         2120292  2   2793.502

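Each output row holds x and y, followed by a group id and then z. The group id is the index of the block the points came from, counting from 0 (e.g. the 2 points in the first block become group 0, the 2 points in the second block become group 1, the 3 points in the third block become group 2, and so on).

I can extract the x, y and z values successfully, but I am stuck on assigning the group while looping through the np values.

Any ideas?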

rdcwayx · September 13, 2010, 2:21am

Don't forget the code tag.

 awk 'BEGIN{i=0;print "x\t\ty\tgroup\tz"}/^$/{i++} $1~/^[0-9]/&&NF>3 {print $2,$3,i,$4 }' OFS="\t" infile

cml · September 13, 2010, 7:53am

Hi rd

Almost done. The problem is still in the grouping. As you may have noticed, if np occurs 3 times, all three rows should get the same group id. For example, if np occurs 4 times, say in rows 8, 9, 10 and 11, then those four rows should all carry the same group id, say 8 (entered four times).

I ran the awk you suggested and this was the output:

x               y       group   z
752670.2\t2150082\t2\t2777.672
752729.8\t2149930\t2\t904.061
752786.5\t2150128\t3\t913.8535
752726.9\t2150280\t3\t2784.2
752783.6\t2150477\t4\t959.5513
752902.9\t2150173\t4\t2807.049
752899.9\t2150523\t5\t904.061
753019.2\t2150219\t5\t2774.407
752897\t2150872\t6\t923.6458

(Each group id occurs only twice in the above output.)

Any suggestions, please?

rdcwayx · September 13, 2010, 9:53pm

I don't understand your problem.

It seems your awk doesn't recognize OFS, so try this code:

awk '
BEGIN{i=0;print "x\t\ty\tgroup\tz"}
/^$/{i++}
$1~/^[0-9]/&&NF>3 {printf "%s\t%s\t%s\t%s\n", $2, $3, i, $4 }
' infile
x               y       group   z
767203.9        2120710 0       917.2959
767071.6        2120658 0       2793.661
767216.5        2120580 1       917.2959
766846.8        2120435 1       2779.101
767160.6        2120424 2       909.7954
766825.5        2120292 2       2767.58
766825.5        2120292 2       2793.502

With your sample I see three np lines, so the group ids run from 0 to 2.

Any problem?
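
One possible reading of the output posted earlier, offered only as a guess: the `/^$/{i++}` counter gives group 0 to the first block only when the file starts directly with that block and blocks are separated by exactly one blank line. A small check, assuming the full file might have extra blank lines ahead of the first point row (the function name `blank_before_first_point` is made up for illustration):

```shell
# Hypothetical check: how many empty lines come before the first data row?
blank_before_first_point() {
  awk '
    /^$/ { blanks++ }                       # same empty-line test the one-liner uses for i++
    NF == 4 && $1 ~ /^[0-9]+$/ {            # first point row: report the count and stop
      print blanks + 0
      exit
    }
  ' "$@"
}

# Cut-down sample with two empty lines added on top.
blank_before_first_point <<'EOF'


2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661
EOF
# prints 2
```

Run against the real file, any result above 0 is the group id the first block would get from the blank-line approach.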

summer_cherry · September 13, 2010, 10:06pm

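use strict;
use warnings;

local $/ = "";    # paragraph mode: each read returns one blank-line-separated block
while (<DATA>) {
  foreach my $line (split /\n/) {
    # skip header lines (they contain letters) and any empty lines
    next if $line =~ /[a-zA-Z]/ or $line !~ /\S/;
    my @t = split ' ', $line;
    # $. counts blocks from 1, so subtract 1 to start groups at 0
    print "$t[1] $t[2] ", $. - 1, " $t[3]\n";
  }
}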
__DATA__
2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661

Surface Polyline
Color: (0.0400229 1 0.845803)
2 Points:
  np           x               y             Depth
   0        767216.5         2120580     917.2959
   1        766846.8         2120435     2779.101

Surface Polyline
Color: (0.0400229 1 0.845803)
3 Points:
  np           x               y             Depth
   0        767160.6         2120424     909.7954
   1        766825.5         2120292      2767.58
   2        766825.5         2120292     2793.502

RahulJoshi · September 13, 2010, 10:30pm

Use this:

awk 'BEGIN{OFS="\t";i=0;print "x\t\ty\tgroup\tz"} /^$/ {i++} $1~/[0-9]/ && $2~/[0-9]/ {print $2,$3,i,$4}' file

cml · September 14, 2010, 6:05am

Thanks to Rahul, Summer and rdc.

The problem is that np does not always occur 3 times; it may occur 4 times, with rows 0 1 2 3, following the same row pattern.
The input file quoted originally was cut short only to save space. Rahul's suggestion gives the same result as rdc's, where the grouping is also limited to two rows (here, occurrences of 0 1 2 ...).

Could someone shed some light on this? I am using nawk.

---------- Post updated at 03:35 PM ---------- Previous update was at 03:07 PM ----------

Hi rdcwayx,

Great work on the modified awk. I re-ran the script and I think it works, though there is still a problem with the group. I would like the grouping to always begin from 0: if np occurs three times in the first 3 rows, the group for those 3 rows should be 0. For the next occurrence of np (say two times), those 2 rows should have group id 1, and so on.

Great idea, and thanks a lot.
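
A minimal sketch of the grouping rule described in this post, assuming every block of points is introduced by a header row whose first field is `np`, as in the quoted sample. It bumps the group id on that header row instead of on blank lines, so the first block is group 0 no matter how many blank or header lines come before it. The function name `regroup` is a placeholder, not something from the thread.

```shell
# Hypothetical helper: the group id follows the "np" header rows, not blank lines.
regroup() {
  awk '
    BEGIN      { g = -1; print "x\ty\tgroup\tz" }   # the first np header brings g to 0
    $1 == "np" { g++; next }                         # each header row opens a new group
    # data rows: four fields, with an integer point index in the first field
    NF == 4 && $1 ~ /^[0-9]+$/ { printf "%s\t%s\t%d\t%s\n", $2, $3, g, $4 }
  ' "$@"
}

# Cut-down sample from the first post, with extra empty lines added on purpose.
regroup <<'EOF'


2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661

Surface Polyline
Color: (0.0400229 1 0.845803)
3 Points:
  np           x               y             Depth
   0        767160.6         2120424     909.7954
   1        766825.5         2120292      2767.58
   2        766825.5         2120292     2793.502
EOF
```

With this input the two rows of the first block come out as group 0 and the three rows of the second block as group 1. The same function accepts a file name, e.g. `regroup infile`.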

rdcwayx · September 14, 2010, 6:05am

Can you post a sample in which np occurs 4 times or more?
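
In the meantime, a rough way to see whether blocks with 4 or more points are being picked up in full, assuming every block starts with an `N Points:` line as in the quoted sample: compare the declared N with the number of data rows that follow it. The function name `check_counts` and the output layout (block index, declared count, rows found, verdict) are made up for this sketch.

```shell
# Hypothetical check: compare each "N Points:" declaration with the rows that follow it.
check_counts() {
  awk '
    function report() {
      if (g >= 0) print g, want, got, (want == got ? "ok" : "MISMATCH")
    }
    BEGIN { g = -1 }
    NF == 2 && $2 == "Points:" { report(); g++; want = $1 + 0; got = 0; next }
    NF == 4 && $1 ~ /^[0-9]+$/ { got++ }    # count data rows in the current block
    END { report() }                         # report the last block
  ' "$@"
}

check_counts <<'EOF'
2 Points:
  np           x               y             z
   0        767203.9         2120710     917.2959
   1        767071.6         2120658     2793.661

Surface Polyline
Color: (0.0400229 1 0.845803)
3 Points:
  np           x               y             Depth
   0        767160.6         2120424     909.7954
   1        766825.5         2120292      2767.58
   2        766825.5         2120292     2793.502
EOF
# prints:
# 0 2 2 ok
# 1 3 3 ok
```

Any line ending in MISMATCH would point at a block whose rows are not being matched by the data-row pattern.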