I have files that store multiple data points for the same device "vertically", and each file covers multiple devices. The file repeats a consistent pattern of lines, where on each line:
Column 1 is a common number for the entire file and all devices in that file
Column 2 is a unique device number
Column 3 is a unique identifier of the data point included on that line
Column 4 is the value of that data point
# cat myfile.csv
x,y1,a,name1
x,y1,b,2.5
x,y1,c,4
x,y2,a,name2
x,y2,b,3
x,y2,c,5.5
x,y3,a,name3
x,y3,b,1
x,y3,c,2
So in the sample above I have three devices (y1, y2 and y3), each with three data points (a, b and c). One of the data points is a unique name, so I can discard $1, $2 and $3 and keep only $4. What I want to do is flatten the three data points into a single line per device:
name1,2.5,4
name2,3,5.5
name3,1,2
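A minimal sketch of that flattening, assuming all lines for a device are contiguous and the values contain no embedded commas. The function name flatten_by_device is made up for illustration; it starts a new output line whenever column 2 changes, so it does not depend on how many data points each device has:

```shell
# Hypothetical helper: emit one CSV row per device, keeping only column 4.
# Assumes the lines belonging to one device are adjacent in the file.
flatten_by_device() {
    awk -F',' '
        NR > 1 && $2 != prev { printf "\n" }          # device changed: end the current row
        {
            sep = (NR > 1 && $2 == prev) ? "," : ""   # comma only between values of one device
            printf "%s%s", sep, $4
            prev = $2
        }
        END { if (NR > 0) printf "\n" }               # terminate the last row
    ' "$1"
}
```

Called as flatten_by_device myfile.csv, this should print the three rows shown above for the sample file.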
I have found a way to take a given range of lines, print $4 from each of them with awk, and join the results onto a single line:
# sed -n 1,3p myfile.csv | awk -F"," '{print $4","}' | tr -d '\n'
name1,2.5,4,
But I need a loop that carries on and processes each following group of lines the same way.
The above is a simplified view of what I'm trying to do. My real files have 53 data points for every device, and the number of devices varies from file to file. So the loop that "ingests" 53 lines at a time and spits them out on a single line needs to keep going until the end of the file (do ; done < $1 ?). For example, one file is 312,912 lines (5,904 devices x 53 data points) and another is 318,000 lines (6,000 devices x 53 data points). Using sed I can do what I need on the first 53 lines of the file, but I don't know how to wrap that in a loop:
# sed -n 1,53p myfile.csv | awk -F"," '{print $4","}' | tr -d '\n'
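One possible way to get that loop without sed line ranges, sketched under the assumption that every device has exactly the same number of consecutive lines (53 in my files). The function name flatten_every_n and its arguments are hypothetical; awk already reads the file line by line, so the "loop" is just a line counter:

```shell
# Hypothetical helper: join column 4 of every n consecutive lines into one CSV row.
# Usage: flatten_every_n N FILE
flatten_every_n() {
    awk -F',' -v n="$1" '
        # Print the value, then a newline after every n-th line, otherwise a comma.
        { printf "%s%s", $4, (NR % n == 0 ? "\n" : ",") }
        # If the file ends partway through a group, still terminate that last row.
        END { if (NR % n != 0) printf "\n" }
    ' "$2"
}
```

For the real files this would be called as flatten_every_n 53 myfile.csv > flat.csv, and flatten_every_n 3 myfile.csv on the small sample should give the three rows shown earlier. Because the comma is only written between values, there is no trailing comma on complete rows.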
Any help would be greatly appreciated.
Signed,
Sleepless in Seattle