I have a data file that I tail and dump into a new file at midnight.
To get it working I had to use tail -n2 livefile.csv >> storefile.csv. This gives me 2 or 3 entries at a time in storefile.csv.
If I continue this way, I will have to run a second pass over the file to remove the duplicates from the CSV. Every entry starts with the date, and I only need one entry per night.
What is the best way to do this? I have 6 files to process each night.
awk's associative arrays are neither stored nor accessed in any particular order. Here's the trick around this - assuming your input file comes sorted by date/time: record each new key in a second, numerically indexed array the first time it appears, and always overwrite the stored line, so the last entry for each date wins while the original date order is preserved:
awk -F"," '!($1 in A) {key[++key[0]]=$1} {A[$1]=$0} END{for (i=1;i<=key[0];i++){print A[key[i]]}}' filename
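A minimal sketch of this in action (the file path and sample data below are made up for illustration; substitute your own storefile.csv names, and loop over your 6 files the same way):

```shell
#!/bin/sh
# Fake sample input: two entries share the same date; the last one should win.
cat > /tmp/storefile.csv <<'EOF'
2023-01-01,reading A
2023-01-01,reading B
2023-01-02,reading C
EOF

# One line per date, last entry kept, date order preserved.
awk -F, '!($1 in A) {key[++key[0]]=$1}   # remember each date the first time we see it
         {A[$1]=$0}                      # overwrite, so the last line per date survives
         END {for (i=1; i<=key[0]; i++) print A[key[i]]}' /tmp/storefile.csv
```

Running this prints `2023-01-01,reading B` and `2023-01-02,reading C`. If instead you want the *first* entry per date kept, drop the overwrite and store the line only inside the `!($1 in A)` block.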