I have a data file that I tail and dump into a new file at midnight.
To get it working I had to use tail -n2 livefile.csv >> storefile.csv. This gives me 2 or 3 entries at a time in storefile.csv.
If I continue this way, I will have to run a second pass over the file to remove the duplicates from the CSV. Every entry starts with the date, and I only need one entry per night.
What is the best way to do this? I have 6 files to process each night.
awk's associative arrays are neither stored nor accessed in any particular order. Here's the trick around this - assuming your input file comes sorted by date/time: record each new key in a second, numerically indexed array the first time it appears, and always overwrite the stored line, so the last entry for each date wins while the original date order is preserved:
awk -F"," '!($1 in A) {key[++key[0]]=$1} {A[$1]=$0} END{for (i=1;i<=key[0];i++){print A[key[i]]}}' filename
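A minimal sketch of this in action (the file path and sample data below are made up for illustration; substitute your own storefile.csv names, and loop over your 6 files the same way):

```shell
#!/bin/sh
# Fake sample input: two entries share the same date; the last one should win.
cat > /tmp/storefile.csv <<'EOF'
2023-01-01,reading A
2023-01-01,reading B
2023-01-02,reading C
EOF

# One line per date, last entry kept, date order preserved.
awk -F, '!($1 in A) {key[++key[0]]=$1}   # remember each date the first time we see it
         {A[$1]=$0}                      # overwrite, so the last line per date survives
         END {for (i=1; i<=key[0]; i++) print A[key[i]]}' /tmp/storefile.csv
```

Running this prints `2023-01-01,reading B` and `2023-01-02,reading C`. If instead you want the *first* entry per date kept, drop the overwrite and store the line only inside the `!($1 in A)` block.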