I have a large CSV file (e.g. 2 million records) and am hoping to do one of two things. I have been trying to use awk and sed, but I am a newbie and can't figure out how to get them to work. Any help you could offer would be greatly appreciated. I'm stuck trying to remove the colon and wildcards in sed, and the averaging example I found for awk is giving me values of around 4e08.
I am hoping either to use sed or another script to remove the seconds portion of the data lines (i.e. remove ":10 AM" and all similar occurrences), or preferably to use awk to average the flow rates for each minute or each 15 minutes (the flow rate is the column right after the time).
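For the first option, here is a minimal sketch in Python rather than sed. It assumes the times look like "12:00:10 AM" (H:MM:SS with an optional AM/PM marker); the sample line in the test data and the function name `strip_seconds` are made up for illustration. It drops only the ":SS" part and keeps AM/PM, since removing the marker as well would make morning and afternoon readings indistinguishable.

```python
import re

# Assumed time shape: H:MM:SS, optionally followed by " AM" or " PM".
# Group 1 captures the hours and minutes that should be kept.
_SECONDS = re.compile(r"(\d{1,2}:\d{2}):\d{2}")


def strip_seconds(line: str) -> str:
    """Return the line with the :SS part removed from every H:MM:SS time."""
    return _SECONDS.sub(r"\1", line)
```

Applied line by line, this leaves every other column untouched, so the result can still be fed to awk or any other tool afterwards.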
Thanks for your prompt response, but it looks like I don't have nawk (I'm running Mac OS X). I'll see if I can get it through MacPorts and try again, but if there's any help that can be offered using awk, sed, or tr I know that I have those at my disposal.
EDIT: Installed nawk, and it worked like a charm. Thank you very much.
Thanks Ahmad. I tried the awk code (which I think needs an extra } to close out the for loop?), but I think it might be calculating something else. I am trying to get the average flow (column three) for each minute (or each 15-minute span) of each day. I am not sure I understand the code, but from the output it looks like it is gathering each day's worth of records and dividing them by the number of days?
I don't mean to be a bother, but can you tell me if this is what is going on?
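To make the intended calculation concrete, here is a rough Python sketch (not the awk code discussed above) of averaging per minute, or per 15-minute span, of each day. The column layout is an assumption: date in column one, time in column two, flow in column three, with timestamps like "1/1/2010 12:00:10 AM" and no header row. The names `average_flow` and `average_flow_file` are hypothetical. The key point is that records are grouped by day and time bucket together, not by day alone.

```python
import csv
from collections import defaultdict
from datetime import datetime


def average_flow(rows, bucket_minutes=1, fmt="%m/%d/%Y %I:%M:%S %p"):
    """Average column three per time bucket of each day.

    rows: iterable of sequences, assumed to be (date, time, flow, ...).
    bucket_minutes: 1 for per-minute averages, 15 for 15-minute spans.
    fmt: assumed timestamp format; adjust it to match the real data.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        stamp = datetime.strptime(f"{row[0]} {row[1]}", fmt)
        # Round the minute down to the start of its bucket and drop seconds,
        # so the key identifies one bucket on one specific day.
        start = stamp.minute - stamp.minute % bucket_minutes
        key = stamp.replace(minute=start, second=0)
        sums[key] += float(row[2])
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


def average_flow_file(path, bucket_minutes=1):
    """Read a CSV file row by row and average it without loading it whole."""
    with open(path, newline="") as handle:
        return average_flow(csv.reader(handle), bucket_minutes)
```

Because only one running sum and count is kept per bucket, this should cope with a couple of million rows. If the file starts with a header line, it would need to be skipped before parsing.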