I have a huge file. I am currently using a while loop to read it and do some calculations on it, but it is taking a lot of time.
I want to use awk to read the file and do those calculations.
Please suggest.
Hi, it is slow mainly because of the way it is coded. awk is faster than shell, but what is going to make a much bigger difference is that there are many external program calls within the loop. They are run for every line in input2, and on top of that every CONFLOG*.txt in the directory is grepped for every line.
All of this uses intermediate files, pipelines and subshells, so many subshells are started on every iteration of the loop...
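To make that concrete, here is a minimal sketch of the single-process approach. It assumes a key file with one phone number per line in column 1 and log lines whose last field is the number, optionally in double quotes; the real layout has not been posted yet, and the function name match_numbers is made up for illustration.

```shell
# match_numbers KEYFILE LOGFILE...   (hypothetical helper, not from the original script)
# Reads the key file once into an awk array, then scans every log file
# exactly once. No grep, no temp files, no subshell per input line.
match_numbers() {
    awk '
        NR == FNR { want[$1]; next }      # first file: remember each number
        {
            num = $NF                     # last field, possibly quoted
            gsub(/"/, "", num)            # strip the double quotes
            if (num in want) print num, $1, $2
        }
    ' "$@"
}
```

It would be called as match_numbers input2 CONFLOG*.txt, which starts one awk process in total instead of several processes per input line. Note that the NR == FNR test assumes the key file is not empty.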
Could you provide some samples/structure of your input files and the desired output? Please use code tags for code, samples and output.
The input file contains 40K records and the CONFLOG files contain 65 lakh (6.5 million) records.
I have written part of the code. After getting str1 and str2, I take the average of the times given in column 1 of these strings.
The shell code takes 5 hours for this.
The sample file looks like this:
input2:
2012-12-04 00:00:02 info dmsc: New dms_event on 919782458587
2012-12-04 00:00:02 minor dmsc: Deliver message sms_intro multi (sched 1) (prio 3) to "919782458587"
2012-12-04 00:00:02 info dmsc: New dms_event on 918823938486
2012-12-04 00:00:02 info dmsc: New dms_event on 918561022848
2012-12-04 00:00:02 minor dmsc: Deliver message sms_intro single (sched 1) (prio 3) to "918561022848"
2012-12-04 00:00:02 info dmsc: New dms_event on 918386870512
2012-12-04 00:00:02 minor dmsc: Deliver message sms_intro multi (sched 1) (prio 3) to "918386870512"
2012-12-04 00:00:06 info dmsc: New dms_event on 918947018661
2012-12-04 00:00:06 minor dmsc: Deliver message sms_intro multi (sched 1) (prio 3) to "918947018661"
2012-12-04 00:00:07 info dmsc: New dms_event on 919722340715
2012-12-04 00:00:07 info dmsc: New dms_event on 919782406184
2012-12-04 00:00:07 minor dmsc: Deliver message sms_intro single (sched 1) (prio 3) to "919722340715"
2012-12-04 00:00:07 minor dmsc: Deliver message sms_intro single (sched 1) (prio 3) to "919782406184"
2012-12-04 00:00:07 info dmsc: New dms_event on 919802991503
2012-12-04 00:00:07 minor dmsc: Deliver message sms_dmscmd single (sched 0) (prio 0) to "919802991503"
2012-12-04 00:00:07 info dmsc: New dms_event on 919166875009
2012-12-04 00:00:07 minor dmsc: Deliver message sms_intro single (sched 1) (prio 3) to "919166875009"
2012-12-04 00:00:07 info dmsc: New dms_event on 918561078910
str1 and str2 will hold some strings. I will then extract the time value from each of them and subtract one from the other to get the time a number takes to change its value from str1 to str2.
But my main concern is that the part of the code I mentioned is taking a lot of time. Is there an alternative, like keys and hashes in Perl, to make the data parsing fast?
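awk's associative arrays play the role of Perl hashes here. Below is a minimal sketch, assuming that str1 is the "New dms_event on N" line and str2 is the matching "Deliver message ... to "N"" line (a guess based on the sample above), and that both lines fall on the same calendar day. The name avg_delay is hypothetical.

```shell
# avg_delay LOGFILE...   (hypothetical helper)
# Prints one "number seconds" line per New/Deliver pair, then the average.
avg_delay() {
    awk '
        # Convert HH:MM:SS to seconds since midnight (same-day pairs only).
        function secs(t,    p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }

        # Hash the start time by number, like $start{$num} in Perl.
        / New dms_event on / { start[$NF] = secs($2); next }

        / Deliver message / {
            num = $NF
            gsub(/"/, "", num)             # strip the double quotes
            if (num in start) {
                d = secs($2) - start[num]
                print num, d
                sum += d; n++
                delete start[num]          # pair consumed
            }
        }

        END { if (n) printf "average %.2f\n", sum / n }
    ' "$@"
}
```

Each log line is read once and every lookup is a hash access, so the run time should grow with the size of the logs rather than with the number of records multiplied by the size of the logs. Pairs that span midnight would need the date in column 1 to be taken into account as well.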
The sample you provided is not a good one: nothing in it can be matched. As a simple start, the script below can be used to get the "New dms_event" and "Deliver" times for each keyword (column 1):
awk 'NR==FNR{a[$1];next} {gsub(/\"/,"",$NF);if ($NF in a) print $NF,$1,$2}' input2 CONFLOG*.txt
Let us know what you want to do with these times next. Do you want the difference between them?
For example, if you get the result below, what output do you expect?