Hi All,
I am working on a huge pile of data. It's about a service initiating a call and finishing it.
The columns that I have are:
Date Time URL Call Type ServiceName
5/11/2015 15:00:00 http-/0.0.0.0:8091-9 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:15 http-/0.0.0.0:8091-22 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:01 http-/0.0.0.0:8091-9 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:16 http-/0.0.0.0:8091-22 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 finished call queryPrepaidBalanceAndThresholdInfo
The end result I want is:
5/11/2015 15:00:00 http-/0.0.0.0:8091-9 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:01 http-/0.0.0.0:8091-9 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:15 http-/0.0.0.0:8091-22 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:16 http-/0.0.0.0:8091-22 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:18 http-/0.0.0.0:8091-6 finished call queryPrepaidBalanceAndThresholdInfo
For each ServiceName, I would like its corresponding "going to call" and "finished call" lines paired up and sorted by URL.
Is there any command that can help me in this regard?
Or does this require a complex script?
Regards
cjcox
May 11, 2015, 5:09pm
Hint: sort on fields 1, 2, and 3 ascending and on field 4 descending.
Assume test2.txt contains the data lines...
sort -k1,3 -k4,4r <test2.txt
produces...
5/11/2015 15:00:00 http-/0.0.0.0:8091-9 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:01 http-/0.0.0.0:8091-9 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:15 http-/0.0.0.0:8091-22 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:16 http-/0.0.0.0:8091-22 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 finished call queryPrepaidBalanceAndThresholdInfo
Actually, since you're only interested in the URL and not the date, I'll assume the records can come in mixed order, so really:
sort -k3,3 -k4,4r <test2.txt
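To try that second command without a data file, a minimal sketch along these lines might help. The function name sort_by_url and the LC_ALL=C setting are my own additions (not part of the command above), and the sample rows are the space-separated lines from the question fed in through a here-document:

```shell
#!/bin/sh
# sort_by_url is a hypothetical helper name used only for this illustration.
sort_by_url() {
    # Key 1: field 3 (the URL), ascending.
    # Key 2: field 4 ("going" / "finished"), descending, so "going" comes
    # before "finished" for the same URL.
    # LC_ALL=C is an assumption added here to get plain byte-order comparison.
    LC_ALL=C sort -k3,3 -k4,4r
}

sorted=$(sort_by_url <<'EOF'
5/11/2015 15:00:00 http-/0.0.0.0:8091-9 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:15 http-/0.0.0.0:8091-22 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:01 http-/0.0.0.0:8091-9 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:16 http-/0.0.0.0:8091-22 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 finished call queryPrepaidBalanceAndThresholdInfo
EOF
)
printf '%s\n' "$sorted"
```

Note that the URLs compare as text, so the -22 lines come out ahead of the -6 and -9 lines.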
I don't know how you get output with an 18 in the seconds field when that value does not appear in any line in your input.
It isn't clear if you are saying that your input file might contain more than one ServiceName field value (since your sample input only has one value in that field). If there is more than one value for that field, it isn't clear whether you want each value in a separate output file or just sorted in one output file.
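For the separate-output-file case, a rough sketch could look like the following. Everything specific in it is an assumption made for illustration: the service names svcA and svcB are made up, the temporary directory and the one-file-per-service naming (ServiceName plus .txt) are hypothetical, and the input is taken to be tab-separated with ServiceName in field 5.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
tab=$(printf '\t')

# Tiny tab-separated sample with two made-up service names (svcA, svcB).
{
    printf '5/11/2015\t15:00:01\thttp-/0.0.0.0:8091-9\tfinished call\tsvcA\n'
    printf '5/11/2015\t15:00:05\thttp-/0.0.0.0:8091-3\tgoing to call\tsvcB\n'
    printf '5/11/2015\t15:00:00\thttp-/0.0.0.0:8091-9\tgoing to call\tsvcA\n'
} > "$workdir/input.tsv"

# Sort by ServiceName, then URL, then call type descending; awk then writes
# each line to a file named after the ServiceName in field 5.
sort -t "$tab" -k5,5 -k3,3 -k4,4r "$workdir/input.tsv" |
    awk -F '\t' -v dir="$workdir" '{ out = dir "/" $5 ".txt"; print > out }'

cat "$workdir/svcA.txt"
# Remove "$workdir" by hand when finished with the output files.
```

This assumes the service names are safe to use as file names (no slashes, for example).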
Now that I have added CODE tags to your post (so we can see that you have tabs as field separators instead of spaces), and assuming that you really want to get rid of the header line from your input and want all of the output in one file when there is more than one ServiceName value, you could try something like:
#!/bin/ksh
{ IFS= read header
sort -t " " -k5,5 -k3,3 -k4,4r
} < file
(where the character between the double quotes is a single literal tab character). This was tested using the Korn shell, but will work with any shell based on Bourne shell syntax. With the sample input you provided, this produces the output:
5/11/2015 15:00:15 http-/0.0.0.0:8091-22 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:16 http-/0.0.0.0:8091-22 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:00 http-/0.0.0.0:8091-9 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:01 http-/0.0.0.0:8091-9 finished call queryPrepaidBalanceAndThresholdInfo
If you want the header to be copied into the output instead of being removed, add the line:
printf '%s\n' "$header"
just before the line:
sort -t " " -k5,5 -k3,3 -k4,4r
in the script.
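Put together, the header-preserving variant might look roughly like this. It is only a sketch: the temporary input file and its three sample lines are stand-ins created so the example runs on its own, and the tab variable is a convenience I have introduced so that no literal tab has to be typed between the quotes.

```shell
#!/bin/sh
# Stand-in input file for illustration; substitute the real data file.
input=$(mktemp)
tab=$(printf '\t')   # holds a single tab character for sort -t

# Header plus two tab-separated rows taken from the question's sample.
printf 'Date\tTime\tURL\tCall Type\tServiceName\n' > "$input"
printf '5/11/2015\t15:00:01\thttp-/0.0.0.0:8091-9\tfinished call\tqueryPrepaidBalanceAndThresholdInfo\n' >> "$input"
printf '5/11/2015\t15:00:00\thttp-/0.0.0.0:8091-9\tgoing to call\tqueryPrepaidBalanceAndThresholdInfo\n' >> "$input"

result=$(
    {
        IFS= read -r header                  # consume the header line
        printf '%s\n' "$header"              # copy it to the output first
        sort -t "$tab" -k5,5 -k3,3 -k4,4r    # sort the remaining lines
    } < "$input"
)
printf '%s\n' "$result"
rm -f "$input"
```

The header stays on the first line, and the "going to call" row is listed before its matching "finished call" row.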
This solved my problem. Thank you so much!