Sorting issue

Hi All,

I am working on a huge pile of data. It's about a service initiating a call and finishing it.

The columns that I have are:

Date          Time         URL                           Call Type           ServiceName
5/11/2015	15:00:00	http-/0.0.0.0:8091-9	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:15	http-/0.0.0.0:8091-22	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:17	http-/0.0.0.0:8091-6	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:01	http-/0.0.0.0:8091-9	finished call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:16	http-/0.0.0.0:8091-22	finished call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:17	http-/0.0.0.0:8091-6	finished call	queryPrepaidBalanceAndThresholdInfo

The end result I want is:

5/11/2015	15:00:00	http-/0.0.0.0:8091-9	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:01	http-/0.0.0.0:8091-9	finished call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:15	http-/0.0.0.0:8091-22	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:16	http-/0.0.0.0:8091-22	finished call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:17	http-/0.0.0.0:8091-6	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:18	http-/0.0.0.0:8091-6	finished call	queryPrepaidBalanceAndThresholdInfo

For each Service Name, I would like to have its corresponding "going to call" and "finished call" against URL sorted out.

Is there any command that can help me in this regard?

Or does this require a complex script?

Regards

Hint: sort on fields 1, 2 and 3 ascending and on field 4 descending.
Assume test2.txt contains the data lines...

sort -k1,3 -k4,4r <test2.txt

produces...

5/11/2015 15:00:00 http-/0.0.0.0:8091-9 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:01 http-/0.0.0.0:8091-9 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:15 http-/0.0.0.0:8091-22 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:16 http-/0.0.0.0:8091-22 finished call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 going to call queryPrepaidBalanceAndThresholdInfo
5/11/2015 15:00:17 http-/0.0.0.0:8091-6 finished call queryPrepaidBalanceAndThresholdInfo

Actually, since you're only interested in the URL and not the date, I'll assume the records can arrive in mixed order, so really:

sort -k3,3 -k4,4r <test2.txt
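
To try this without the real log, here is a minimal sketch that recreates the six sample lines in a temporary file and runs the same sort keys on it. The `sample` and `sorted` variable names and the use of `mktemp` are assumptions for illustration only, and `LC_ALL=C` is added just to make the ordering byte-wise and repeatable across locales.

```shell
#!/bin/sh
# Recreate the six tab-separated sample lines in a temporary file.
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\n' \
    5/11/2015 15:00:00 http-/0.0.0.0:8091-9  'going to call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:15 http-/0.0.0.0:8091-22 'going to call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:17 http-/0.0.0.0:8091-6  'going to call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:01 http-/0.0.0.0:8091-9  'finished call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:16 http-/0.0.0.0:8091-22 'finished call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:17 http-/0.0.0.0:8091-6  'finished call' queryPrepaidBalanceAndThresholdInfo \
    > "$sample"

# Sort on the URL (field 3), then on the call type (field 4) in reverse,
# so that "going" is listed ahead of "finished" for each URL.
sorted=$(LC_ALL=C sort -k3,3 -k4,4r "$sample")
printf '%s\n' "$sorted"

rm -f "$sample"
```

Because the URL is compared as text, the lines for 8091-22 come out ahead of those for 8091-6 and 8091-9.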

I don't know how you get output with an 18 in the seconds field when that value does not appear in any line of your input.

It isn't clear if you are saying that your input file might contain more than one ServiceName field value (since your sample input only has one value in that field). If there is more than one value for that field, it isn't clear whether you want each value in a separate output file or just sorted in one output file.

Now that I have added CODE tags to your post (so we can see that you have tabs as field separators instead of spaces), and assuming that you really want to get rid of the header line from your input and want all of the output in one file when there is more than one ServiceName value, you could try something like:

#!/bin/ksh
{	IFS= read header
	sort -t "	" -k5,5 -k3,3 -k4,4r
} < file

(where the character between the double quotes is a single literal tab character). This was tested using the Korn shell, but will work with any shell based on Bourne shell syntax. With the sample input you provided, this produces the output:

5/11/2015	15:00:15	http-/0.0.0.0:8091-22	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:16	http-/0.0.0.0:8091-22	finished call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:17	http-/0.0.0.0:8091-6	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:17	http-/0.0.0.0:8091-6	finished call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:00	http-/0.0.0.0:8091-9	going to call	queryPrepaidBalanceAndThresholdInfo
5/11/2015	15:00:01	http-/0.0.0.0:8091-9	finished call	queryPrepaidBalanceAndThresholdInfo
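
One thing to watch: if the same URL shows up in more than one call, these keys put all of its "going to call" lines ahead of all of its "finished call" lines. A possible variation, sketched below, adds the time (field 2) as a key ahead of the call type. It assumes that calls on one URL do not overlap, that all lines are from the same day, and that times are zero-padded so they sort correctly as text. The sample lines and the `tab`, `sample` and `paired` names are made up for illustration.

```shell
#!/bin/sh
tab=$(printf '\t')   # a single tab character, without typing a literal one

# Hypothetical input: two consecutive calls on the same URL.
sample=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\n' \
    5/11/2015 15:00:00 http-/0.0.0.0:8091-9 'going to call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:05 http-/0.0.0.0:8091-9 'going to call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:01 http-/0.0.0.0:8091-9 'finished call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:05 http-/0.0.0.0:8091-9 'finished call' queryPrepaidBalanceAndThresholdInfo \
    > "$sample"

# Keys: ServiceName, URL, time, then call type reversed so that "going"
# wins the tie when both lines of a call carry the same timestamp.
paired=$(LC_ALL=C sort -t "$tab" -k5,5 -k3,3 -k2,2 -k4,4r "$sample")
printf '%s\n' "$paired"

rm -f "$sample"
```

With this input the call types alternate going, finished, going, finished, even though the second call starts and finishes within the same second.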

If you want the header to be copied into the output instead of removing it, add the line:

	printf '%s\n' "$header"

just before the line:

	sort -t "	" -k5,5 -k3,3 -k4,4r

in the script.
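
Put together, the header-preserving variant might look like the sketch below. The `sort_calls` function name, the `tab` variable (used instead of typing a literal tab) and the small two-line sample are assumptions for illustration. `read -r` is used here so that any backslashes in the header are kept as-is.

```shell
#!/bin/sh
tab=$(printf '\t')   # a single tab character

# Copy the first line (the header) through unchanged, then sort the rest
# by ServiceName, URL, and call type (reversed, so "going" comes first).
sort_calls() {
    {   IFS= read -r header
        printf '%s\n' "$header"
        LC_ALL=C sort -t "$tab" -k5,5 -k3,3 -k4,4r
    } < "$1"
}

# Hypothetical sample: a header plus one call, given in the wrong order.
sample=$(mktemp)
printf 'Date\tTime\tURL\tCall Type\tServiceName\n' > "$sample"
printf '%s\t%s\t%s\t%s\t%s\n' \
    5/11/2015 15:00:01 http-/0.0.0.0:8091-9 'finished call' queryPrepaidBalanceAndThresholdInfo \
    5/11/2015 15:00:00 http-/0.0.0.0:8091-9 'going to call' queryPrepaidBalanceAndThresholdInfo \
    >> "$sample"

out=$(sort_calls "$sample")
printf '%s\n' "$out"

rm -f "$sample"
```

The header stays on the first line and the two data lines come out with "going to call" ahead of "finished call".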

This solved my problem. Thank you so much!