Help on Sorting

richa_240889 · April 25, 2019, 11:31am

Hello everyone, I need help sorting a file. The file has to be sorted using a key made up of 4 columns. Sorting works fine on those 4 columns, but when the key matches for many rows, the other columns get sorted as well, which is not wanted.

Here is an example. File: Input.txt

1 101 100 
1 001 200
1 002 100

I am using the command below to sort:

sort -k1,1 -k3 Input.txt

This gives me the output below:

1 002 100
1 101 100
1 001 200

I don't want the second column to be sorted here. I have tried the -s option for a stable sort, but it is not supported.

Please help.

nezabudka · April 25, 2019, 12:22pm

Hi, try

sort -Vr

and use option --debug for testing

--- Post updated at 19:22 ---

sort -k1,3.4 --debug

Corona688 · April 25, 2019, 12:32pm

I dealt with something like this before. I had to add an extra column, which held the line number, like:

1 101 100 1
1 001 200 2
1 002 100 3

...so that I could give that column number to sort, forcing groups of lines back into file order after other sorting conditions were satisfied.

$ awk '{ $(NF+1)=NR } 1' < inputfile | sort -k1,1 -k3,3 -k4,4n

1 101 100 1
1 002 100 3
1 001 200 2

$ awk '{ $(NF+1)=NR } 1' < inputfile | sort -k1,1 -k3,3 -k4,4n | cut -d ' ' -f 1,2,3

1 101 100
1 002 100
1 001 200

RudiC · April 25, 2019, 12:46pm

The result you get is exactly as specified: sorted by field 1 (all lines identical), then by field 3 through the end of the line. I don't see column 2 being sorted; the order among lines with equal keys is simply not guaranteed to follow the input. To keep the original file order once all keys are consumed, try Corona688's approach, or, quite similar:

cat -n file3 | sort -b -k2,2 -k4n -k1,1n | cut -f2-
1 101 100 
1 002 100
1 001 200

The problem with your sample file is a trailing space in line one that influences key 4; use -n to overcome it.
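The number-the-lines idea can be tried on the sample data, trailing space included. This is only a sketch: the awk numbering, the numeric modifiers and LC_ALL=C are assumptions chosen to make the run reproducible, not the exact commands posted above.

```shell
# Sample data from the thread; the first line ends with a trailing space.
input=$(printf '1 101 100 \n1 001 200\n1 002 100\n')

# 1. Prefix every line with its line number and a tab.
# 2. Sort on the real keys (now fields 2 and 4); field 4 is compared
#    numerically so the trailing space does not matter.
# 3. Break ties on the line number, compared numerically.
# 4. Remove the line number again.
result=$(printf '%s\n' "$input" |
  awk '{ print NR "\t" $0 }' |
  LC_ALL=C sort -k2,2 -k4,4n -k1,1n |
  cut -f2-)

printf '%s\n' "$result"
```

Lines with the same key (the two 100 rows) come out in the order they had in the input.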

Scrutinizer · April 25, 2019, 12:55pm

Everyone at the UNIX and Linux Forums gives their best effort to reply to all questions in a timely manner. For this reason, posting questions with subjects like "Urgent!" or "Emergency" and demanding a fast reply are not permitted in the regular forums.

For members who want a higher visibility to their questions, we suggest you post in the Emergency UNIX and Linux Support Forum. This forum is given a higher priority than our regular forums.

Posting a new question in the Emergency UNIX and Linux Support Forum requires forum Bits. We monitor this forum to help people with emergencies, but we do not guarantee response time or best answers. However, we will treat your post with a higher priority and give our best efforts to help you.

If you have posted a question in the regular forum with a subject like "Urgent", "Emergency" or similar, we will, more than likely, close your thread and post this reply, redirecting you to the proper forum.

Of course, you can always post a descriptive subject text, remove words like "Urgent" etc. (from your subject and post) and post in the regular forums at any time.

Thank you.

The UNIX and Linux Forums

MadeInGermany · April 25, 2019, 12:58pm

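I think the order among lines with equal keys is not something you can rely on. Without a stable sort, implementations typically fall back to comparing the whole line as a last resort, so the non-key columns end up ordered too.
GNU sort has the --stable option: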

sort --stable -k1,1 -k3,3 input.txt
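Since -s / --stable is not part of POSIX and may be missing on some systems, one possible sketch is to probe for it and fall back to numbering the lines. The helper name keep_order_sort is hypothetical, and the numeric modifier on the third column is an assumption made so that the trailing space in the sample does not affect the key.

```shell
# Hypothetical helper: sorts stdin by columns 1 and 3 and keeps the
# input order among lines whose keys are equal.
keep_order_sort() {
  if printf 'x\n' | sort -s >/dev/null 2>&1; then
    # This sort accepts -s, so ties keep their input order.
    LC_ALL=C sort -s -k1,1 -k3,3n
  else
    # No -s available: number the lines, use the number as the
    # last key, then drop it again.
    awk '{ print NR "\t" $0 }' |
      LC_ALL=C sort -k2,2 -k4,4n -k1,1n |
      cut -f2-
  fi
}

result=$(printf '1 101 100 \n1 001 200\n1 002 100\n' | keep_order_sort)
printf '%s\n' "$result"
```

Both branches should give the same order on this sample; the probe only decides which mechanism is used.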

richa_240889 · April 26, 2019, 12:32am

I tried this, but I get an error that the option is not supported.

--- Post updated at 04:32 AM ---

I have 12 input files for this requirement. Do I need to add the number in each file? Even if I do, the key will match for multiple rows.

richa_240889 · April 26, 2019, 7:10am

Hi. How do I handle this case when I have more than one input file?

MadeInGermany · April 26, 2019, 8:23am

If you want 12 output files you need to treat them separately, and a loop is appropriate.

for file in input1.txt input2.txt ...
do
  cat -n "$file" | sort -k2,2 -k4,4 -k1,1n | cut -f2- > "$file".sorted
done

richa_240889 · April 26, 2019, 12:03pm

Hi. There is only one output file, but more than one input file. I need to sort them and merge them into one file. When keys match across files, the sort also orders the lines by other columns that are not part of the key. My question: if I add a line number at the end of each line to solve this, how will it work, given that the numbering is added in each file separately?

--- Post updated at 04:03 PM ---

Hi. I am able to apply this logic to one input file. In my scenario there is more than one input file; all of them have to be sorted to generate one output file. Please help me achieve this.
I have tried combining the files into one temporary file and applying the logic you described to that temporary file to generate the final output.

Is there any other way?

MadeInGermany · April 26, 2019, 12:15pm

cat -n starts from 1 with each file, so awk with its NR is better:

awk '{print (NR "\t" $0)}' input1.txt input2.txt ... | sort -k2,2 -k4,4 -k1,1n | cut -f2-

Note: the awk inserts a tab-separated field #1, so the sort fields are +1 compared to the fields in the input files. The line number is compared numerically (-k1,1n) so that line 10 sorts after line 2. At the end the field #1 is cut off again.
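A small self-contained run may make the multi-file case concrete. The file names a.txt and b.txt and their contents are made up for illustration, and the numeric modifiers and LC_ALL=C are assumptions to keep the result reproducible.

```shell
# Two hypothetical input files in a scratch directory.
tmpdir=$(mktemp -d)
printf '1 101 100\n1 001 200\n' > "$tmpdir/a.txt"
printf '1 002 100\n1 003 200\n' > "$tmpdir/b.txt"

# awk's NR keeps counting across files, so every line gets a unique
# number that reflects its position in the concatenated input.
merged=$(awk '{ print NR "\t" $0 }' "$tmpdir/a.txt" "$tmpdir/b.txt" |
  LC_ALL=C sort -k2,2 -k4,4n -k1,1n |
  cut -f2-)

printf '%s\n' "$merged"
rm -rf "$tmpdir"
```

Among rows with equal keys, the rows from a.txt come before those from b.txt, in their original order, and no temporary combined file is needed.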

richa_240889 · April 26, 2019, 1:07pm

Can I apply this logic to more than one input file at a time?

--- Post updated at 05:07 PM ---

It is working now. Thank you for the help.