Sort data in a text file into a particular format

I have to sort the output below, which is stored in a text file, using Unix bash.

20170308
DA,I,113
20170308
PM,I,123
20170308
DA,U,22
20170308
PM,U,123
20170309
DA,I,11
20170309
PM,I,23
20170309
DA,U,123
20170309
PM,U,233

The new format should look like the output below. The values may change, but the sequence of names will remain the same.

20170308
PM,I,123
PM,U,123
DA,I,113
DA,U,22
20170309
PM,I,23
PM,U,233
DA,I,11
DA,U,123

maybe:

awk '/,/ || !a[$0]++' infile

Wouldn't it be nice to tell people the sort criteria?
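
For readers unsure what the one-liner suggested above does: it only collapses the repeated date lines and does not reorder the records. A minimal sketch of the same idea, wrapped in a function whose name (`dedup_headers`) is made up for illustration:

```shell
# dedup_headers is a hypothetical helper name; it reads the named files or stdin.
dedup_headers() {
    # Lines containing a comma are data records and always pass.
    # Any other line (a date header) passes only the first time it is seen.
    awk '/,/ || !seen[$0]++' "$@"
}
```

Run as `dedup_headers infile`, this prints each date once, followed by its records in their original order.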

Hello Adfire,

As RudiC already mentioned, the sort criteria are not stated. Judging by the first part of your expected output (for 20170308), I am assuming that the records under each date (yyyymmdd) have to be sorted by the numeric third field, in descending order. If your Input_file is the same as the sample shown, the following may help.

awk '/^[0-9]/{val=$0;next} {a[val]=a[val]?a[val] ORS $0:$0} END{for(i in a){print i;system("echo " s1 a[i] s1 " | sort -t, -k3nr")}}' s1="\""   Input_file

Output will be as follows.

20170308
PM,I,123
PM,U,123
DA,I,113
DA,U,22
20170309
PM,U,233
DA,U,123
PM,I,23
DA,I,11

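EDIT: Adding a non-one-liner form of the solution here too.

awk '/^[0-9]/{
  val=$0;                            # remember the current date line
  next
}
{
  a[val]=a[val]?a[val] ORS $0:$0     # collect the records under their date
}
END{
  for(i in a){
    print i;
    system("echo " s1 a[i] s1 " | sort -t, -k3nr")   # sort one group by the 3rd field, descending
  }
}
' s1="\""  Input_file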

Thanks,
R. Singh

Just guessing what might be the sort criteria, this gets you quite near your desired output:

awk '
!T[$0]++ && !/,/        {FN = $0
                         if (CMD) close (CMD)
                         CMD = "sort -t, -k1,1r -k2,2 - > " FN
                        }
$0 != FN                {print | CMD}
' file ; for x in 20*; do echo $x; cat $x; done
20170308
PM,I,123
PM,U,123
DA,I,113
DA,U,22
20170309
PM,I,23
PM,U,233
DA,I,11
DA,U,123
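
The approach above writes one file per date into the current directory. If that is not wanted, the same guessed ordering (name descending, then type ascending) can be had with a decorate, sort, undecorate pipeline. This is only a sketch: the function name `sort_groups` is made up, and it assumes that date lines never contain a comma and that every record has exactly three fields.

```shell
# sort_groups is a hypothetical helper name; it reads the named files or stdin.
# Assumes date lines contain no comma and records have exactly three fields.
sort_groups() {
    # Decorate: prefix each record with the date line that precedes it.
    awk '!/,/ { d = $0; next } { print d "," $0 }' "$@" |
    # Sort: date ascending, name descending (PM before DA), type ascending.
    LC_ALL=C sort -t, -k1,1 -k2,2r -k3,3 |
    # Undecorate: print each date once, then the records without the prefix.
    awk -F, '$1 != prev { print $1; prev = $1 } { print $2 FS $3 FS $4 }'
}
```

Called as `sort_groups file`, it prints to standard output and leaves no temporary files behind.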

Hi, thanks for your reply, it's working now. Now I need to add a sum (Total) for each name. Example below:

20170308
PM,Total,246
PM,I,123
PM,U,123
DA,Total,135
DA,I,113
DA,U,22
20170309
PM,Total,256
PM,I,23
PM,U,233
DA,Total,134
DA,I,11
DA,U,123

Hello Adfire,

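Please always use code tags, as per forum rules, for your commands, code, and Input_file samples. Could you please try the following and let me know if it helps?

awk -F, '
# A new date line arrived: print the previous date and its groups.
val && /^[0-9]+/ && val !~ $0{
    print val;
    for(i in b){
       if(a[i]){
         print c[i],a[i];   # label and sum for this name
         delete a[i]
       };
       print b[i]           # the records collected for this name
    };
    delete b;
    delete c
}
# Remember the current date line and move on to the next line.
/^[0-9]+/{
    val=$0;
    next
}
# Data record: group by the first field.
{
    c[$1]=$1 FS $2;                 # label: name plus the last 2nd field seen
    a[$1]+=$3;                      # running sum of the 3rd field
    b[$1]=b[$1]?b[$1] ORS $0:$0     # append the record to its group
}
# Print the last date and its groups.
END{
    print val;
    for(i in b){
       if(a[i]){
         print c[i],a[i];
         delete a[i]
       };
       print b[i]
    }
}
'   Input_file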
 

Output will be as follows.

20170308
PM,U 246
PM,I,123
PM,U,123
DA,U 135
DA,I,113
DA,U,22
20170309
PM,U 256
PM,I,23
PM,U,233
DA,U 134
DA,I,11
DA,U,123
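
Note that the output above labels the sum lines as `PM,U 246` rather than the requested `PM,Total,246`, and that the order of the names depends on how awk happens to walk the array. A possible variant that emits the requested label and fixes the order with `sort` is sketched below. The function name `add_totals` is made up, and the sketch assumes that date lines contain no comma, that records have exactly three fields, and that the wanted order is name descending, then type ascending.

```shell
# add_totals is a hypothetical helper name; it reads the named files or stdin.
# Assumes date lines contain no comma and records have exactly three fields.
add_totals() {
    # Prefix each record with the date line that precedes it.
    awk '!/,/ { d = $0; next } { print d "," $0 }' "$@" |
    # Order by date, then name descending (PM before DA), then type.
    LC_ALL=C sort -t, -k1,1 -k2,2r -k3,3 |
    awk -F, '
        # Print the Total line for the buffered name, then its records.
        function flush(   i) {
            if (n) {
                print name ",Total," sum
                for (i = 1; i <= n; i++) print rows[i]
            }
            n = 0; sum = 0
        }
        $1 != date { flush(); date = $1; name = ""; print date }
        $2 != name { flush(); name = $2 }
        { rows[++n] = $2 FS $3 FS $4; sum += $4 }
        END { flush() }
    '
}
```

With the sample Input_file, `add_totals Input_file` should print `PM,Total,246` and `DA,Total,135` under 20170308, and `PM,Total,256` and `DA,Total,134` under 20170309, each followed by its records.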
 

Thanks,
R. Singh

Hi Ravinder
Could you please explain the code if possible? Thanks in advance.