Processing files using awk

Hi

I have files in our UNIX directory like the ones below

-rw-r--r--   1 devinfo    devsupp        872 Sep 14 02:09 IMGBTREE27309_12272_11_1_0_FK.idx0
-rw-r--r--   1 devinfo    devsupp        872 Sep 14 02:09 IMGBTREE27309_12272_11_0_0_PK.idx0
-rw-r--r--   1 devinfo    devsupp        432 Sep 14 02:09 IMGBTREE27309_12272_11_0.dat0
-rw-r--r--   1 devinfo    devsupp          0 Sep 14 02:09 IMGBTREE27309_12272_11_0.dat1
-rw-r--r--   1 devinfo    devsupp       8192 Sep 15 13:24 PMLKUP11200_35_0_27813H64.dat1
-rw-r--r--   1 devinfo    devsupp          0 Sep 15 13:24 PMLKUP11200_35_0_27813H64.idx1
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMLKUP11200_35_0_27813H64.idx0
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 12:44 PMLKUP11200_35_0_27794H64.dat0
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 12:44 PMLKUP11200_35_0_27794H64.idx0
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 13:24 PMJNR11200_52_0_27813.dat0
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMJNR11200_52_0_27813.idx0
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 12:44 PMJNR11200_52_0_27794.idx0
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 12:44 PMJNR11200_52_0_27794.dat0

  1. I want only the fields $6, $7, $8, $9 and $5
  2. From field $9, I do not need the extensions, i.e. I do not need .idx0, .dat0, .dat1, etc.
  3. Then group the files based on $9, adding up the sizes in $5 for each group

So finally my output should look like the below

Sep 14 02:09 IMGBTREE27309_12272_11_1_0_FK      872
Sep 14 02:09 IMGBTREE27309_12272_11_0_0_PK      872
Sep 14 02:09 IMGBTREE27309_12272_11_0           432
Sep 15 13:24 PMLKUP11200_35_0_27813H64         9064
Sep 15 12:44 PMLKUP11200_35_0_27794H64         1304
Sep 15 13:24 PMJNR11200_52_0_27813             1304
Sep 15 12:44 PMJNR11200_52_0_27794             1304

I have this script -

ls -l | awk '{arr[$9]+=$5} END {for (i in arr) {print i, arr[i]}}'

which just gives the total per full file name, extensions included. After this I am stuck. Can someone please help me with this?

rule1: ls -l |awk '{print $6, $7, $8, $9 , $5}'
rule2: ls -l |awk '{split($9,a,".");print $6, $7, $8, a[1] , $5}' 
rule3: ls -l |awk '{split($9,a,".");print $6, $7, $8, a[1] , $5}'  |sort -k3n
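
A side note on rule3: after the fields are reordered, the base name is the fourth field and the time is the third, so `sort -k3n` orders by the hour rather than by name. A minimal sketch of sorting on the name instead, assuming file names contain no spaces; `list_by_name` is a made-up helper name, and the `NF >= 9` guard is only there to skip the "total" line that `ls -l` prints first:

```shell
# list_by_name: hypothetical helper; reads `ls -l`-style lines on stdin.
list_by_name() {
  awk 'NF >= 9 { split($9, a, "."); print $6, $7, $8, a[1], $5 }' |
    LC_ALL=C sort -k4,4          # field 4 is the base name after reordering
}

list_by_name <<'EOF'
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 13:24 PMJNR11200_52_0_27813.dat0
-rw-r--r--   1 devinfo    devsupp       8192 Sep 15 13:24 PMLKUP11200_35_0_27813H64.dat1
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMJNR11200_52_0_27813.idx0
EOF
```

On a real directory this would be called as `ls -l | list_by_name`.
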
# awk '{split($NF,a,".");x=$6FS $7FS $8FS a[1];if(x!=y){print x,$5};y=x}' file
Sep 14 02:09 IMGBTREE27309_12272_11_1_0_FK 872
Sep 14 02:09 IMGBTREE27309_12272_11_0_0_PK 872
Sep 14 02:09 IMGBTREE27309_12272_11_0 432
Sep 15 13:24 PMLKUP11200_35_0_27813H64 8192
Sep 15 12:44 PMLKUP11200_35_0_27794H64 432
Sep 15 13:24 PMJNR11200_52_0_27813 432
Sep 15 12:44 PMJNR11200_52_0_27794 872
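
Note that the one-liner above prints the size of the first file in each run of identical names, not the total for the group. A rough sketch of summing within each run instead, assuming lines that share a date, time and base name are adjacent in the listing (as in the sample) and that file names contain no spaces; `sum_runs` is a name made up for illustration:

```shell
# sum_runs: hypothetical helper; reads `ls -l`-style lines on stdin.
# Assumes lines sharing date, time and base name are adjacent.
sum_runs() {
  awk '
    NF < 9 { next }                      # skip the "total N" line of ls -l
    {
      split($NF, a, ".")                 # a[1] = name up to the first dot
      key = $6 FS $7 FS $8 FS a[1]       # date, time and base name
      if (seen && key != prev) {         # a new run starts: flush the old one
        print prev, sum
        sum = 0
      }
      prev = key
      sum += $5
      seen = 1
    }
    END { if (seen) print prev, sum }    # flush the last run
  '
}

sum_runs <<'EOF'
-rw-r--r--   1 devinfo    devsupp       8192 Sep 15 13:24 PMLKUP11200_35_0_27813H64.dat1
-rw-r--r--   1 devinfo    devsupp          0 Sep 15 13:24 PMLKUP11200_35_0_27813H64.idx1
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMLKUP11200_35_0_27813H64.idx0
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 13:24 PMJNR11200_52_0_27813.dat0
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMJNR11200_52_0_27813.idx0
EOF
# prints:
# Sep 15 13:24 PMLKUP11200_35_0_27813H64 9064
# Sep 15 13:24 PMJNR11200_52_0_27813 1304
```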

Hi rdcwayx

This script helps a lot and does almost everything, but I also want the byte counts in $5 added up for each name in $9. The result of your script:

ls -l |awk '{split($9,a,".");print $6, $7, $8, a[1] , $5}'  |sort -k3n 

comes like the below

Sep 15 13:24 PMLKUP11200_35_0_27813H64 432
Sep 15 13:24 PMLKUP11200_35_0_27813H64 8192
Sep 15 12:44 PMLKUP11200_35_0_27813H64 872
Sep 15 13:24 PMJNR11200_52_0_27813     872
Sep 15 13:24 PMJNR11200_52_0_27813     432

But I want my output like

Sep 15 13:24 PMLKUP11200_35_0_27813H64  9496
Sep 15 13:24 PMJNR11200_52_0_27813      1304

I request you to help.

Thanks.

OK, so you need to sum $5 by group.

But files with the same base name can have different $6, $7, $8 (date/time) values. Which one would you like to keep?
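
One possible answer, purely as an illustration: keep the date/time of the first file seen for each base name and add up the sizes. The sketch below assumes that policy, assumes file names contain no spaces, and uses a made-up helper name, `sum_keep_first`. It also works when files of the same group are not adjacent in the listing:

```shell
# sum_keep_first: hypothetical helper; reads `ls -l`-style lines on stdin.
# Policy (an assumption): the first timestamp seen for a base name wins.
sum_keep_first() {
  awk '
    NF < 9 { next }                      # skip the "total N" line of ls -l
    {
      split($NF, a, ".")
      name = a[1]                        # name up to the first dot
      if (!(name in total)) {            # first time this base name appears
        stamp[name] = $6 FS $7 FS $8     # remember its date and time
        order[++n] = name                # remember input order
      }
      total[name] += $5
    }
    END {
      for (k = 1; k <= n; k++)
        print stamp[order[k]], order[k], total[order[k]]
    }
  '
}

sum_keep_first <<'EOF'
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 13:24 PMJNR11200_52_0_27813.dat0
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 12:44 PMJNR11200_52_0_27794.dat0
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMJNR11200_52_0_27813.idx0
EOF
# prints:
# Sep 15 13:24 PMJNR11200_52_0_27813 1304
# Sep 15 12:44 PMJNR11200_52_0_27794 432
```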

ruby -e 'Dir["*"].each{|x| print "#{File.mtime(x)} #{x} #{File.size(x)}\n"}'

Hi rdcwayx

Then I do not need the date and time fields ($6, $7 and $8). I just need the $5 and $9 fields. From the below

PMLKUP11200_35_0_27813H64 432
PMLKUP11200_35_0_27813H64 8192
PMLKUP11200_35_0_27813H64 872
PMJNR11200_52_0_27813     872
PMJNR11200_52_0_27813     432

the result should be

Sep 15 13:24 PMLKUP11200_35_0_27813H64  9496
Sep 15 13:24 PMJNR11200_52_0_27813      1304

Please help.

Thanks.

ls -l | awk '{split($NF,a,".");b[a[1]]+=$5}END{for (i in b) print i, b[i]}'

PMJNR11200_52_0_27794 1304
IMGBTREE27309_12272_11_0 432
PMLKUP11200_35_0_27794H64 1304
IMGBTREE27309_12272_11_0_0_PK 872
IMGBTREE27309_12272_11_1_0_FK 872
PMLKUP11200_35_0_27813H64 9064
PMJNR11200_52_0_27813 1304
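
A few caveats about the one-liner above: awk does not specify the order of `for (i in b)`, which is why the names come out shuffled; the "total" line that `ls -l` prints first is counted too; and `split` on "." cuts a name at its first dot. A hedged sketch that addresses these, assuming file names contain no spaces and that only the last extension should be dropped; `sum_by_base` is a made-up name:

```shell
# sum_by_base: hypothetical helper; reads `ls -l`-style lines on stdin.
sum_by_base() {
  awk '
    NF < 9 { next }                      # skip the "total N" line of ls -l
    {
      name = $NF
      sub(/\.[^.]*$/, "", name)          # drop only the last extension
      total[name] += $5
    }
    END { for (name in total) print name, total[name] }
  ' | LC_ALL=C sort                      # fixed order, since for-in order varies
}

sum_by_base <<'EOF'
-rw-r--r--   1 devinfo    devsupp       8192 Sep 15 13:24 PMLKUP11200_35_0_27813H64.dat1
-rw-r--r--   1 devinfo    devsupp          0 Sep 15 13:24 PMLKUP11200_35_0_27813H64.idx1
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMLKUP11200_35_0_27813H64.idx0
-rw-r--r--   1 devinfo    devsupp        432 Sep 15 13:24 PMJNR11200_52_0_27813.dat0
-rw-r--r--   1 devinfo    devsupp        872 Sep 15 13:24 PMJNR11200_52_0_27813.idx0
EOF
# prints:
# PMJNR11200_52_0_27813 1304
# PMLKUP11200_35_0_27813H64 9064
```

On a real directory this would be called as `ls -l | sum_by_base`.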