How to sort files by the timestamp in the filename in a shell script?

feilhk · March 4, 2014, 10:47am

Here is the original shell script:

#ln_file_name=`echo $ld_interface_date"_"${8}".csv"`
#ln_file_name=`echo 201202011527_HL_HLTM1_B04A.csv`
ln_file_name="*"`echo ${7}".csv"`
get_file_list_1=$log_path"tm1_file_list.gfl1"

        cd ${source_path}
        echo "Try to find any file exist in the source_path"
        ls $ln_file_name > $get_file_list_1 
         # Count the number of files exist in source_path
     if ( test -s $get_file_list_1 )

How do I change the line "ls $ln_file_name > $get_file_list_1" so that the list is sorted by the timestamp in the filename?

The filename format is "alex_li_20140304102033_TM1", where 20140304102033 is the timestamp.

For example:
alex_li_20140304102033_TM1.csv

BB_li_20140304112033_TM1.csv

BX_li_20140304082043_TM1.csv

BC_li_20140304202033_TM1.csv

After the sort, the list should have the oldest file first:

BX_li_20140304082043_TM1.csv

alex_li_20140304102033_TM1.csv

BB_li_20140304112033_TM1.csv

BC_li_20140304202033_TM1.csv

shamrock · March 4, 2014, 11:51am

ls can sort by a file's modification time (ls -t), so it only gives the order you want if each file's modification time matches the timestamp embedded in its name. Otherwise you'd have to use awk or perl to sort on that specific part of the filename...
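
To illustrate the point, here is a minimal sketch showing that ls orders by modification time and ignores the timestamp in the name. The two file names and the dates given to touch are made up for the demonstration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Scratch directory so no real files are touched (hypothetical names below).
demo_dir=$(mktemp -d)

# The modification times deliberately disagree with the names:
# "a_..." carries the older name timestamp but the newer mtime.
touch -t 201403050000 "$demo_dir/a_li_20140304082043_TM1.csv"
touch -t 201403040000 "$demo_dir/b_li_20140304202033_TM1.csv"

# ls -tr lists the oldest modification time first; the name plays no part.
by_mtime=$(cd "$demo_dir" && ls -tr)
echo "$by_mtime"

rm -r "$demo_dir"
```

Here the file whose name carries the newer timestamp is listed first, because its modification time is the older one.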

feilhk · March 4, 2014, 11:58am

Yes, I am trying to sort on that specific part of the filename. Sorting by the last-modified timestamp I can already do.

I tried ls -t | awk -F '_' '{print $3}'

but it outputs only the timestamp, e.g. 20140304202022

How do I keep the full name, e.g. alex_li_20140304202022_TM1.csv, in the output of "ls $ln_file_name > $get_file_list_1" and still have it sorted? Printing only the timestamp would make my program fail, lol.

Thanks :o

shamrock · March 4, 2014, 1:59pm

Try this awk script...

ls -1t *.csv | awk -F"_" '{
    x[NR] = $3
    y[$3] = $0
} END {
    for (i=1; i<NR; i++)
        for (j=1; j<(NR+1-i); j++)
            if (x[j] > x[j+1]) {t = x[j]; x[j] = x[j+1]; x[j+1] = t}
    for (i=1; i<=NR; i++) print y[x[i]]
}'

Chubler_XL · March 4, 2014, 3:39pm

Try this

ls *.csv | sort -t_ -k3,3n
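
As a sketch of how this works, the same sort can be run on the sample names from the first post without touching any real files. The function name sort_by_third_field is only illustrative, and printf stands in for ls.

```shell
#!/bin/sh
# Sort lines on the third underscore-separated field, numerically.
#   -t_     use "_" as the field separator
#   -k3,3n  the key starts and ends at field 3 and is compared as a number
sort_by_third_field() {
    sort -t_ -k3,3n
}

# Sample names from the question, fed in via printf instead of ls.
sorted=$(printf '%s\n' \
    alex_li_20140304102033_TM1.csv \
    BB_li_20140304112033_TM1.csv \
    BX_li_20140304082043_TM1.csv \
    BC_li_20140304202033_TM1.csv | sort_by_third_field)

echo "$sorted"
```

The whole line is kept and only the comparison key changes, so the output still contains full file names, oldest timestamp first.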

shamrock · March 4, 2014, 4:06pm

Gosh, what was I thinking :o Yours is the best, Chubler_XL...

feilhk · March 4, 2014, 10:46pm

Thanks for your answer, it works. But I discovered that the format differs a bit between users:

alex_li_20140304121212_tm1.csv

alex_cf_li_20140304121212_tm1.csv

Some users have an extra underscore in the name, so counting from the end is better: the timestamp is always right before _tm1.csv. How do I change ls *.csv | sort -t_ -k3,3n to sort on the segment before _tm1.csv?

Chubler_XL · March 4, 2014, 11:01pm

Probably best to use sed to prepend the numeric value, sort and then cut the number:

ls *.csv | sed 's/.*_\([^_]*\)_[tT][mM]1\.csv$/\1 &/' | sort -n | sed 's/^[^ ]* //'
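
A step-by-step sketch of this prepend, sort and strip idea follows. The function name and the sample timestamps are made up for the illustration, and printf stands in for ls.

```shell
#!/bin/sh
# Order file names by the field just before _tm1.csv / _TM1.csv.
sort_by_embedded_timestamp() {
    # 1. prepend the timestamp: "NAME" becomes "TIMESTAMP NAME"
    sed 's/.*_\([^_]*\)_[tT][mM]1\.csv$/\1 &/' |
        # 2. sort numerically on the leading number
        sort -n |
        # 3. drop everything up to and including the first space
        sed 's/^[^ ]* //'
}

# Hypothetical names; the second one has an extra underscore.
sorted=$(printf '%s\n' \
    alex_li_20140304121212_tm1.csv \
    alex_cf_li_20140304101010_tm1.csv \
    BX_li_20140304082043_TM1.csv | sort_by_embedded_timestamp)

echo "$sorted"
```

The number of underscores before the timestamp does not matter, because the pattern is anchored to the end of the name.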

shamrock · March 4, 2014, 11:12pm

You could try my awk script here...

ls -1t *.csv | awk -F"_" '{
   x[NR] = $(NF-1)
   y[$(NF-1)] = $0
} END {
   for (i=1; i<NR; i++)
       for (j=1; j<(NR+1-i); j++)
           if (x[j] > x[j+1]) {t = x[j];x[j] = x[j+1];x[j+1] = t}
   for (i=1; i<=NR; i++) print y[x[i]]
}'
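
One thing to watch with the script above: y is keyed by the timestamp, so two files sharing a timestamp would overwrite each other. A shorter variant, sketched here under the assumption that the timestamp is always the second-to-last "_" field, lets awk pull out the key and leaves the ordering to sort. The function name is illustrative only.

```shell
#!/bin/sh
# Prefix each name with its second-to-last "_" field, sort on it, drop it.
sort_by_second_last_field() {
    awk -F_ '{ print $(NF-1) "\t" $0 }' |   # "TIMESTAMP<TAB>NAME"
        sort -n |                           # numeric sort on the leading timestamp
        cut -f2-                            # keep only the original name
}

# Two of these hypothetical names share the same timestamp.
sorted=$(printf '%s\n' \
    alex_li_20140304121212_tm1.csv \
    alex_cf_li_20140304121212_tm1.csv \
    BX_li_20140304082043_TM1.csv | sort_by_second_last_field)

echo "$sorted"
```

Each input line is carried through as its own record, so names with equal timestamps are both kept.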

feilhk · March 4, 2014, 11:23pm

I am new to shell scripting. Could you explain a little about the code? I have no idea how it works...

pravin27 · March 5, 2014, 6:00am

ls *.csv | awk '{match($0,/[0-9].*_/); print substr($0,RSTART,RLENGTH-1) "\t" $0}' | sort -n | cut -f2-