Concatenate many files whose names contain the same date

Gents,

I have a lot of files in a folder, where each file name includes its generation date. I would like to merge
all the files for each date into a single file per date.

List of files in the folder:

dsd01_121104.txt
dsd01_121105.txt
dsd01_121106.txt
dsd03_121104.txt
dsd03_121105.txt
dsd03_121106.txt
dsd04_121104.txt
dsd04_121105.txt
dsd04_121106.txt
dsd05_121104.txt
dsd05_121105.txt
dsd05_121106.txt
dsd06_121104.txt
dsd06_121105.txt
dsd06_121106.txt
dsd07_121104.txt
dsd07_121105.txt
dsd07_121106.txt
dsd08_121104.txt
dsd08_121105.txt
dsd08_121106.txt
dsd09_121104.txt
dsd09_121105.txt
dsd09_121106.txt
dsd10_121104.txt
dsd10_121105.txt
dsd10_121106.txt
dsd11_121104.txt
dsd11_121105.txt
dsd11_121106.txt
dsd12_121104.txt
dsd12_121105.txt
dsd12_121106.txt

Desired output, one concatenated file per date:

dsd_121104.txt " all files for day 121104 concatenated "
dsd_121105.txt " all files for day 121105 concatenated "
dsd_121106.txt " all files for day 121106 concatenated "

Also, please, how can I make a script to copy all the files for one date into a new folder?
I mean, create new folders, e.g. folder dsd_121104,
and copy the files for day 121104 into it, and the same for the rest of the days.

New folders:

dsd_121104 "inside all files for day 121104"
dsd_121105 "inside all files for day 121105"
dsd_121106 "inside all files for day 121106"

Thanks in advance.

#!/bin/ksh

# Collect the unique dates from the file names into a temp file
ls dsd*.txt | awk -F_ ' { print substr($2,1,6); } ' | sort | uniq > dates.list

# For each date, create a folder and append every matching file
# to one concatenated file inside it
while read file_dt
do
        mkdir -p dsd_${file_dt}
        for file in dsd*${file_dt}.txt
        do
                cat ${file} >> dsd_${file_dt}/dsd_${file_dt}.txt
        done
done < dates.list

exit 0

If you're running cat 9 times to read 9 files, you don't know what cat is for. You also don't need ls's help to use *, and you don't need a temp file.

#!/bin/ksh

printf "%s\n" dsd*.txt | awk -F_ ' { print substr($2,1,6); }'  | sort | uniq |
while read file_dt
do
        mkdir -p dsd_${file_dt}
        cat dsd*${file_dt}.txt >> dsd_${file_dt}/dsd_${file_dt}.txt
done

exit 0
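For what it's worth, the date extraction can also be done without awk at all, using the shell's own parameter expansion. A sketch under the same dsdNN_YYMMDD.txt naming assumption (the files created at the top are only hypothetical demo data, and it runs in a scratch directory so nothing real is touched):

```shell
#!/bin/ksh
# Demo setup: scratch directory with sample files (hypothetical data)
cd "$(mktemp -d)" || exit 1
printf 'one\n'   > dsd01_121104.txt
printf 'two\n'   > dsd03_121104.txt
printf 'three\n' > dsd01_121105.txt

# Extract the unique dates with parameter expansion instead of awk
for f in dsd[0-9]*_*.txt
do
        d=${f#*_}                      # strip everything up to and including "_"
        printf '%s\n' "${d%.txt}"      # strip the ".txt" suffix, leaving the date
done | sort -u |
while read file_dt
do
        # dsd[0-9]* deliberately excludes the dsd_YYMMDD.txt output files
        cat dsd[0-9]*_"${file_dt}".txt > "dsd_${file_dt}.txt"
done
```

The `dsd[0-9]*` pattern matters on a re-run: a plain `dsd*` glob would match the previously generated dsd_121104.txt and feed the output back into itself.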

[edit] Slight changes to how the filenames are fed to awk.


Corona688, I think ls is required because we don't want awk to read the files themselves, just the list of file names, so we can take a substring of each name. Am I correct?

EDIT: never mind, I saw your modification. Thank you.

I just noticed and fixed the difference myself, but used the printf shell builtin instead of the ls external utility.
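Right: with either ls or the printf builtin, awk only ever sees the file names on its stdin, never the file contents. A quick illustration with two sample names (the names are made up):

```shell
# Both ls and printf would feed awk the same list of names, one per line;
# printf is a shell builtin, so no external process is spawned.
printf '%s\n' dsd01_121104.txt dsd03_121105.txt |
awk -F_ '{ print substr($2,1,6) }'
# → 121104
#   121105
```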

Thanks Corona688 and bipinajith

I notice that the concatenated file is saved in the created folder. The main point is to make a copy of all the files for the same date in a new folder, while the concatenated file is saved in the original folder with all the source files. Thanks for your help.

---------- Post updated 11-23-12 at 05:46 AM ---------- Previous update was 11-22-12 at 01:51 PM ----------

Thanks guys

I have made a small change to your script, and now it is working as I want. Thanks again.

printf "%s\n" dsd*.txt | awk -F_ ' { print substr($2,1,6); }' | sort | uniq |
while read file_dt
do
        mkdir -p folder/dsd_${file_dt}
        cp dsd*${file_dt}.txt folder/dsd_${file_dt}

        cat dsd*${file_dt}.txt >> folder/dsd_${file_dt}.txt
done

exit 0
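For the record, the same script can be tightened up a little. A sketch with quoted expansions, sort -u in place of sort | uniq, and > instead of >> so a re-run doesn't duplicate data; the files created at the top are only hypothetical demo setup in a scratch directory:

```shell
#!/bin/ksh
# Demo setup: scratch directory with sample files (hypothetical data)
cd "$(mktemp -d)" || exit 1
printf 'a\n' > dsd01_121104.txt
printf 'b\n' > dsd03_121104.txt
printf 'c\n' > dsd01_121105.txt

printf '%s\n' dsd[0-9]*.txt | awk -F_ '{ print substr($2,1,6) }' | sort -u |
while read file_dt
do
        mkdir -p "folder/dsd_${file_dt}"
        cp dsd[0-9]*"${file_dt}".txt "folder/dsd_${file_dt}"         # per-date copies
        cat dsd[0-9]*"${file_dt}".txt > "folder/dsd_${file_dt}.txt"  # concatenated file
done
```

The quoting only matters if a date string could ever contain whitespace or glob characters, but it costs nothing and is a good habit.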

A possible alternative:

printf "%s\n" /tmp/dsd*.txt |
awk -F"dsd" '{FS="_";$0=$0;a[$2]++}END{for (i in a) {cmd="cat /tmp/dsd*_"i" > /tmp/dsd_"i"";system(cmd)}}'
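The trick in that one-liner is collecting the unique date keys in an awk array, then shelling out once per date from the END block. The same idea, spelled out a bit more readably (a sketch under the same naming assumptions; the first lines are only hypothetical demo setup in a scratch directory):

```shell
#!/bin/ksh
# Demo setup: scratch directory with sample files (hypothetical data)
cd "$(mktemp -d)" || exit 1
printf 'x\n' > dsd01_121104.txt
printf 'y\n' > dsd03_121104.txt

printf '%s\n' dsd[0-9]*.txt |
awk -F_ '{ seen[substr($2,1,6)]++ }        # one array entry per unique date
END {
        for (d in seen)
                system("cat dsd[0-9]*_" d ".txt > dsd_" d ".txt")
}'
```

One system() call per date, and the dsd[0-9]* glob keeps the generated dsd_YYMMDD.txt files from being swept back into the concatenation on a re-run.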

Thanks Ripat