I have many files in a directory, named in the format shown below (the groups are separated by blank lines only to make them easier to tell apart). I want to find the duplicate filenames and write them to a new file, one per line. I have tried several scripts without success.
You are right, strictly speaking there are no duplicate filenames. But the filenames that include the BBB and DDD strings are actually parts of a single file, and for me these count as duplicates. The files were created by a conversion program, which added sequence numbers to the filenames. E.g.
This code works, but is there another way that considers only the BBB.AHE, BBB.AHN and BBB.AHZ strings? The number of BBB.AHE files (and of the others) shows that those are duplicate files.
Maybe the [ 0"$B" -gt 0 ] part of your script can be modified, but how?
As much as I would like to help, I can't, as I don't understand what you want. Show meticulously what input becomes what output, and describe the algorithm/logic/reasoning behind it.
Here, the numbers represent the date and time. GR is the network code. GAZ, SVRC, MALT and GMLD are station names. BH? and HH? are the components. The rest is not important.
As you can see, in the first three groups each file (BHE, BHN, BHZ or HHE, HHN, HHZ) contains the full data. Those are fine for me. But the last group contains more than one HHE, HHN and HHZ file. Those were split into parts by the conversion program.
My aim is to find every case where there is more than one XXXX.HHE (or XXXX.BHE), XXXX.HHN (or XXXX.BHN) or XXXX.HHZ (or XXXX.BHZ) file, and to list those files in a file.
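One possible way to express that aim, offered as a sketch rather than a tested solution. It assumes every filename contains a STATION.COMPONENT token such as GMLD.HHE, with letters-only station names and components matching [BH]H[ENZ]. The function name list_split_files and the sample names in the demo are made up for illustration; the real filename layout may differ.

```shell
#!/bin/sh
# Sketch only. Assumption: each name contains a STATION.COMPONENT token
# (e.g. GMLD.HHE) somewhere after the date/time and network fields.

list_split_files() {
    # Reads file names on stdin, one per line, and prints every name whose
    # STATION.COMPONENT key occurs more than once, in input order.
    awk '
        match($0, /[A-Z]+\.[BH]H[ENZ]/) {
            key = substr($0, RSTART, RLENGTH)   # e.g. "GMLD.HHE"
            count[key]++                        # how many files share this key
            names[NR] = $0
            keys[NR] = key
            last = NR
        }
        END {
            for (i = 1; i <= last; i++)
                if ((i in names) && count[keys[i]] > 1)
                    print names[i]
        }
    '
}

# Demo with hypothetical names (not taken from the real directory).
list_split_files <<'EOF'
20100101.120000.GR.GAZ.BHE.sac
20100101.120000.GR.GAZ.BHN.sac
20100101.120000.GR.GAZ.BHZ.sac
20100101.120000.GR.GMLD.HHE.0001.sac
20100101.120000.GR.GMLD.HHE.0002.sac
20100101.120000.GR.GMLD.HHN.0001.sac
EOF
```

In the demo only the two GMLD.HHE names are printed, because every other key occurs once. On a real directory with no newlines in the filenames, something like `ls | list_split_files > split_files.txt` would write the list to a file.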
If your flavor of Unix supports uniq with the -D option, using it should meet your requirement of listing all duplicate file names, ignoring the first 26 characters.
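A minimal sketch of that idea, assuming GNU uniq, where -D prints all repeated lines and -s N skips the first N characters when comparing. Since uniq only compares adjacent lines, the names are first sorted on the part after the skipped prefix. The function name list_dups_by_suffix and the sample names are hypothetical, and the sketch assumes filenames without blanks or newlines.

```shell
#!/bin/sh
# Sketch only; relies on uniq -D, which is not available on every system.

list_dups_by_suffix() {
    # $1: number of leading characters to ignore (defaults to 26).
    skip=${1:-26}
    # Sort on the text after the ignored prefix so equal suffixes become
    # adjacent, then let uniq print every line that has a repeated suffix.
    LC_ALL=C sort -k "1.$((skip + 1))" | uniq -D -s "$skip"
}

# Demo with made-up names whose first 26 characters are date/time/network.
printf '%s\n' \
    '2010.001.12.00.00.0000.GR.GMLD.HHE' \
    '2010.001.12.00.00.0000.GR.GAZ.BHE' \
    '2010.001.12.30.00.0001.GR.GMLD.HHE' |
    list_dups_by_suffix 26
```

The demo prints the two GMLD.HHE names and drops the single GAZ.BHE one. For a real directory, `ls | list_dups_by_suffix 26 > duplicates.txt` would be the equivalent call.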