Dear all,
I want to list all duplicate files (files with the same name) present anywhere under the subdirectories. I used the following command, and it worked fine:
find . -type f -print | sort -i
It gives output like the following sample:
./out
./cas/catch.dat
./cas/File1.dat
./baab/bumber.dat
./baab/File1.dat
./uday/hahah.dat
./uday/samp/CAS/test.txt
./uday/samp/File1.dat
./uday/bhas/File1.dat
./uday/File1.dat
./aaaa/sample.dat
./aaaa/File1.dat
./aaaa/a12/File1.dat
============================
However, I want to sort this so that all the duplicate files are listed on a single line. In other words, I want to sort on a substring of each line.
The sort key should be the substring from the last occurrence of "/" to the end of the line. Is there a straightforward way to do this?
Thanks in advance,
uday
Ygor
March 24, 2011, 2:05am
2
Try...
find . -type f -print | awk -F/ '{print $NF, $0}' | sort -i
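The command above leaves the file name in front of every path. If only the original paths are wanted, ordered by file name, the same idea can be extended by cutting the key off again after sorting. This is a minimal sketch, assuming no path contains a tab or a newline; `sort_by_basename` is a hypothetical helper name, not something from this thread.

```shell
# Sketch: sort paths by the text after the last "/".
# sort_by_basename is a hypothetical name used for illustration only.
# Steps: prepend the file name and a tab (awk), sort on that key and
# then on the full path (sort), and finally drop the key (cut).
sort_by_basename() {
    awk -F/ '{ print $NF "\t" $0 }' |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2 |
        cut -f2-
}

# Usage on a few paths from the sample output:
printf '%s\n' ./cas/catch.dat ./cas/File1.dat ./baab/bumber.dat ./baab/File1.dat |
    sort_by_basename
```

With `LC_ALL=C`, upper-case names sort before lower-case ones, so both File1.dat paths come first. To run it on a real tree, pipe `find . -type f -print` into `sort_by_basename`.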
cgkmal
March 24, 2011, 2:20am
3
Hi tvsubhaskar,
If you only want to show all file names from the sample output on a single line:
awk '{sub(".*/",""); sub("^","/");printf "%s ", $0}' inputfile
/out /catch.dat /File1.dat /bumber.dat /File1.dat /hahah.dat /test.txt /File1.dat /File1.dat /File1.dat /sample.dat /File1.dat /File1.dat
If you want to show only the unique file names on a single line:
awk -F"/" '$NF~/\./{R[$NF]=$NF}END{ for(i in R) printf "%s ", "/"i}' inputfile
/hahah.dat /test.txt /bumber.dat /sample.dat /File1.dat /catch.dat
Or only the unique file names, with the number of times each appears in ():
awk -F"/" '$NF~/\./{R[$NF]++}END{ for(i in R) printf "%s ", "/"i"("R[i]")"}' inputfile
/hahah.dat(1) /test.txt(1) /bumber.dat(1) /sample.dat(1) /File1.dat(7) /catch.dat(1)
Or only the duplicate file names (those that appear more than once), with the number of times each appears in ():
awk -F"/" '$NF~/\./{R[$NF]++}END{ for(i in R) if(R[i]>1) printf "%s ", "/"i"("R[i]")"}' inputfile
/File1.dat(7)
Hope it helps,
Regards
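As a possible alternative to the awk counters above, the duplicated names can also be found with standard tools only. This is a rough sketch, assuming no file name contains a newline; `duplicate_names` is a hypothetical helper name.

```shell
# Sketch: print each file name that occurs more than once.
# duplicate_names is a hypothetical name used for illustration only.
duplicate_names() {
    # sed strips everything up to and including the last "/";
    # uniq -d then keeps one copy of each repeated (sorted) line.
    sed 's|.*/||' | LC_ALL=C sort | uniq -d
}

# Usage on a few paths from the sample output:
printf '%s\n' ./cas/File1.dat ./baab/bumber.dat ./baab/File1.dat | duplicate_names
# prints: File1.dat
```

Replacing `uniq -d` with `uniq -c` would instead prefix every name with its number of occurrences.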
while read -r file; do echo "${file%/*} ${file##*/}"; done < inputfile | sort -k2 | awk '{printf "%s%s/%s", (NR==1 ? "" : (a==$2 ? " " : "\n")), $1, $2; a=$2} END{print ""}'
I guess you want to show duplicate files listed on the same line.
$ awk -F [/] '{++a[$NF];c[$NF]=c[$NF] " " $0} END{for(i in c){ if (a[i]>1){print c[i],a[i]}}}' input
./aaaa/sample.dat ./sample.dat 2
./baab/bumber.dat ./test/tes/baab/bumber.dat ./baab/bumber.dat 3
./cas/File1.dat ./baab/File1.dat ./uday/samp/File1.dat ./uday/bhas/File1.dat ./uday/File1.dat ./aaaa/File1.dat ./aaaa/a12/File1.dat 7
Hope this helps.
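One thing to keep in mind with this kind of grouping is that awk does not specify the order in which `for (i in array)` visits the keys, so the lines can come out in a different order from one awk to another. A minimal sketch that labels each group with its file name and sorts the result is shown here; `group_duplicates` is a hypothetical helper name, and paths are assumed to contain no newlines.

```shell
# Sketch: one output line per duplicated file name, in a stable order.
# group_duplicates is a hypothetical name used for illustration only.
group_duplicates() {
    awk -F/ '
        # Count each file name and collect the paths that carry it.
        { count[$NF]++; paths[$NF] = paths[$NF] " " $0 }
        END {
            for (name in count)
                if (count[name] > 1)
                    print name ":" paths[name]
        }
    ' | LC_ALL=C sort   # sort so the group order does not depend on awk
}

# Usage with made-up paths (not from the thread):
printf '%s\n' ./a/x.dat ./b/x.dat ./b/y.dat ./c/z.dat ./d/z.dat | group_duplicates
```

For the made-up input above this should print one line for x.dat and one for z.dat, and nothing for y.dat, which occurs only once.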
find . -type f | sort -t"/" -k3