Little
August 7, 2013, 7:10am
1
Hi,
I have a directory that contains some files and a subdirectory. I am writing only the file names to a file using the code below.
ls -ltr | grep "^-" | awk '{print $NF}' > /home/file_list$$
cat /home/file_list$$
s1_abc.txt
s2_def.xls
s3_def.xls
As you can see, there is one .txt file and two .xls files.
I want to keep any one file for each extension, i.e. a single file per extension, so the file list
will have only the following contents:
s1_abc.txt
s2_def.xls
How can I do this?
Hello,
Could you please try the following code?
Let's say this is the file with the details.
$ cat remove_duplicate_extension
s1_abc.txt
s2_def.xls
s3_def.xls
s4_def.txs
s5_def.txt
s6.def.excel
s7.def.lxs
Here is the script for the same.
$ cat remove_duplicate_extension.ksh
#value_check_b=0
while read line
do
a=`echo $line`
value_check_a=`echo $line | awk -F. '{print$2}'`
#echo $value_check_a
if [[ $value_check_a == $value_check_b ]]
then
echo ""
else
echo $line
fi
value_check_b=`echo $line | awk -F "." '{print$2}'`
done < "remove_duplicate_extension"
Output should be as follows.
$ ksh remove_duplicate_extension.ksh
s1_abc.txt
s2_def.xls
s4_def.txs
s5_def.txt
s6.def.excel
Please let me know if this helps.
Thanks,
R. Singh
Little
August 7, 2013, 7:58am
3
But your output still contains two .txt files.
I want a single file for each extension, regardless of the file names.
perl -e 'opendir(DIR , $ARGV[0]); chdir $ARGV[0]; while (readdir DIR ){($f)=$_=~/\.([^\.]+)$/;if ((-f $_ )&& (! defined $seen{$f})){$seen{$f}++; print "$_\n";}}' $DIRECTORY
sed 's/\(.*\)\./\1%/g' file_list | sort -t'%' -u -k2 | tr '%' '.'
The sample below uses the same list as Ravinder's. It is not as elegant as balajesuri's code, but it should be easy enough to understand.
[root@centosgeek ~]# for ext in $(awk -F"[._]" '{print $NF}' testfile2 | sort -u);
> do
> grep "\.$ext\$" testfile2 | head -1
> done
s6.def.excel
s7.def.lxs
s4_def.txs
s1_abc.txt
s2_def.xls
[root@centosgeek ~]#
$ cat a.txt
s1_abc.txt
s2_def.xls
s3_def.xls
s4_def.txs
s5_def.txt
s6.def.excel
s7.def.lxs
$ awk -F\. '!a[$NF]++' a.txt
s1_abc.txt
s2_def.xls
s4_def.txs
s6.def.excel
s7.def.lxs
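For anyone unfamiliar with the idiom, here is a longhand sketch of what `!a[$NF]++` does. The function name `keep_first_per_ext` and the array name `seen` are mine, not from the posts above. It assumes one file name per line and treats everything after the last dot as the extension, so a name with no dot at all counts as its own "extension".

```shell
# Longhand equivalent of: awk -F. '!a[$NF]++'
# keep_first_per_ext is a hypothetical helper name used only for illustration.
keep_first_per_ext() {
    awk -F. '
    {
        ext = $NF               # last dot-separated field, i.e. the extension
        if (!(ext in seen)) {   # true only the first time this extension appears
            print               # keep that line
        }
        seen[ext]++             # remember the extension so later lines are skipped
    }' "$@"
}

# Reads file names from stdin when no file argument is given.
printf '%s\n' s1_abc.txt s2_def.xls s3_def.xls s5_def.txt | keep_first_per_ext
```

With the four names above this should print only s1_abc.txt and s2_def.xls. The short form works the same way: `a[$NF]++` evaluates to the old count (0 on first sight), and `!` turns that 0 into a true condition, which triggers awk's default action of printing the line.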
itkamaraj:
$ awk -F\. '!a[$NF]++'
Your simple code has an easy variation as well: removing the ! prints the lines that would be excluded instead, i.e. every file after the first one seen for each extension. This is useful if those files are to be deleted.
awk -F. 'a[$NF]++' a.txt
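If the excluded files really are to be deleted, it may be safer to preview the commands first. The following is only a sketch: the function name `list_duplicate_ext` is made up for illustration, the `echo` in front of `rm` is a dry-run guard I added, and it assumes one file name per line with no embedded newlines.

```shell
# list_duplicate_ext is a hypothetical helper name wrapping the one-liner above.
# It prints every line whose extension has already been seen on an earlier line.
list_duplicate_ext() {
    awk -F. 'a[$NF]++' "$@"
}

# Dry run: print the rm commands instead of executing them.
printf '%s\n' s1_abc.txt s2_def.xls s3_def.xls s5_def.txt |
    list_duplicate_ext |
    while IFS= read -r f; do
        echo rm -- "$f"
    done
```

For the four sample names this should print `rm -- s3_def.xls` and `rm -- s5_def.txt`. Once the list looks right, dropping the `echo` would run the deletions for real, from inside the directory that holds the files.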