I have a situation. I have 6 or 7 files in one directory, and I need to extract all the lines that exist in every one of them. In other words, I need to extract the lines common to all these files and put them in a separate file.
Please help. I know it could be done with the cut, sort and uniq commands, but that will take more time every time the script is executed. I would like a quick shortcut method.
Sorry, sir.
My requirement is that I want the lines which are present in all of these 'n' files. That is, every line that appears in the output must be present in each of the 'n' files.
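first=1
for fname in 1.txt 2.txt 3.txt 4.txt
do
    if [ "$first" -eq 1 ]
    then
        # Seed the result with the first file, sorted as comm requires.
        # A flag is used instead of testing for an empty result file, so an
        # empty intersection is not re-seeded from the next file.
        sort "$fname" > comm_data.txt
        first=0
    else
        sort "$fname" > tmp2
        # Keep only the lines present in both the running result and this file.
        comm -12 comm_data.txt tmp2 > tmp3
        mv tmp3 comm_data.txt
    fi
done
rm -f tmp2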
If the contents of your files are already sorted, you can use the following:
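first=1
for fname in 1.txt 2.txt 3.txt 4.txt
do
    if [ "$first" -eq 1 ]
    then
        # Seed the result with the first (already sorted) file.
        cp "$fname" comm_data.txt
        first=0
    else
        # Keep only the lines that are also present in this file.
        comm -12 comm_data.txt "$fname" > tmp3
        mv tmp3 comm_data.txt
    fi
done
cat comm_data.txt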
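The loops above hard-code the file list. As a rough sketch only, the same sort/comm idea can be wrapped in a function that takes any number of files as arguments. The function name common_lines_comm and the use of mktemp for scratch files are my own assumptions, not part of the posts above; mktemp is not in POSIX but is available on most current systems.

```shell
#!/bin/sh
# common_lines_comm (hypothetical name): print the lines present in
# every file given as an argument, using sort and comm.
common_lines_comm() {
    [ "$#" -gt 0 ] || return 1
    # Scratch files; mktemp is assumed to be available.
    result=$(mktemp) || return 1
    sorted=$(mktemp) || return 1
    scratch=$(mktemp) || return 1
    first=1
    for fname in "$@"
    do
        if [ "$first" -eq 1 ]
        then
            # Seed the running result with the first file.
            sort -u "$fname" > "$result"
            first=0
        else
            sort -u "$fname" > "$sorted"
            # Keep only lines found in both the running result and this file.
            comm -12 "$result" "$sorted" > "$scratch"
            cat "$scratch" > "$result"
        fi
    done
    cat "$result"
    rm -f "$result" "$sorted" "$scratch"
}
```

It could then be called as, for example, common_lines_comm *.txt > comm_data.txt. Because each file is passed through sort -u, a line repeated inside one file is reported only once.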
Thanks ripat, it's working perfectly. But I have one small concern.
Actually, I can't be sure how many files will be generated each time the script runs. So I will store the number of files in a variable, and in this code:
cat f1 f2 f3 | awk '{a[$0]++} END{for (i in a) if (a[i]==3) print i}'
I think I have to use the value of that variable in place of the 3 here. I tried replacing the 3 with the variable, but it does not seem to work. I even tried the 'awk -v' option, but in vain.
Can you please help me out with this quickly? The requirement is quite urgent.
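One possible reason the substitution failed is that the shell does not expand variables inside the single-quoted awk program. A minimal sketch of passing the count in with -v is below. The function name common_lines_count and the awk variable n are made up for illustration, and the count is taken from the number of arguments rather than from a separately stored variable.

```shell
#!/bin/sh
# common_lines_count (hypothetical name): print every line whose total
# number of occurrences across the given files equals the number of files.
common_lines_count() {
    [ "$#" -gt 0 ] || return 1
    nfiles=$#
    # -v copies the shell value into the awk variable n before the
    # program starts, so the single-quoted program can stay unchanged.
    cat "$@" | awk -v n="$nfiles" '
        { a[$0]++ }
        END { for (i in a) if (a[i] == n) print i }
    '
}
```

For example: common_lines_count f1 f2 f3 > comm_data.txt. Like the one-liner it is based on, this counts total occurrences, so it assumes no line is repeated within a single file. The order of the output lines is not defined.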
I ran both awk solutions and they seemed to work. There is one aspect that may be troubling. If the files contain no duplicates, then all is well. However, here is an example where the trouble might occur. I am using radoulov's code since it is a bit shorter:
#!/usr/bin/env sh
# @(#) user2 Demonstrate finding lines in common.
# ____
# /
# | Infrastructure BEGIN
set -o nounset
echo
## The shebang using "env" line is designed for portability. For
# higher security, use:
#
# #!/bin/sh -
## Use local command version for the commands in this demonstration.
set +o nounset
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) awk
set -o nounset
for file in f*
do
echo
echo " -- $file --"
cat -n $file
done
# Use nawk or /usr/xpg4/bin/awk on Solaris.
# | Infrastructure END
# \
# ---
echo
echo " Results from awk:"
filecnt=$( ls -1 f* | wc -l )
awk '
    END {
        for (r in _)
            if (_[r] == ARGC - 1)
                print r
    }
    { _[$0]++ }
' f*
exit 0
Producing:
% ./user2
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
GNU Awk 3.1.4
-- f1 --
1 a
2 b
3 x
4 x
-- f2 --
1 a
2 d
3 y
4 y
Results from awk:
x
y
a
Note that "x" and "y" are not common to the files, only "a". In cases like this, more work would be necessary to ensure that a line was common to all files, and not simply replicated the appropriate number of times in total among some of the files ... cheers, drl
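One way to do that extra work, offered only as a sketch, is to count each distinct line at most once per file by keying a second array on the file name as well as the line. The function name common_lines_strict and the array names seen and count are hypothetical. It assumes that each file is named only once on the command line and that no file name contains "=" (awk would treat such an argument as a variable assignment), so that ARGC - 1 is the number of files.

```shell
#!/bin/sh
# common_lines_strict (hypothetical name): print lines that occur in
# every given file, ignoring repeats within any single file.
common_lines_strict() {
    [ "$#" -gt 0 ] || return 1
    awk '
        # The first time a line is seen in the current file, credit it
        # once; later repeats in the same file are skipped.
        !seen[FILENAME, $0]++ { count[$0]++ }
        END {
            # ARGC - 1 is the number of file arguments.
            for (line in count)
                if (count[line] == ARGC - 1)
                    print line
        }
    ' "$@"
}
```

With the f1 and f2 shown above, common_lines_strict f1 f2 should print only "a", since "x" and "y" are each credited once rather than twice.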