Hi all,
I'm having trouble finding multiple words in .txt files. Please help me with a solution.
I have 10,000 .txt files, and in each file I have to search for several specific words, such as
(data, machine learning, clustering). The search should be case-insensitive, because the same word appears in the files in both upper and lower case. The main objective is to copy a file to another folder if a keyword is present.
Thank you in advance.
Regards
Create a file, call it patterns, and put one search string per line:
machine
learning
data
cluster
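Then loop over the files and let grep read the search strings from that file (keep the patterns file outside the directory being searched):
cd /path/to/many/files
for fname in *.txt
do
    # -i: ignore case, -F: fixed strings, -q: quiet, -f: read patterns from a file
    grep -iFq -f /path/to/patterns "$fname" &&
        cp "$fname" /path/to/another/place/
done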
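For what it's worth, here is a minimal sketch of the same pattern-file idea wrapped in a function. The name copy_matching and its three arguments are made up for illustration, and it only assumes a POSIX shell with grep and cp:

```shell
# copy_matching PATTERN_FILE SRC_DIR DEST_DIR   (hypothetical helper)
# Copies every regular *.txt file in SRC_DIR that contains at least one of
# the fixed strings listed in PATTERN_FILE (one per line), ignoring case.
copy_matching() {
    patterns=$1
    src=$2
    dest=$3
    mkdir -p "$dest" || return 1
    for fname in "$src"/*.txt
    do
        # Skip directories and the literal "*.txt" left by an unmatched glob.
        [ -f "$fname" ] || continue
        # -F fixed strings, -i ignore case, -q quiet, -f patterns from file
        if grep -Fiq -f "$patterns" "$fname"
        then
            cp "$fname" "$dest"/
        fi
    done
}
```

Called as copy_matching /path/to/patterns /path/to/many/files /path/to/another/place, it leaves the originals where they are and quotes every file name, so names with spaces are handled.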
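Your requirement is not yet clear.
The following requires at least two matching lines in a file, where a line matches if it contains any of the keywords (data, machine learning, clustering).
For the copy action we put a loop around it. The echo only prints each cp command, so remove it once the output looks right.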
for file in *.txt
do
    if [ -f "$file" ] &&
       [ "$(grep -Eic 'data|machine learning|clustering' "$file")" -gt 1 ]
    then
        echo cp "$file" /another/folder
    fi
done
With a very large number of files this might give "too many arguments". In that case use find, feeding a read loop:
find . \! -name . -prune -type f -name "*.txt" |
while IFS= read -r file    # IFS= and -r keep spaces and backslashes in names intact
do
    if [ "$(grep -Eic 'data|machine learning|clustering' "$file")" -gt 1 ]
    then
        echo cp "$file" /another/folder
    fi
done
The "\! -name . -prune" part prevents find from descending into subfolders.
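If you would rather avoid the pipe and the read loop, a possible variant lets find hand the file names to a small inline script via -exec ... {} +. This is only a sketch: the function name copy_if_min_hits, its arguments, and the default threshold of 2 matching lines are assumptions made for the example.

```shell
# copy_if_min_hits SRC_DIR DEST_DIR [MIN_LINES]   (hypothetical helper)
# Copies the *.txt files directly inside SRC_DIR (no subfolders) that have
# at least MIN_LINES lines matching any keyword, ignoring case.
copy_if_min_hits() {
    src=$1
    min=${3:-2}
    mkdir -p "$2" || return 1
    dest=$(cd "$2" && pwd) || return 1   # absolute path, because we cd below
    (
        cd "$src" || exit 1
        # "! -name . -prune" keeps find from descending into subfolders.
        find . \! -name . -prune -type f -name '*.txt' -exec sh -c '
            dest=$1 min=$2
            shift 2
            for file
            do
                # grep -c prints the number of matching lines
                hits=$(grep -Eic "data|machine learning|clustering" "$file")
                if [ "$hits" -ge "$min" ]
                then
                    cp "$file" "$dest"/
                fi
            done
        ' sh "$dest" "$min" {} +
    )
}
```

For example, copy_if_min_hits . /another/folder would copy files with two or more matching lines, and passing 1 as a third argument turns it into a plain "any keyword present" test. Because find passes the names as arguments, unusual characters in file names are not a problem.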
This should do the trick according to your specs:
cp $(grep -Eil 'data|machine learning|clustering' *) new-folder
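One caveat with the command substitution: the shell splits its output on whitespace, so a file name containing a space breaks it, and cp complains if nothing matched. A hedged alternative is to pipe the names from grep -l into a read loop. The function name copy_any_match and its arguments are invented for this sketch. Note that, unlike the one-liner, it also descends into subfolders, and the destination should not be inside the source folder.

```shell
# copy_any_match SRC_DIR DEST_DIR   (hypothetical helper)
# Copies every *.txt file under SRC_DIR (subfolders included) that contains
# at least one of the keywords, ignoring case.
copy_any_match() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # grep -l prints only the names of matching files, one per line.
    find "$src" -type f -name '*.txt' \
        -exec grep -Eil 'data|machine learning|clustering' {} + |
    while IFS= read -r file
    do
        cp "$file" "$dest"/
    done
}
```

Usage would be something like copy_any_match . new-folder run from a directory that does not contain new-folder. File names with spaces are fine here; names containing a newline would still be split, which is rare enough that I left it out.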