Dear all,
I need to extract a list of entries from one file, remove the entries that appear in a whitelist stored in another file, and write the result to result.txt.
Example:
source.txt:
12345 text1 text2 text3 text4
123 text1 text2 text3 text4
678 text1 text2 text3 text4
987 text1 text2 text3 text4
456 text1 text2 text3 text4
whitelist.txt:
123
987
Expected output in result.txt:
12345 text1 text2 text3 text4
678 text1 text2 text3 text4
456 text1 text2 text3 text4
What is the best and fastest way to do that?
I can replace the line breaks in whitelist.txt with commas, like:
123,987
if that would simplify the code...
Many thanks!
agama
June 23, 2012, 4:56pm
Have you tried using grep?
grep -v -f whitelist.txt source.txt >result.txt
Doesn't work:
[root@localhost ]# grep -v -f whitelist.txt source.txt
678
456
BTW, it is slightly more complicated. I've updated the first post, since the source file also contains some other data.
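The missing 12345 line in the output above has a simple explanation: with -f, grep treats each whitelist line as a pattern that may match anywhere in a line, so 123 also matches 12345. A minimal sketch, assuming the number-only source file from the first version of the post (the scratch directory and the variable names substring and whole_line are made up for illustration):

```shell
# Work in a scratch directory so no real files are touched (illustrative setup).
tmp=$(mktemp -d)
cd "$tmp" || exit 1

# Number-only source, as in the first version of the post.
printf '%s\n' 12345 123 678 987 456 > source.txt
printf '%s\n' 123 987 > whitelist.txt

# Patterns match anywhere in the line: "123" also removes "12345".
substring=$(grep -v -f whitelist.txt source.txt)

# -x requires a pattern to match the whole line, so "12345" survives.
whole_line=$(grep -v -x -f whitelist.txt source.txt)

printf 'substring match:\n%s\n\n' "$substring"
printf 'whole-line match:\n%s\n' "$whole_line"
```

Whole-line matching only helps while each source line consists of the number alone; once other fields follow the number, a different approach is needed.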
Slight modification to agama's suggestion.. Try:
grep -vxf whitelist.txt source.txt
--edit--
OK, I see the original post got changed in the meantime...
Try:
grep -vwf whitelist.txt source.txt
But that could still go wrong if a whitelisted number also appears in the text after field 1. So this would be safer:
awk 'NR==FNR{A[$1]; next}!($1 in A)' whitelist.txt source.txt
--
On Solaris use /usr/xpg4/bin/awk rather than awk
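For readers unfamiliar with the NR==FNR idiom, the awk one-liner above can be spelled out as follows. This is only a commented sketch of the same logic. The sample files are recreated in a scratch directory, the array is renamed to seen for readability, and the 678 line with 123 in its third field is an invented test case that shows where grep -vwf goes wrong:

```shell
# Scratch directory with sample data (illustrative setup).
tmp=$(mktemp -d)
cd "$tmp" || exit 1

cat > source.txt <<'EOF'
12345 text1 text2 text3 text4
123 text1 text2 text3 text4
678 text1 123 text3 text4
987 text1 text2 text3 text4
456 text1 text2 text3 text4
EOF
printf '%s\n' 123 987 > whitelist.txt

# grep -w matches whole words anywhere in the line, so the 678 line
# is dropped as well because its third field is 123.
grep_out=$(grep -vwf whitelist.txt source.txt)

# Same logic as the one-liner, written out with comments.
awk_out=$(awk '
    # NR==FNR holds only while the first file (whitelist.txt) is read.
    NR == FNR { seen[$1]; next }   # remember each whitelisted key
    # For source.txt: print the line unless field 1 is a remembered key.
    !($1 in seen)
' whitelist.txt source.txt)

printf 'grep -vwf:\n%s\n\n' "$grep_out"
printf 'awk:\n%s\n' "$awk_out"
```

One caveat of the idiom: if whitelist.txt is empty, NR==FNR stays true while source.txt is read, so nothing is printed.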
Scrutinizer, you're absolutely the best, it works perfectly!!
[root@localhost ]# awk 'NR==FNR{A[$1]; next}!($1 in A)' whitelist.txt source.txt
12345 text1 text2 text3 text4
678 text1 text2 text3 text4
456 text1 text2 text3 text4
Do you think it is possible to do something like this?
awk 'NR==FNR{A[$1]; next}!($1 in A)' whitelist.txt source.txt | while read -r line; do
echo "$line" | awk '{print $1}'
done
What I need to do is extract a value (e.g. the first field) from each already filtered line and create another output file like:
blabla 12345 text
textx 678 some
texty 456 try
OK, I could write the output of the first awk to another file and then loop over that file, but is there a way to do it directly with your awk command?
You can do this, which will provide only the first field, and take it from there:
awk 'NR==FNR{A[$1]; next}!($1 in A){print $1}' whitelist.txt source.txt
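If a shell loop is still wanted for per-line work, the extra awk call inside the loop can be avoided, because read can split off the first field itself. A sketch, assuming fields are separated by blanks; the variable names id and rest and the output file ids.txt are made up for illustration:

```shell
# Scratch directory with the sample data from the first post (illustrative setup).
tmp=$(mktemp -d)
cd "$tmp" || exit 1

cat > source.txt <<'EOF'
12345 text1 text2 text3 text4
123 text1 text2 text3 text4
678 text1 text2 text3 text4
987 text1 text2 text3 text4
456 text1 text2 text3 text4
EOF
printf '%s\n' 123 987 > whitelist.txt

# read stores the first field in id and the remainder of the line in rest,
# so no second awk process is started per line.
awk 'NR==FNR{A[$1]; next}!($1 in A)' whitelist.txt source.txt |
while read -r id rest; do
    printf 'blabla %s text\n' "$id"
done > ids.txt

cat ids.txt
```

For large files, doing the formatting inside awk itself is likely to be faster than a shell loop, since the loop runs one iteration of shell code per line.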
scrutinizer:
You can do this, which will provide only the first field, and take it from there:
awk 'NR==FNR{A[$1]; next}!($1 in A){print $1}' whitelist.txt source.txt
It works like the previous one, obviously with only the first field.
The problem is that I cannot add more text to output, like this:
awk 'NR==FNR{A[$1]; next}!($1 in A) blabla {print $1} text' whitelist.txt source.txt
Expected result:
blabla 12345 text
blabla 678 text
blabla 456 text
Add additional text like this:
awk 'NR==FNR{A[$1]; next}!($1 in A) { print "blabla", $1, "text"}' whitelist.txt source.txt
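If the surrounding text comes from shell variables rather than being fixed, it can be passed in with awk's -v option instead of being pasted into the program. A minimal sketch; the names prefix, suffix, pre and post are hypothetical, and the sample files are recreated in a scratch directory:

```shell
# Scratch directory with the sample data from the first post (illustrative setup).
tmp=$(mktemp -d)
cd "$tmp" || exit 1

cat > source.txt <<'EOF'
12345 text1 text2 text3 text4
123 text1 text2 text3 text4
678 text1 text2 text3 text4
987 text1 text2 text3 text4
456 text1 text2 text3 text4
EOF
printf '%s\n' 123 987 > whitelist.txt

prefix='blabla'
suffix='text'

# -v assigns the shell values to awk variables before the program starts;
# printf gives explicit control over spacing and the trailing newline.
awk -v pre="$prefix" -v post="$suffix" '
    NR == FNR { A[$1]; next }
    !($1 in A) { printf "%s %s %s\n", pre, $1, post }
' whitelist.txt source.txt > result.txt

cat result.txt
```

Note that awk interprets backslash escape sequences in values passed with -v, so text containing backslashes may need extra care.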
chubler_xl:
Add additional text like this:
awk 'NR==FNR{A[$1]; next}!($1 in A) { print "blabla", $1, "text"}' whitelist.txt source.txt
Thanks, it works perfectly, and it was so simple!